Kaniko fails to execute multiple builds in same container #2793

Open
sam0392in opened this issue Oct 12, 2023 · 9 comments
Labels
area/multi-stage builds, feat/cleanup, feat/concurrency, kind/bug, possible-dupe, priority/p2

Comments


sam0392in commented Oct 12, 2023

Actual behavior
Kaniko fails to execute sequential builds in the same container. After the first build completes, the second invocation of the Kaniko command fails to start.

Expected behavior
Kaniko should not fail when running multiple image builds in the same container.

To Reproduce
Steps to reproduce the behavior:

  • Create a Kaniko Pod with the image gcr.io/kaniko-project/executor:debug.
  • Go inside the pod and create a simple Dockerfile.
FROM ubuntu
apt-get update -y
  • Now run the Kaniko executor command to build an image from this Dockerfile:
/kaniko/executor \
-f ./Dockerfile -c . \
--dockerfile Dockerfile \
--destination=<YOUR IMAGE REGISTRY>:test_1.0
  • First execution will work perfectly.
  • Now, within the same container, run the same command again with a different tag:
/kaniko/executor \
-f ./Dockerfile -c . \
--dockerfile Dockerfile \
--destination=<YOUR IMAGE REGISTRY>:test_1.1
  • This time the command will fail with an error:
ERROR: Process exited immediately after creation. See output below

Additional Information

  • Dockerfile
FROM ubuntu
apt-get update -y
  • Kaniko image: gcr.io/kaniko-project/executor:debug

Fix/Workaround

  • Kaniko, it seems, is meant for a single execution rather than for reusing the same container for multiple image builds.
  • At the end of execution, Kaniko removes the workspace directory, which prevents the next image build from running in the same container.

The workaround was to:

  • Explicitly recreate the workspace directory at the end of each build execution (mkdir -p /workspace), which makes the same container ready for the next build.
  • Clean up the Kaniko executor filesystem by adding the --cleanup flag.
  • Remove leftovers of the old build (symlinks) with rm -rf /kaniko/0, as in the snippet below; a combined two-build sketch follows it.
      /kaniko/executor \
      -f ./Dockerfile -c . \
      --dockerfile Dockerfile \
      --destination=<YOUR IMAGE REGISTRY>:test_1.1

      rm -rf /kaniko/0
      mkdir -p /workspace
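
Putting the three steps together, a full two-build session inside the same debug container might look like the sketch below. This is illustrative only: the registry placeholder and tags follow the example above, --cleanup asks Kaniko to wipe the filesystem changes it made, and the rm/mkdir lines reset the leftovers between runs.

      # first build
      /kaniko/executor \
      -f ./Dockerfile -c . \
      --destination=<YOUR IMAGE REGISTRY>:test_1.0 \
      --cleanup

      # reset the container for the next build
      rm -rf /kaniko/0        # remove leftover build artifacts/symlinks
      mkdir -p /workspace     # recreate the workspace directory Kaniko deleted

      # second build now starts cleanly in the same container
      /kaniko/executor \
      -f ./Dockerfile -c . \
      --destination=<YOUR IMAGE REGISTRY>:test_1.1 \
      --cleanup

      rm -rf /kaniko/0
      mkdir -p /workspace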

Expectation

  • Introduce a flag to avoid deleting the /workspace directory after the Kaniko build command finishes, something like --reuse-executor=true.
  • Have the --cleanup flag also handle rm -rf /kaniko/0.

Triage Notes for the Maintainers

Description | Yes/No
Please check if this is a new feature you are proposing |
Please check if the build works in docker but not in kaniko | [*]
Please check if this error is seen when you use --cache flag |
Please check if your dockerfile is a multistage dockerfile |
sam0392in changed the title from "Kaniko fails to execute Multiple Builds in same container" to "Kaniko fails to execute multiple builds in same container" on Oct 12, 2023

plachta11b commented Oct 17, 2023

I was not able to reproduce this. I used "Kaniko version: v1.16.0" from the debug image, started with docker run -it --entrypoint="" gcr.io/kaniko-project/executor:debug /bin/sh.

I used a slightly modified Dockerfile:

FROM ubuntu
RUN apt-get update -y

EDIT: Maybe it is caused by mem/disk space?


mama-wk commented Oct 19, 2023

I'm experiencing the same problem, reproducible like this:

  1. Start the container with docker run -it --rm --entrypoint="" -v ./:/tmp gcr.io/kaniko-project/executor:debug /bin/sh
  2. Create a Dockerfile:
FROM node:18-bookworm

RUN apt-get update \
    && apt-get install -y wget gnupg1 ca-certificates procps libxss1 \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub > linux_signing_key.pub \
    && install -D -o root -g root -m 644 linux_signing_key.pub /etc/apt/keyrings/linux_signing_key.pub \
    && sh -c 'echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/linux_signing_key.pub] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google-chrome.list' \
    && apt-get update \
    && apt-get install -y google-chrome-stable git curl unzip python3 python3-venv libnss3-dev \
    && rm -rf /var/lib/apt/lists/* \
    && wget --quiet https://raw.githubusercontent.com/vishnubob/wait-for-it/master/wait-for-it.sh -O /usr/sbin/wait-for-it.sh \
    && chmod +x /usr/sbin/wait-for-it.sh
  3. Start the image build with: executor -f Dockerfile --destination test-img-1 --no-push
    This works fine.
  4. Start the image build again with: executor -f Dockerfile --destination test-img-2 --no-push
    The second build fails with this error:
...
Need to get 1171 kB of archives.
After this operation, 5047 kB of additional disk space will be used.
Get:1 http://deb.debian.org/debian bookworm/main amd64 gnupg1 amd64 1.4.23-1.1+b1 [601 kB]
Get:2 http://deb.debian.org/debian bookworm/main amd64 gnupg1-l10n all 1.4.23-1.1 [553 kB]
Get:3 http://deb.debian.org/debian bookworm/main amd64 libxss1 amd64 1:1.2.3-1 [17.8 kB]
Fetched 1171 kB in 0s (7344 kB/s)    
debconf: delaying package configuration, since apt-utils is not installed
dpkg: unrecoverable fatal error, aborting:
 unknown system group 'messagebus' in statoverride file; the system group got removed
before the override, which is most probably a packaging bug, to recover you
can remove the override manually with dpkg-statoverride
E: Sub-process /usr/bin/dpkg returned an error code (2)
error building image: error building stage: failed to execute command: waiting for process to exit: exit status 100

aaron-prindle added the feat/concurrency, area/multi-stage builds, feat/cleanup, kind/bug, priority/p2, and possible-dupe labels on Oct 19, 2023

ganeshgk commented Nov 21, 2023

@mama-wk I had a similar issue, so running sed -i '/messagebus/d' /var/lib/dpkg/statoverride before re-running the executor is necessary to solve this, but it is only part of the problem. What I noticed is that any installation done as part of a Dockerfile instruction gets executed directly in the Kaniko container, so if a package already exists in the Kaniko executor image, the build will fail. For example, I customized the executor to include the AWS CLI, but if I then use this executor to build an image from a Dockerfile that also installs the AWS CLI, it fails.
If the instruction below is present in a Dockerfile


RUN cd /tmp && \
    curl -sk "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && \
    unzip awscliv2.zip && \
    ./aws/install 

that I am building with the custom executor (which has the AWS CLI pre-installed), it fails with:

  ./aws/install
Found preexisting AWS CLI installation: /usr/local/aws-cli/v2/current. Please rerun install script with --update flag

Is this the expected behavior? At least to me it seems very strange, and I couldn't find any explanation as to why this happens. :(
I do not know if I should raise a separate issue for this or not.
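
For reference, the reset between two executor runs in this situation could look roughly like the sketch below (an assumption-laden sketch, not a confirmed fix: the sed line is only needed when dpkg complains about the stale messagebus entry in the statoverride file, and the rm/mkdir lines are the workaround described earlier in this thread):

      # drop the stale dpkg override entry left behind in the executor's own filesystem
      sed -i '/messagebus/d' /var/lib/dpkg/statoverride
      # then reset the executor state as in the earlier workaround
      rm -rf /kaniko/0
      mkdir -p /workspace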


ricardllop commented Feb 1, 2024

Hello, after finding this issue and trying a lot of things to find a generic fix for my case, I found what I think solves most of my error cases. In my case it fixed all the failing builds across the different Dockerfiles of around 15 projects (not all were failing, but the ones whose Dockerfiles have more stages were more prone to fail).

My use case for Kaniko is inside a Jenkins pipeline that uses the Kubernetes plugin to run jobs inside Kubernetes agent pods. Those agents define a single Kaniko container, and I needed to build the image twice with that single container: once as a tar to scan it with Trivy (a container scanning tool), and then, after some quality checks are met, use the same Kaniko container to build the image again and upload it to ECR.

My solution was adding this to my first call of building the image as a tar: && rm -rf /kaniko/*[0-9]* && rm -rf /kaniko/Dockerfile && mkdir -p /workspace

The call ends up looking like this:

/kaniko/executor -f 'pwd'/docker/Dockerfile -c 'pwd' --tar-path='pwd'/image.tar --single-snapshot --no-push --destination=image --cleanup && rm -rf /kaniko/*[0-9]* && rm -rf /kaniko/Dockerfile && mkdir -p /workspace

I'm not a huge Kaniko user myself, but I found that the /kaniko directory was filled with some files after the first execution, as some people in this thread mentioned, and those files were breaking the next execution. The commands appended after the first build remove those problematic files, and the second execution works like a charm.

Hope this helps other people who find this issue. Thanks.
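
For readers with a similar pipeline, the two sequential calls described above reduce to something like the following sketch. It is a sketch only: the quoted 'pwd' in the command above is read here as shell command substitution $(pwd), and the ECR destination is a placeholder, not the exact pipeline code.

      # 1st call: build to a tar for the Trivy scan, then reset the container
      /kaniko/executor -f "$(pwd)"/docker/Dockerfile -c "$(pwd)" \
        --tar-path="$(pwd)"/image.tar --single-snapshot --no-push \
        --destination=image --cleanup \
        && rm -rf /kaniko/*[0-9]* && rm -rf /kaniko/Dockerfile && mkdir -p /workspace

      # 2nd call (after the quality checks pass): build again and push to ECR
      /kaniko/executor -f "$(pwd)"/docker/Dockerfile -c "$(pwd)" \
        --destination=<ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/<REPO>:<TAG>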


gschurck commented May 16, 2024

My solution was adding this to my first call of building the image as a tar: && rm -rf /kaniko/*[0-9]* && rm -rf /kaniko/Dockerfile && mkdir -p /workspace

Thanks a lot for your feedback!
I added --cleanup && rm -rf /kaniko/*[0-9]* && rm -rf /kaniko/Dockerfile && mkdir -p /workspace to my Kaniko command and it finally fixed the issue for me too.


hAislt commented Jun 4, 2024

Thanks for the workaround! This was really helpful.
In my case I had to add the commands needed for removal of the workspace (rm);
I had only 'busybox/cat' as the command in my pod.yaml.

I would also appreciate it if there were a flag that does this.

@ricardllop sorry for directly tagging you, but we have a similar environment. Do you also stumble across inconsistencies,
like not every run being successful and failing with

/durable-13df746c/script.sh.copy: line 9: rm: not found

@ricardllop

It was consistent in our environment, although we have since shifted to using kaniko to build a tar --> scan the tar with Trivy --> then crane to push the tar to ECR.

@dsebastien

We have faced this issue as well, when trying to invoke Kaniko twice for building an Alpine image with different versions of Maven/Amazon Corretto. We downloaded/uncompressed/moved Maven files to the Maven home folder. It worked for the first execution, but failed when moving the files during the second execution because files were already present.

IMHO the --cleanup flag should ensure that we get a clean slate between executions and that there are no leftovers from previous runs.

@gpongelli

My solution was adding this to my first call of building the image as a tar: && rm -rf /kaniko/*[0-9]* && rm -rf /kaniko/Dockerfile && mkdir -p /workspace

Thanks @ricardllop, this helped me with reusing the same Kaniko container in a Jenkins pipeline, even though I had already applied --cleanup from this discussion.
The only difference in my pipeline, which runs in Kubernetes, is that I have to run those commands separately (as described in @sam0392in's workaround); otherwise it complains about an unrecognized option (even though I cannot find one in the rm calls):

11:31:22  + rm -rf /kaniko/0 /kaniko/1243507711 /kaniko/1267127570 /kaniko/1291417459 /kaniko/1811132102 /kaniko/2122718679 /kaniko/232121738 /kaniko/2419771487 /kaniko/2454894120 /kaniko/2528370616 /kaniko/2550122943 /kaniko/266100005 /kaniko/2765341715 /kaniko/2797314277 /kaniko/2823469896 /kaniko/3081883165 /kaniko/3082013316 /kaniko/3120064069 /kaniko/3237801189 /kaniko/3407385118 /kaniko/346726080 /kaniko/3472971071 /kaniko/3650347770 /kaniko/3699928372 /kaniko/4014459482 /kaniko/4095587153 /kaniko/4141075137 /kaniko/4176115670 /kaniko/452533812 /kaniko/575631331 /kaniko/70706009 /kaniko/71760931 '&&' rm -rf /kaniko/Dockerfile '&&' mkdir -p /workspace
11:31:22  rm: unrecognized option: p

thank you!
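
In other words, when the cleanup is driven from a pipeline sh step, it seems safer to issue each command on its own line instead of chaining them with && (which, as the log above suggests, ended up being passed to rm as literal arguments), roughly:

      rm -rf /kaniko/*[0-9]*
      rm -rf /kaniko/Dockerfile
      mkdir -p /workspace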
