Possible conflict running circleci/docker-gc container due to duplicate network setting #171

Closed
kelvintaywl opened this issue Jun 19, 2023 · 1 comment

kelvintaywl commented Jun 19, 2023

As reported by a customer and observed on several of our own Server installations (CircleCI Support), the docker-gc container on Nomad nodes fails to start with the following error:

docker-gc-start.rc[32908]: 2.0: Pulling from circleci/docker-gc
docker-gc-start.rc[32908]: Digest: sha256:3dc0e2dc1161cee808d7b76877b1de001bf8a5468ceaeb14f965e2b99c40f4bd
docker-gc-start.rc[32908]: Status: Image is up to date for circleci/docker-gc:2.0
docker-gc-start.rc[32908]: docker.io/circleci/docker-gc:2.0
docker-gc-start.rc[32926]: Error: No such container: docker-gc
docker-gc-start.rc[32937]: docker-gc
docker-gc-start.rc[32948]: docker: network-scoped aliases are only supported for user-defined networks.
docker-gc-start.rc[32948]: See 'docker run --help'.
systemd[1]: docker-gc.service: Main process exited, code=exited, status=125/n/a
systemd[1]: docker-gc.service: Failed with result 'exit-code'.
systemd[1]: docker-gc.service: Scheduled restart job, restart counter is at 5.
systemd[1]: Stopped Docker garbage collector.
systemd[1]: docker-gc.service: Start request repeated too quickly.
systemd[1]: docker-gc.service: Failed with result 'exit-code'.
systemd[1]: Failed to start Docker garbage collector.
How to retrieve the logs above?

via journalctl:

$ journalctl -u docker-gc.service &> docker-gc.log
$ journalctl -u docker.service &> docker.log

After some internal discussion, I think the docker run ... command for this docker-gc container is misconfigured.
Namely, the network looks to be set twice (--net and --network do the same thing), and --network-alias fails because an alias cannot be set on the host network (which was set by --net).
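
To make the conflict described above concrete, here is a minimal sketch of a check over a docker run argument list. `check_network_flags` is a hypothetical helper, not part of server-terraform; it only understands the `--flag=value` form used in this issue, and it treats any repeated network flag as suspicious, which is an assumption that may be too strict for other setups.

```shell
# Hypothetical helper (not part of server-terraform): inspect the arguments
# that would be passed to `docker run` and report the two problems described
# in this issue. Only the --flag=value form is recognized.
check_network_flags() {
  local nets=0 host=0 has_alias=0 arg
  for arg in "$@"; do
    case "$arg" in
      --net=*|--network=*)
        # Count every network setting; remember whether any selects "host".
        nets=$((nets + 1))
        if [ "${arg#*=}" = "host" ]; then host=1; fi
        ;;
      --network-alias=*)
        has_alias=1
        ;;
    esac
  done
  if [ "$nets" -gt 1 ]; then
    echo "network is set $nets times via --net/--network"
    return 1
  fi
  if [ "$host" -eq 1 ] && [ "$has_alias" -eq 1 ]; then
    echo "--network-alias cannot be used with the host network"
    return 1
  fi
  return 0
}
```

Called with the flags from the failing command (`--net=host --network=ci-privileged --network-alias=docker-gc.internal.circleci.com`), the helper returns a non-zero status; with `--net=host` dropped it returns zero.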

Code area impacted:

    docker run \
    --rm \
    --interactive \
    --name "docker-gc" \
    --privileged \
    --net=host \
    --userns=host \
    --volume /var/run/docker.sock:/var/run/docker.sock \
    --volume /var/lib/docker:/var/lib/docker:ro \
    --volume docker-gc:/state \
    --network=ci-privileged \
    --network-alias=docker-gc.internal.circleci.com \
    "circleci/docker-gc:2.0" \
    -threshold-percent 50

@kelvintaywl kelvintaywl changed the title Possible conflict running docker-gc container due to duplicate network setting Possible conflict running circleci/docker-gc container due to duplicate network setting Jun 22, 2023

kelvintaywl commented Jun 22, 2023

I was able to reproduce the same error (docker: network-scoped aliases are only supported for user-defined networks):
https://app.circleci.com/pipelines/github/kelvintaywl-cci/docker-gc-repro/4/workflows/e9b019b8-a108-4fa4-8f2c-9d1f327a95bb/jobs/8

The proposed fix of removing --net=host worked in that the circleci/docker-gc container successfully ran without the error above:
https://app.circleci.com/pipelines/github/kelvintaywl-cci/docker-gc-repro/4/workflows/e9b019b8-a108-4fa4-8f2c-9d1f327a95bb/jobs/7
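
For reference, a sketch of what the start command could look like with only `--net=host` removed, as proposed above. Everything else is copied from the command in this issue. The `DOCKER` variable is a hypothetical override added here so the command can be dry-run with `DOCKER=echo`; it does not exist in server-terraform. The sketch also assumes the `ci-privileged` network already exists on the node.

```shell
# Sketch only: the docker-gc start command from this issue, minus --net=host.
# DOCKER is a hypothetical override for dry runs (e.g. DOCKER=echo); it
# defaults to the real docker CLI.
run_docker_gc() {
  "${DOCKER:-docker}" run \
    --rm \
    --interactive \
    --name "docker-gc" \
    --privileged \
    --userns=host \
    --volume /var/run/docker.sock:/var/run/docker.sock \
    --volume /var/lib/docker:/var/lib/docker:ro \
    --volume docker-gc:/state \
    --network=ci-privileged \
    --network-alias=docker-gc.internal.circleci.com \
    "circleci/docker-gc:2.0" \
    -threshold-percent 50
}
```

Running `DOCKER=echo run_docker_gc` prints the arguments without starting a container, which is a quick way to confirm the network is now set only once.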

Public repository I used to reproduce the same setup:
https://github.com/kelvintaywl-cci/docker-gc-repro/blob/main/.circleci/config.yml

I will be happy to follow up:

  1. fork server-terraform and apply the change to the Terraform stack
  2. reapply the server-terraform Terraform stack on a CircleCI Server installation
  3. confirm with journalctl that the docker-gc process works
  4. open a PR here to close this issue
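
For the journalctl confirmation step above, a small sketch of how the check could be scripted. `has_alias_error` is a hypothetical helper name; it reads log lines on stdin and succeeds only if the error string from this issue is present.

```shell
# Hypothetical helper: exit 0 if the log on stdin contains the error reported
# in this issue, non-zero otherwise.
has_alias_error() {
  grep -q 'network-scoped aliases are only supported for user-defined networks'
}
```

On a Nomad node this might be used as `journalctl -u docker-gc.service --no-pager | has_alias_error && echo "docker-gc still failing"`. Older journal entries will still contain the error after a fix, so it may help to restrict the range with journalctl's `--since` option.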

nanophate added a commit to nanophate/server-terraform that referenced this issue Jun 23, 2023
nanophate added a commit to nanophate/server-terraform that referenced this issue Jun 23, 2023
kelvintaywl added a commit to kelvintaywl/server-terraform that referenced this issue Jun 23, 2023