You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
INFO[0001] Starting node 'k3d-multi-2-server-1'
DEBU[0002] Truncated 2024-06-11 08:02:45.185124838 +0000 UTC to 2024-06-11 08:02:45 +0000 UTC
DEBU[0002] Waiting for node k3d-multi-2-server-1 to get ready (Log: 'k3s is up and running')
TRAC[0002] NodeWaitForLogMessage: Node 'k3d-multi-2-server-1' waiting for log message 'k3s is up and running' since '2024-06-11 08:02:45 +0000 UTC'
docker logs k3d-multi-2-server-1 2>&1 | less
time="2024-06-11T08:01:40Z" level=info msg="Failed to test data store connection: context deadline exceeded"
{"level":"warn","ts":"2024-06-11T08:01:40.922826Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"7093034
7b5d5e687","rtt":"0s","error":"dial tcp 172.20.0.6:2380: connect: no route to host"}
{"level":"warn","ts":"2024-06-11T08:01:40.922907Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"369e2fa
3a81b859c","rtt":"0s","error":"dial tcp 172.20.0.7:2380: connect: no route to host"}
{"level":"warn","ts":"2024-06-11T08:01:40.922921Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"70930347b5d
5e687","rtt":"0s","error":"dial tcp 172.20.0.6:2380: connect: no route to host"}
{"level":"warn","ts":"2024-06-11T08:01:40.922926Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"ef68f96
2f1358108","rtt":"0s","error":"dial tcp 172.20.0.5:2380: connect: no route to host"}
{"level":"warn","ts":"2024-06-11T08:01:40.922931Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"369e2fa3a81
b859c","rtt":"0s","error":"dial tcp 172.20.0.7:2380: connect: no route to host"}
{"level":"warn","ts":"2024-06-11T08:01:40.952634Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"ef68f962f13
58108","rtt":"0s","error":"dial tcp 172.20.0.5:2380: connect: no route to host"}
Error from server (ServiceUnavailable): apiserver not ready
time="2024-06-11T08:01:42Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
{"level":"info","ts":"2024-06-11T08:01:43.904177Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b37ae33eccf57701 is starting a new election at term 3"}
{"level":"info","ts":"2024-06-11T08:01:43.904256Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b37ae33eccf57701 became pre-candidate at term 3"}
{"level":"info","ts":"2024-06-11T08:01:43.904278Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b37ae33eccf57701 received MsgPreVoteResp from b37ae33eccf57701 at term 3"}
{"level":"info","ts":"2024-06-11T08:01:43.904309Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b37ae33eccf57701 [logterm: 3, index: 27609] sent MsgPreVote request to 369e2fa3a81b859c at term 3"}
{"level":"info","ts":"2024-06-11T08:01:43.904322Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b37ae33eccf57701 [logterm: 3, index: 27609] sent MsgPreVote request to 59187da497ed4d13 at term 3"}
{"level":"info","ts":"2024-06-11T08:01:43.904326Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b37ae33eccf57701 [logterm: 3, index: 27609] sent MsgPreVote request to 70930347b5d5e687 at term 3"}
{"level":"info","ts":"2024-06-11T08:01:43.90433Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b37ae33eccf57701 [logterm: 3, index: 27609] sent MsgPreVote request to ef68f962f1358108 at term 3"}
{"level":"info","ts":"2024-06-11T08:01:43.904877Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b37ae33eccf57701 received MsgPreVoteResp from 59187da497ed4d13 at term 3"}
{"level":"info","ts":"2024-06-11T08:01:43.9049Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b37ae33eccf57701 has received 2 MsgPreVoteResp votes and 0 vote rejections"}
{"level":"info","ts":"2024-06-11T08:01:44.19295Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b37ae33eccf57701 [logterm: 3, index: 27609, vote: 59187da497ed4d13] cast MsgPreVote for 59187da497ed4d13 [logterm: 3, index: 27609] at term 3"}
Error from server (ServiceUnavailable): apiserver not ready
{"level":"warn","ts":"2024-06-11T08:01:45.195281Z","caller":"etcdserver/server.go:2085","msg":"failed to publish local member to cluster through raft","local-member-id":"b37ae33eccf57701","local-member-attributes":"{Name:k3d-multi-2-server-1-41c018be ClientURLs:[https://172.20.0.4:2379]}","request-path":"/0/members/b37ae33eccf57701/attributes","publish-timeout":"15s","error":"etcdserver: request timed out"}
{"level":"warn","ts":"2024-06-11T08:01:45.923723Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"369e2fa3a81b859c","rtt":"0s","error":"dial tcp 172.20.0.7:2380: connect: no route to host"}
{"level":"warn","ts":"2024-06-11T08:01:45.923836Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"70930347b5d5e687","rtt":"0s","error":"dial tcp 172.20.0.6:2380: connect: no route to host"}
{"level":"warn","ts":"2024-06-11T08:01:45.923792Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"369e2fa3a81b859c","rtt":"0s","error":"dial tcp 172.20.0.7:2380: connect: no route to host"}
{"level":"warn","ts":"2024-06-11T08:01:45.923829Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"70930347b5d5e687","rtt":"0s","error":"dial tcp 172.20.0.6:2380: connect: no route to host"}
{"level":"warn","ts":"2024-06-11T08:01:45.923852Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"ef68f962f1358108","rtt":"0s","error":"dial tcp 172.20.0.5:2380: connect: no route to host"}
{"level":"warn","ts":"2024-06-11T08:01:45.952734Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"ef68f962f13
58108","rtt":"0s","error":"dial tcp 172.20.0.5:2380: connect: no route to host"}
Additional information
This issue is similar to #452, only to have more servers.
I tried to run the test_multi_master_start_stop.sh e2e test with the change that increases server number from 3 to 5. It also failed.
make e2e E2E_INCLUDE=test_multi_master_start_stop
Running e2e tests
LOG_LEVEL="WARN" E2E_INCLUDE="test_multi_master_start_stop" E2E_EXCLUDE="" E2E_EXTRA="" E2E_RUNNER_START_TIMEOUT=10 E2E_HELPER_IMAGE_TAG="" E2E_KEEP="" E2E_PARALLEL="" E2E_DIND_VERSION="" E2E_K3S_VERSION="" E2E_FAIL_FAST="" tests/dind.sh "5.6.3-30-g5028a65c"
2024-06-11 08:54:13 UTC [DIND] >>> Building container image k3d:5.6.3-30-g5028a65c-dind
sha256:188161529a2cad1d30f050b199c8ff8f2cdd636b4cd08d13eaf49e7f6df24a3a
2024-06-11 08:55:29 UTC [DIND] >>> Starting E2E Test Runner container k3d-e2e-runner-240611165529
truetruetruetruetruetrue
/bin/bash: line 1: go: command not found
/bin/bash: line 1: go: command not found
/bin/bash: line 1: golangci-lint: command not found
Building docker image ghcr.io/k3d-io/k3d-proxy:5.6.3-30-g5028a65c
DOCKER_BUILDKIT=1 docker build --quiet --no-cache proxy/ -f proxy/Dockerfile -t ghcr.io/k3d-io/k3d-proxy:5.6.3-30-g5028a65c
Building docker image ghcr.io/k3d-io/k3d-tools:5.6.3-30-g5028a65c
DOCKER_BUILDKIT=1 docker build --quiet --no-cache tools/ -f tools/Dockerfile -t ghcr.io/k3d-io/k3d-tools:5.6.3-30-g5028a65c --build-arg GIT_TAG=v5.6.3-30-g5028a65c
sha256:8451944beeefaedaa50a1bced69ccc4abb22dd2d3cbba42558614429cf02b74f
sha256:2f7ee9fe38e4e45b6069f6a358aaee2719b8a2511c79a47c48cb2a9004f1f774
Start time outside runner: Tue Jun 11 16:56:45 CST 2024
2024-06-11 08:56:45 UTC [Runner] >>> Checking test environment...
arch: aarch64
cgroupdriver: cgroupfs
cgroupversion: "2"
endpoint: /var/run/docker.sock
filesystem: extfs
infoname: 498900b9097b
name: docker
os: Alpine Linux v3.19 (containerized)
ostype: linux
version: 25.0.3
2024-06-11 08:56:45 UTC [Runner] >>> Preparing filesystem and environment...
2024-06-11 08:56:45 UTC [Runner] >>> Start time inside runner: Tue Jun 11 08:56:45 UTC 2024
2024-06-11 08:56:45 UTC [Runner] >>> ######################################
2024-06-11 08:56:45 UTC [Runner] >>> # BASIC TESTS (/src/tests/test_*.sh) #
2024-06-11 08:56:45 UTC [Runner] >>> ######################################
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_basic *****
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_config_file *****
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_config_file_from_stdin *****
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_config_file_migration *****
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_config_with_overrides *****
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_full_lifecycle *****
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_ipam *****
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_loadbalancer *****
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_memory_limits *****
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_multi_master *****
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_node_edit *****
2024-06-11 08:56:45 UTC [Runner] >>> ***** Skipping test_registry *****
2024-06-11 08:56:45 UTC [Runner] >>> Running 1 tests in 1 batches à max. 4 tests.
2024-06-11 08:56:45 UTC [Runner] >>> Starting test batch #1/1...
2024-06-11 08:56:45 UTC [Runner] >>> ***** Running test_multi_master_start_stop *****
2024-06-11 09:05:13 UTC [Runner] >>> WARN: FAILED LOGS: /root/testoutput/test_multi_master_start_stop.sh.log.failed
2024-06-11 09:05:13 UTC [Runner] >>> FAILED -> /root/testoutput/test_multi_master_start_stop.sh.log.failed
2024-06-11 09:05:13 UTC [Runner] >>> ***************************************************
2024-06-11 09:05:13 UTC [Runner] >>> *** Failed Test test_multi_master_start_stop
2024-06-11 09:05:13 UTC [Runner] >>> ***************************************************
2024-06-11 08:56:45 UTC [Test | multi-server-start-stop | ] >>> Creating cluster multiserverstartstop ...
kubectl cluster-info
2024-06-11 08:58:38 UTC [Test | multi-server-start-stop | ] >>> Checking that we have access to the cluster...
/root/.config/k3d/kubeconfig-multiserverstartstop.yaml
Kubernetes control plane is running at https://0.0.0.0:38773
CoreDNS is running at https://0.0.0.0:38773/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://0.0.0.0:38773/api/v1/namespaces/kube-system/services/https:metrics-server:https/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
2024-06-11 08:58:38 UTC [Test | multi-server-start-stop | ] >>> cluster multiserverstartstop is reachable
2024-06-11 08:58:38 UTC [Test | multi-server-start-stop | ] >>> Sleeping for 5 seconds to give the cluster enough time to get ready...
2024-06-11 08:58:43 UTC [Test | multi-server-start-stop | ] >>> Checking that we have 3 server nodes online...
/root/.config/k3d/kubeconfig-multiserverstartstop.yaml
2024-06-11 08:58:43 UTC [Test | multi-server-start-stop | ] >>> cluster multiserverstartstop has 5 nodes, as expected
2024-06-11 08:58:43 UTC [Test | multi-server-start-stop | ] >>> Stopping cluster...
2024-06-11 08:59:08 UTC [Test | multi-server-start-stop | ] >>> Waiting for a bit...
2024-06-11 08:59:13 UTC [Test | multi-server-start-stop | ] >>> Restarting cluster (time: 2024-06-11 08:59:13 UTC)...
FATA[2024-06-11T09:05:13Z] Failed to start server k3d-multiserverstartstop-server-1: Node k3d-multiserverstartstop-server-1 failed to get ready: error waiting for log line `k3s is up and running` from node 'k3d-multiserverstartstop-server-1': stopped returning log lines: context deadline exceeded
2024-06-11 09:05:13 UTC [Test | multi-server-start-stop | ] >>> failed to restart cluster (timeout 360s)
2024-06-11 09:05:13 UTC [Test | multi-server-start-stop | ] >>> FATAL: Test | multi-server-start-stop |:test failed
2024-06-11 09:05:13 UTC [Runner] >>> FINISHED BASIC TESTS
> Total: 13
> Run: 1
> Skipped: 12
> Passed: 0
> Failed: 1
2024-06-11 09:05:13 UTC [Runner] >>> WARN: Failed BASIC TESTS: 1
2024-06-11 09:05:13 UTC [DIND] >>> Cleaning up test container k3d-e2e-runner-240611165529
498900b9097bed1974a79b6676f78c59e7ff32dba38d7bbb927ef1f7cd33c0cf
make: *** [e2e] Error 1
I have skimmed the patch in #467 and suspect the cause is not enough nodes to reach quorum. Because server number is 5, quorum is 3. And when starting the second node, etcd can't get enough vote, so the start process hungs.
What did you do
How was the cluster created?
k3d cluster create multi-2 --servers 5
What did you do afterwards?
k3d cluster stop multi-2
k3d cluster start multi-2
What did you expect to happen
cluster starts normally
Screenshots or terminal output
the process stucks here:
docker logs k3d-multi-2-server-1 2>&1 | less
Additional information
This issue is similar to #452, only to have more servers.
I tried to run the
test_multi_master_start_stop.sh
e2e test with the change that increases server number from 3 to 5. It also failed.make e2e E2E_INCLUDE=test_multi_master_start_stop
I have skimmed the patch in #467 and suspect the cause is not enough nodes to reach quorum. Because server number is 5, quorum is 3. And when starting the second node, etcd can't get enough vote, so the start process hungs.
Which OS & Architecture
MacOS with Docker Desktop
Which version of
k3d
Which version of docker
docker version
The text was updated successfully, but these errors were encountered: