Cannot connect to Kubernetes server failed to decode watch event #532

Closed
javefang opened this issue Jul 14, 2016 · 18 comments · Fixed by #836

Comments

@javefang

javefang commented Jul 14, 2016

Hi,

I'm using Traefik v1.0.0 with Kubernetes 1.3.0. At random intervals I get invalid character errors, and Traefik crashes after a few failures. Is anyone else experiencing the same problem?

time="2016-07-14T10:58:09Z" level=fatal msg="Cannot connect to Kubernetes server failed to decode watch event: GET \"https://10.254.0.1:443/api/v1/endpoints?watch&resourceVersion=2928703\" : invalid character '\\n' in string literal" 
time="2016-07-14T10:19:06Z" level=fatal msg="Cannot connect to Kubernetes server failed to decode watch event: GET \"https://10.254.0.1:443/api/v1/endpoints?watch&resourceVersion=2924878\" : invalid character 'a' looking for beginning of value" 
time="2016-07-14T09:00:01Z" level=fatal msg="Cannot connect to Kubernetes server failed to decode watch event: GET \"https://10.254.0.1:443/api/v1/endpoints?watch&resourceVersion=2916459\" : invalid character 'o' looking for beginning of value" 
time="2016-07-14T08:15:02Z" level=fatal msg="Cannot connect to Kubernetes server failed to decode watch event: GET \"https://10.254.0.1:443/api/v1/endpoints?watch&resourceVersion=2911726\" : invalid character '}' looking for beginning of value" 

Thanks!
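
For reference, the wording of these messages matches the syntax errors produced by Go's `encoding/json` package when it is handed malformed input. The following is a minimal sketch of that behaviour, not Traefik's actual decoding code; `decodeErr` is a hypothetical helper and the payloads are made up rather than captured from the cluster.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeErr tries to decode one payload as a JSON object and returns the
// error text, or "" when the payload decodes cleanly.
// Hypothetical helper, for illustration only.
func decodeErr(payload []byte) string {
	var event map[string]interface{}
	if err := json.Unmarshal(payload, &event); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	// Made-up payloads, not captured from a real cluster.
	samples := []string{
		`{"type":"MODIFIED","object":{}}`, // well-formed event
		"{\"type\":\"MODI\nFIED\"}",        // raw newline inside a string
		`adata`,                            // input starts in the middle of a token
		`}`,                                // stray closing brace
	}
	for _, s := range samples {
		fmt.Printf("%q -> %q\n", s, decodeErr([]byte(s)))
	}
}
```

The three malformed samples yield the "in string literal" and "looking for beginning of value" messages, which suggests the decoder is seeing a truncated or misaligned stream rather than a schema change.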

@jonaz
Contributor

jonaz commented Jul 14, 2016

What happens if you manually start a pod, for example a busybox one, using:
kubectl run -i --tty busybox --image=busybox --restart=Never

then manually try to fetch the url:

wget "https://10.254.0.1:443/api/v1/endpoints?watch&resourceVersion=2928703"

and check the file with cat or less?

I do think you should install the DNS cluster addon and use the https://kubernetes:443 URL for the API. Or is 10.254.0.0/16 your service range?

@javefang
Author

I do have the DNS addon. 10.254.0.1 is the service endpoint for the apiserver in my case. I've set up a pod that continuously monitors the output of the endpoints watch URL, and I'll report back when I see an error.

@emilevauge
Member

Maybe the API changed in k8s 1.3. We will have a look at this.
Other than the weird logs, is everything working correctly?
/cc @errm

@emilevauge emilevauge added the bug label Jul 18, 2016
@javefang
Author

Everything else seems to be working fine. I'm only seeing such events once every few hours.

This only seems to have been a problem since Kubernetes 1.3.0; previously Traefik worked fine with 1.2.4.

@javefang
Author

The latest release, 1.3.2, seems to include a fix related to watch cache filtering: kubernetes/kubernetes#28966

I'll try to upgrade today to see if it's related.

@jonaz
Contributor

jonaz commented Aug 29, 2016

@javefang Did the upgrade help?

@javefang
Author

Our cluster admin hasn't had a chance to upgrade yet. However, we recently set up a second cluster on which I haven't seen a single crash so far (4 days). I'll keep an eye on it and try to figure out the difference.

@jimmycuadra
Contributor

We've been having this problem too. Traefik pods crash roughly 20 times a day because of it. Here's an example from our logs:

time="2016-09-14T20:28:30Z" level=fatal msg="Cannot connect to Kubernetes server failed to decode watch event: GET \"https://10.3.0.1:443/api/v1/endpoints?watch&resourceVersion=12539224\" : invalid character 'p' looking for beginning of value"

This is using Kubernetes 1.3.7, which is the latest stable release as of today.

@emilevauge
Member

FYI, this is fixed on master (you can try it with the containous/traefik:experimental image).

@javefang
Author

Will try it out. I'd be interested to know what the fix was, if you can point me to the commits.

@emilevauge
Member

emilevauge commented Sep 22, 2016

@javefang here it is: #628
This is more of a workaround to keep Traefik from crashing because of too many API connections.
A real fix will come with #678 :)

@jimmycuadra
Contributor

Sweet! If you can remember, please let us know when the changes in #628 get rolled out to the latest tag on the Docker Hub image.

@jimmycuadra
Contributor

I'm still having this problem with the latest image on Docker Hub. It seems the problem either wasn't fixed or a new problem is causing the same symptom. #732 seems to have the latest discussion.

@rio
Contributor

rio commented Nov 15, 2016

I'm having the same problems. For us they mostly surface when uploading large files through traefik.

Edit: I meant it got noticed through uploading. I don't think it's related to the large files. I'm getting the same panics in the api server as @emilevauge pointed out in #732

@jaygorrell

I tried out the new v1.1.0 Traefik image and still have containers dying with this problem on Kubernetes 1.4.6.

@george-angel

+1

@jimmycuadra
Contributor

If #836 is the fix for this, it won't be fixed until Traefik 1.2.

@jonaz
Contributor

jonaz commented Dec 19, 2016

Is this really fixed in v1.1.2 as the changelog states? @emilevauge

@ldez ldez added the kind/bug/confirmed label Apr 29, 2017
@traefik traefik locked and limited conversation to collaborators Sep 1, 2019

10 participants