Cannot connect to Kubernetes server failed to decode watch event #532

javefang · 2016-07-14T11:53:33Z

Hi,

I'm using Traefik v1.0.0 with Kubernetes 1.3.0. At random intervals I get invalid character errors, and Traefik crashes after a few failures. Is anyone else experiencing the same problem?

time="2016-07-14T10:58:09Z" level=fatal msg="Cannot connect to Kubernetes server failed to decode watch event: GET \"https://10.254.0.1:443/api/v1/endpoints?watch&resourceVersion=2928703\" : invalid character '\\n' in string literal" 
time="2016-07-14T10:19:06Z" level=fatal msg="Cannot connect to Kubernetes server failed to decode watch event: GET \"https://10.254.0.1:443/api/v1/endpoints?watch&resourceVersion=2924878\" : invalid character 'a' looking for beginning of value" 
time="2016-07-14T09:00:01Z" level=fatal msg="Cannot connect to Kubernetes server failed to decode watch event: GET \"https://10.254.0.1:443/api/v1/endpoints?watch&resourceVersion=2916459\" : invalid character 'o' looking for beginning of value" 
time="2016-07-14T08:15:02Z" level=fatal msg="Cannot connect to Kubernetes server failed to decode watch event: GET \"https://10.254.0.1:443/api/v1/endpoints?watch&resourceVersion=2911726\" : invalid character '}' looking for beginning of value"

Thanks!


jonaz · 2016-07-14T14:06:49Z

What do you get if you manually start a pod, for example a busybox pod:
kubectl run -i --tty busybox --image=busybox --restart=Never

then manually fetch the URL from inside it:

wget "https://10.254.0.1:443/api/v1/endpoints?watch&resourceVersion=2928703"

and check the downloaded file with cat or less?

I do think you should install the DNS cluster addon and use the https://kubernetes:443 URL for the API. Or is 10.254.0.0/16 your service range?

javefang · 2016-07-15T08:34:59Z

I do have the DNS addon. 10.254.0.1 is the service endpoint for the apiserver in my case. I've set up a pod that continuously monitors the output of the endpoints watch URL, and I'll report back when I see an error.
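A monitor like the one described could be as simple as reading the watch stream line by line and flagging anything that isn't valid JSON. The following is only a sketch under assumptions: it assumes the API server emits one JSON-encoded watch event per line, it reads from any iterable of lines rather than a live connection, and the names `find_bad_events` and `sample` are made up for illustration.

```python
import json


def find_bad_events(lines):
    """Return (line_number, error_message, raw_line) for each line that is not valid JSON."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # ignore empty lines
        try:
            json.loads(text)
        except json.JSONDecodeError as err:
            bad.append((number, str(err), text))
    return bad


# Hypothetical sample: two well-formed events around one truncated fragment.
sample = [
    '{"type": "MODIFIED", "object": {"kind": "Endpoints"}}',
    'oints"}}',
    '{"type": "ADDED", "object": {"kind": "Endpoints"}}',
]

for number, message, text in find_bad_events(sample):
    print(f"line {number}: {message}: {text!r}")
```

Logging the raw text next to the parser error makes it easier to tell whether the stream was truncated, interleaved, or corrupted in some other way.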

emilevauge · 2016-07-18T21:34:28Z

Maybe the API changed in k8s 1.3. We will have a look at this.
Other than the weird logs, is everything working correctly?
/cc @errm

javefang · 2016-07-19T06:47:16Z

Everything else seems to be working fine. I'm only seeing such events once every few hours.

This only seems to have been a problem since Kubernetes 1.3.0; previously Traefik worked fine with 1.2.4.

javefang · 2016-07-19T06:50:41Z

The latest release, 1.3.2, seems to include a fix related to watch cache filtering: kubernetes/kubernetes#28966

I'll try to upgrade today to see if it's related.

jonaz · 2016-08-29T10:41:43Z

@javefang Did the upgrade help?

javefang · 2016-09-12T20:58:02Z

Our cluster admin hasn't had a chance to upgrade yet. However, we recently set up a second cluster on which I haven't seen a single crash so far (4 days). I'll keep an eye on it and try to figure out the difference.

jimmycuadra · 2016-09-14T20:34:02Z

We've been having this problem too. Traefik pods crash roughly 20 times a day because of it. Here's an example from our logs:

time="2016-09-14T20:28:30Z" level=fatal msg="Cannot connect to Kubernetes server failed to decode watch event: GET \"https://10.3.0.1:443/api/v1/endpoints?watch&resourceVersion=12539224\" : invalid character 'p' looking for beginning of value"

This is using Kubernetes 1.3.7, which is the latest stable release as of today.

emilevauge · 2016-09-22T12:25:26Z

FYI, this is fixed on master (you can try it with the containous/traefik:experimental image).

javefang · 2016-09-22T13:27:03Z

Will try it out. I'd be interested to know what the fix was, if you can point me to the commits.

emilevauge · 2016-09-22T13:58:05Z

@javefang here it is: #628
This is more of a workaround to keep Traefik from crashing because of too many API connections.
A real fix will come with #678 :)
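The thread doesn't show what #628 changes, so the following is not that patch. It is only a general sketch of the idea of treating a failed watch as recoverable: reconnect with a capped exponential backoff instead of exiting. The names `watch_with_retry`, `connect`, and `handle` are hypothetical, and `connect` stands in for whatever opens the watch stream.

```python
import json
import time


def watch_with_retry(connect, handle, max_attempts=5, base_delay=0.5,
                     max_delay=30.0, sleep=time.sleep):
    """Consume watch events, reconnecting after a decode or connection error.

    connect: hypothetical callable returning an iterable of raw JSON lines.
    handle:  callable invoked with each decoded event.
    Returns the number of failed attempts once a stream ends cleanly;
    re-raises the last error after max_attempts failures.
    """
    failures = 0
    while True:
        try:
            for raw in connect():
                handle(json.loads(raw))
            return failures  # stream ended without an error
        except (json.JSONDecodeError, ConnectionError):
            failures += 1
            if failures >= max_attempts:
                raise  # give up and surface the error to the caller
            # Double the wait after each failure, up to max_delay seconds.
            sleep(min(max_delay, base_delay * 2 ** (failures - 1)))
```

A real client would also need to decide which resourceVersion to resume from after reconnecting; this sketch leaves that out.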

jimmycuadra · 2016-09-22T20:26:15Z

Sweet! If you can remember, please let us know when the changes in #628 get rolled out to the latest tag on the Docker Hub image.

jimmycuadra · 2016-11-09T02:15:02Z

Still having this problem with the latest image on Docker Hub. It seems the problem either wasn't fixed or a new problem is causing the same symptom. #732 seems to have the latest discussion.

rio · 2016-11-15T16:47:29Z

I'm having the same problems. For us they mostly surface when uploading large files through traefik.

Edit: I meant it got noticed through uploading. I don't think it's related to the large files. I'm getting the same panics in the api server as @emilevauge pointed out in #732

jaygorrell · 2016-11-27T22:54:56Z

Tried out the new v1.1.0 Traefik image and still have containers dying off in Kubernetes 1.4.6 with this problem.

george-angel · 2016-11-27T23:02:34Z

+1

jimmycuadra · 2016-12-01T11:41:43Z

If #836 is the fix for this, it won't be fixed until Traefik 1.2.

jonaz · 2016-12-19T07:36:48Z

Is this really fixed in v1.1.2 as the changelog states? @emilevauge


emilevauge added the bug label Jul 18, 2016

vdemeester added the area/provider/k8s/ingress label Jul 19, 2016

yvespp mentioned this issue Nov 13, 2016

Migrate k8s to kubernetes/client-go #836

Merged

vdemeester closed this as completed in #836 Dec 1, 2016

ldez added the kind/bug/confirmed label Apr 29, 2017

ibuziuk added a commit to ibuziuk/fabric8-oso-proxy that referenced this issue Jun 5, 2018

rh-che traefik#532: Adding workaround for processing 'exec' urls generated by kubernetes-client

264d9a9

Signed-off-by: Ilya Buziuk <[email protected]>

ibuziuk added a commit to ibuziuk/fabric8-oso-proxy that referenced this issue Jun 6, 2018

rh-che traefik#532: Adding extra workarounds for identity_id processing (preserve order of query params / supporting & in path)

dbf05c7

Signed-off-by: Ilya Buziuk <[email protected]>

ibuziuk added a commit to ibuziuk/fabric8-oso-proxy that referenced this issue Jun 6, 2018

rh-che traefik#532 Adding workaround with extra query parameters processing

b66f1fc

Signed-off-by: Ilya Buziuk <[email protected]>

ibuziuk mentioned this issue Jun 6, 2018

rh-che #532 Adding workaround with extra query parameters processing fabric8-services/fabric8-oso-proxy#26

Merged

2 tasks

aslakknutsen pushed a commit to fabric8-services/fabric8-oso-proxy that referenced this issue Jun 6, 2018

rh-che traefik#532 Adding workaround with extra query parameters processing (#26)

90e72b8

Signed-off-by: Ilya Buziuk <[email protected]>

nurali-techie pushed a commit to nurali-techie/fabric8-oso-proxy that referenced this issue Jun 26, 2018

rh-che traefik#532 Adding workaround with extra query parameters processing (fabric8-services#26)

e1a1c48

Signed-off-by: Ilya Buziuk <[email protected]>

traefik locked and limited conversation to collaborators Sep 1, 2019

traefiker added the status/5-frozen-due-to-age label Sep 1, 2019
