Remove support for Canal and the vxlan Flannel backend #8614
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: johngmyers
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
017d8c1 to 96e044d
There's activity on the underlying bug, so
/test pull-kops-verify
Yeah, this thread also has some activity, and the maintainer is referenced as well: flannel-io/flannel#1245
It would be great if the flannel folks got this fixed, but there doesn't seem to be a lot of movement.
I think I'll assess the state in 2-3 weeks. If there is no progress by then, I'll un-hold. This should be a 1.17 blocker.
96e044d to 5451062
How does this affect running flannel in AWS? Even with Source/Destination Check disabled on the nodes, I was not able to get flannel with
5451062 to 6d90881
I don't see any progress.
6d90881 to 54b688f
Decision from Kops Office Hours is to defer for two weeks.
Decision of kops office hours is to proceed.
54b688f to 2b5e086
2b5e086 to 2f5a405
/retest
There's an open PR now to fix this issue at flannel-io/flannel#1282.
I've also (so far) only been able to reproduce the problem on RHEL 7.8 images.
And I actually think that #8381 might be a fix.
I'm trying to get some flannel tests going on "every distro": kubernetes/test-infra#17300
/hold
Just a data point: I tried adding the sysctls from the CentOS7 fix to a Flatcar-based cluster and didn't see any change in flannel's behaviour. Trying to figure out the best way to shoehorn the ethtool/checksum fix in for a test now...
Seems that the proper fix was accepted: kubernetes/kubernetes#88986 (comment). Getting it into already released distros will probably take a long time, if it ever happens. The patch comment says that For now the best hope for the near future is flannel-io/flannel#1282, which will require a new Flannel release.
I'm not sure the "proper" fix is the whole story here. I rolled a version of Flatcar with the cited kernel patch and see no difference in behaviour for flannel vxlan / canal.
I can, however, now confirm that using ethtool to disable offloading works.
@jhohertz just curious, how did you try Flatcar with that patch?
@hakman I forked the manifest and coreos-overlay repositories to add the patch. Running cork create for my attempt looks like: The backport had one minor difference for 4.19 in call signatures, but otherwise the logic was identical. There might be other differences that need accounting for, but the patch is so tiny... At any rate, my attempt lives here in the context of the build setup above: https://github.com/viafoura/flatcar-coreos-overlay/blob/viafoura-build-2345.3.1/sys-kernel/coreos-sources/files/4.19/z0004-netfilter-nat-never-update-the-UDP-checksum-when-it-.patch
Re: the ethtool thing, I'm going to try to adapt this into a hooks yaml fragment suitable for a cluster spec. I haven't really used that bit of functionality before; I'll share if I get a working example.
At first glance, this seems to work in a cluster spec to implement the workaround in a not terrible way. FAR from extensively tested. And of course if this is actually disabling real hardware offload, this isn't an ideal fix, but it may be the best we can get for now.
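For readers who want a starting point, here is a minimal sketch of what such a hook might look like. It is an illustration written under assumptions, not the fragment from this comment: the unit name is hypothetical, the VXLAN device is assumed to be named flannel.1, `ethtool` and `ip` are assumed to be on the node image's PATH, and the `manifest` text is assumed to be used by kops as the [Service] section of the generated systemd unit.

```yaml
spec:
  hooks:
  # Hypothetical unit name; pick anything ending in .service.
  - name: flannel-disable-tx-checksum.service
    # Deliberately no "before: kubelet.service": flannel.1 only appears
    # after kubelet has started the flannel pod, so ordering this unit
    # before kubelet would block forever.
    manifest: |
      Type=oneshot
      # Wait until flannel has created its VXLAN device (assumed: flannel.1),
      # then turn off TX checksum offload on it.
      ExecStart=/bin/sh -c 'until ip link show flannel.1 >/dev/null 2>&1; do sleep 5; done; ethtool -K flannel.1 tx-checksum-ip-generic off'
```

Because the unit runs once, the setting would be lost if flannel.1 were deleted and recreated later (for example after a flannel restart), so treat this as a stopgap to verify on your own distro rather than a durable fix.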
@jhohertz That is not such a bad idea. It may be something to think about adding to Kops as a workaround until a new version of Flannel is released.
Closing in favor of #9074
Due to flannel-io/flannel#1243
Targeting kops 1.17
Fixes #8562