IPv6 Socket test failures on Travis #27788

Closed
martinholters opened this issue Jun 26, 2018 · 9 comments
Labels
ci (Continuous integration), test (This change adds or pertains to unit tests)

Comments

@martinholters
Member

In https://travis-ci.org/JuliaLang/julia/jobs/396261047, there are two interesting failures in the Sockets tests:

  Got exception UDP send failed: address not available (EADDRNOTAVAIL) outside of a @test
  UDP send failed: address not available (EADDRNOTAVAIL)
  Stacktrace:
   [1] try_yieldto(::typeof(Base.ensure_rescheduled), ::Base.RefValue{Task}) at ./event.jl:196
   [2] wait() at ./event.jl:255
   [3] wait(::Condition) at ./event.jl:46
   [4] stream_wait(::Sockets.UDPSocket, ::Condition) at ./stream.jl:47
   [5] send(::Sockets.UDPSocket, ::Sockets.IPv6, ::UInt16, ::String) at /home/travis/build/JuliaLang/julia/usr/share/julia/stdlib/v0.7/Sockets/src/Sockets.jl:355
   [6] top-level scope at /tmp/julia/share/julia/stdlib/v0.7/Sockets/test/runtests.jl:237

and

  Got exception InexactError(:trunc, UInt16, 65536) outside of a @test
  InexactError: trunc(UInt16, 65536)
  Stacktrace:
   [1] throw_inexacterror(::Symbol, ::Type, ::Any) at ./boot.jl:567
   [2] checked_trunc_uint at ./boot.jl:597 [inlined]
   [3] toUInt16 at ./boot.jl:669 [inlined]
   [4] Type at ./boot.jl:720 [inlined]
   [5] convert at ./number.jl:7 [inlined]
   [6] Type at /home/travis/build/JuliaLang/julia/usr/share/julia/stdlib/v0.7/Sockets/src/IPAddr.jl:249 [inlined]
   [7] Type at /home/travis/build/JuliaLang/julia/usr/share/julia/stdlib/v0.7/Sockets/src/IPAddr.jl:253 [inlined]
   [8] listenany(::Sockets.IPv6, ::UInt16) at /home/travis/build/JuliaLang/julia/usr/share/julia/stdlib/v0.7/Sockets/src/Sockets.jl:571

My guess is that the +1 on these lines may result in the invalid port 65536, causing the respective exceptions:

bind(b, ip"127.0.0.1", randport + 1)
c = Condition()
tsk = @async begin

addr = InetAddr(addr.host, addr.port + 1)
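The suspected failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual test or library code; the helper name `port_plus_one_fails` is hypothetical. Adding the literal `1` to a `UInt16` promotes the result to `Int`, so `65535 + 1` becomes `65536`, and converting that back to `UInt16` throws the `InexactError` shown in the second stack trace.

```julia
# Hypothetical helper (not part of Sockets): does `port + 1` still fit in a UInt16?
function port_plus_one_fails(port::UInt16)
    next = port + 1          # UInt16 + Int promotes to Int, so there is no silent wraparound
    try
        UInt16(next)         # throws InexactError when next == 65536
        return false
    catch err
        err isa InexactError || rethrow()
        return true
    end
end

port_plus_one_fails(typemax(UInt16))  # the one port value for which the conversion fails
port_plus_one_fails(UInt16(4000))     # an ordinary port converts back without error
```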

@martinholters changed the title from "Sockets: In valid port due to +1?" to "Sockets: Invalid port due to +1?" on Jun 26, 2018
@StefanKarpinski
Member

These should both probably wrap around at typemax(UInt16). cc @Keno, @vtjnash

@Keno
Member

Keno commented Jun 26, 2018

These should both probably wrap around at typemax(UInt16)

Yes, that seems right, though it should throw an error if it wraps all the way around, and we may want to skip the privileged ports.
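One way the wrap-around described here could look is sketched below. This is an illustration under stated assumptions, not the fix that was applied to Sockets: the names `FIRST_UNPRIVILEGED`, `next_port`, and `find_port` are hypothetical, `try_bind` is a caller-supplied predicate standing in for the real bind attempt, and treating ports below 1024 as privileged is the conventional Unix cutoff.

```julia
# Hypothetical sketch: probe ports starting at `start`, wrap at typemax(UInt16),
# skip the privileged range, and fail once the search is back where it began.
const FIRST_UNPRIVILEGED = UInt16(1024)   # assumption: ports below 1024 are privileged

# UInt16 + UInt16 stays UInt16, so the typemax case is handled explicitly.
next_port(p::UInt16) = p == typemax(UInt16) ? FIRST_UNPRIVILEGED : p + one(UInt16)

function find_port(try_bind, start::UInt16)
    start >= FIRST_UNPRIVILEGED || throw(ArgumentError("start must be an unprivileged port"))
    port = start
    while true
        try_bind(port) && return port     # `try_bind` returns true when the port was free
        port = next_port(port)
        port == start && error("no free port: wrapped all the way around")
    end
end

# Usage with a stand-in predicate that pretends only port 1024 is free:
find_port(p -> p == FIRST_UNPRIVILEGED, typemax(UInt16))
```

Keeping the arithmetic in `UInt16` and special-casing `typemax(UInt16)` avoids the promotion to `Int` that produces the out-of-range value 65536.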

@Keno
Member

Keno commented Jul 1, 2018

Actually, @vtjnash points out that for this to happen, literally all ports need to be taken (since we start listenany at a random port between 2000 and 4000). It would be good to temporarily change this test to dump the output of netstat -an when it fails, so we can get an idea of why the ports are all taken.
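A diagnostic of that kind could be wired in roughly as follows. This is a hedged sketch only: `with_netstat_on_failure` and its `io` keyword are hypothetical names, and it assumes a `netstat` executable is on the `PATH` of the CI image (the dump is skipped otherwise).

```julia
# Hypothetical wrapper: run `f`; if it throws, dump `netstat -an` to `io`, then rethrow.
function with_netstat_on_failure(f; io::IO = stdout)
    try
        return f()
    catch
        # Only attempt the dump when netstat is actually installed.
        if Sys.which("netstat") !== nothing
            # ignorestatus keeps a non-zero netstat exit code from masking the original error
            run(pipeline(ignorestatus(`netstat -an`); stdout = io))
        end
        rethrow()
    end
end

# Usage: wrap the part of the test that may fail.
with_netstat_on_failure() do
    # the failing listenany / send calls would go here
    nothing
end
```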

@c42f
Member

c42f commented Jan 30, 2019

I've come across this a lot lately. Some more examples:

https://travis-ci.org/JuliaLang/julia/jobs/486248172
https://travis-ci.org/JuliaLang/julia/jobs/486248173

[edit: seems like those links are now broken after restarting the travis jobs :-( ]

@c42f
Member

c42f commented Jan 30, 2019

Ok after some digging I think this is because GCE (and hence Travis) doesn't support ipv6. Not even on the loopback device :-(

To repro this on a Linux machine, disable IPv6:

sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1

Then run the Sockets tests.
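Before running the full test suite, a quick probe can tell whether IPv6 loopback is usable on the machine. The sketch below is an illustration, not part of the Sockets tests; the helper name `ipv6_loopback_available` is hypothetical, and it assumes that `bind` on a `UDPSocket` returns `false` (rather than throwing) when the address is not available.

```julia
using Sockets

# Hypothetical helper: can a UDP socket be bound to the IPv6 loopback address?
function ipv6_loopback_available()
    sock = UDPSocket()
    try
        return bind(sock, ip"::1", 0)   # port 0 lets the OS choose a free port
    catch
        return false                    # treat any bind error as "not available"
    finally
        close(sock)
    end
end

ipv6_loopback_available()
```

With IPv6 disabled via the sysctl commands above, this probe would be expected to return `false`.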

See also
travis-ci/travis-ci#3302
restify/node-restify#1545 (comment)

@c42f
Member

c42f commented Jan 31, 2019

GCE (and hence Travis) doesn't support ipv6

... It's more complicated than this, as successful Travis jobs also appear to run on the same GCE infrastructure, at least on the face of it. But regardless of that, the repro seems solid.

@vtjnash
Member

vtjnash commented Jan 31, 2019

Yes, we run the inverse of that line to re-enable IPv6 support after TravisCI executes approximately the commands you gave to disable it. That's why the tests usually pass, since stock GCE in the default configuration doesn't fail these tests. It's also why we run with sudo-enabled builds (where we can undo their mistake) and are unable to run in the containerized environments.

sudo sh -c "echo 0 > /proc/sys/net/ipv6/conf/lo/disable_ipv6";

this is because GCE

It's not, although it seems like something must be different about those machines. This seems to be a misinformation campaign spread by CI companies (CircleCI makes the same unnecessary claim*). The information in travis-ci/travis-ci#3302 is pretty absurd, since they blame the cloud host, GCE, AWS, etc. for the lack of IPv6 on loopback, but the file that disables it is actually something that they inject into the machine configuration and was not present in the normal GCE environment.

* similar discussions on CircleCI include https://discuss.circleci.com/t/ipv6-support/13571/9, https://ideas.circleci.com/ideas/CCI-I-571. Although, IIRC, we stopped using them because we couldn't run the Socket tests there.

@c42f
Member

c42f commented Jan 31, 2019

Thanks Jameson. I see you added that Travis tweak in #22986. People saw similar symptoms way back in #13076.

Even with the sudo build and the line to re-enable IPv6, it looks like IPv6 is sporadically unavailable (at least, turning it off locally gives the exact same error messages that I've seen on these failed builds).

@c42f changed the title from "Sockets: Invalid port due to +1?" to "Intermittent Socket test failures on Travis with IPv6" on Feb 12, 2019
@c42f changed the title from "Intermittent Socket test failures on Travis with IPv6" to "IPv6 Socket test failures on Travis" on Feb 12, 2019
@c42f added the test and ci labels on Feb 12, 2019
@c42f
Member

c42f commented Sep 18, 2019

Well since Travis use is discontinued now, I guess we can close this 🎉

@c42f c42f closed this as completed Sep 18, 2019