Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let pasta configure interface, fix IPv6 outbound connectivity #458

Merged
merged 2 commits into from
Aug 17, 2024

Conversation

sbrivio-rh
Copy link
Contributor

@@ -187,7 +187,7 @@ func setupCopyDir(driver copyup.ChildDriver, dirs []string) (bool, error) {
// setupNet sets up the network driver.
//
// NOTE: msg is altered during calling driver.ConfigureNetworkChild
func setupNet(stateDir string, msg *messages.ParentInitNetworkDriverCompleted, etcWasCopied bool, driver network.ChildDriver, detachedNetNSPath string) error {
func setupNet(stateDir string, msg *messages.ParentInitNetworkDriverCompleted, etcWasCopied bool, driver network.ChildDriver, driverConfiguresNet bool, detachedNetNSPath string) error {
Copy link
Member

@AkihiroSuda AkihiroSuda Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably ChildDriver should have a method like ChildDriverInfo() , then setupNet does not need to have the driverConfiguresNet arg

@sbrivio-rh
Copy link
Contributor Author

Oops, looking into UDP failure now...

//
// Workaround: set the kernel.apparmor_restrict_unprivileged_userns
// sysctl to 0, or (preferred) add the AppArmor profile from upstream,
// or from Debian packages, or from Ubuntu > 24.10.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AppArmor stuff can be also a separate PR

@AkihiroSuda
Copy link
Member

CI is failing https://github.com/rootless-containers/rootlesskit/actions/runs/10418395143/job/28854469268?pr=458

+ rootlesskit --net=pasta --copy-up=/etc --copy-up=/run --disable-host-loopback -- nslookup example.com
time="2024-0[8](https://github.com/rootless-containers/rootlesskit/actions/runs/10418395143/job/28854469268?pr=458#step:15:9)-16T10:22:22Z" level=warning msg="[rootlesskit:parent] \"pasta\" network driver is experimental. Needs very recent version of pasta (see docs/network.md)."
;; UDP setup with 10.0.2.3#53(10.0.2.3) for example.com failed: network unreachable.
;; no servers could be reached

;; UDP setup with [10](https://github.com/rootless-containers/rootlesskit/actions/runs/10418395143/job/28854469268?pr=458#step:15:11).0.2.3#53(10.0.2.3) for example.com failed: network unreachable.
;; no servers could be reached

;; UDP setup with 10.0.2.3#53(10.0.2.3) for example.com failed: network unreachable.
;; no servers could be reached


[rootlesskit:child ] error: command [nslookup example.com] exited: exit status 1
[rootlesskit:parent] error: child exited: exit status 1

@@ -112,11 +108,9 @@ func (d *parentDriver) ConfigureNetwork(childPID int, stateDir, detachedNetNSPat

opts := []string{
"--foreground",
"--stderr",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See commit message:

Drop --stderr as well: it doesn't do anything anymore, and it has
been obsoleted in pasta for a while (it will always print to stderr
when starting in foreground anyway).

That was deprecated in passt's commit bca0fefa32e0 ("conf, passt: Make --stderr do nothing, and deprecate it"), included in upstream version 2024_06_24.1ee2eca. We can leave it for backwards compatibility, but in general we were already printing most stuff on stderr when running with --foreground anyway.

@AkihiroSuda AkihiroSuda added this to the v2.2.1 milestone Aug 16, 2024
@sbrivio-rh
Copy link
Contributor Author

Oops, looking into UDP failure now...

Weird, that works for me on Debian unstable, passt-0.0~git20240814.61c0b0d-1:

$ rootlesskit --net=pasta --copy-up=/etc --copy-up=/run --disable-host-loopback -- nslookup example.com
WARN[0000] [rootlesskit:parent] "pasta" network driver is experimental. Needs very recent version of pasta (see docs/network.md). 
Server:		10.0.2.3
Address:	10.0.2.3#53

Non-authoritative answer:
Name:	example.com
Address: 93.184.215.14
Name:	example.com
Address: 2606:2800:21f:cb07:6820:80da:af6b:8b2c

@sbrivio-rh
Copy link
Contributor Author

CI is failing https://github.com/rootless-containers/rootlesskit/actions/runs/10418395143/job/28854469268?pr=458

+ rootlesskit --net=pasta --copy-up=/etc --copy-up=/run --disable-host-loopback -- nslookup example.com
time="2024-0[8](https://github.com/rootless-containers/rootlesskit/actions/runs/10418395143/job/28854469268?pr=458#step:15:9)-16T10:22:22Z" level=warning msg="[rootlesskit:parent] \"pasta\" network driver is experimental. Needs very recent version of pasta (see docs/network.md)."
;; UDP setup with 10.0.2.3#53(10.0.2.3) for example.com failed: network unreachable.
;; no servers could be reached

;; UDP setup with [10](https://github.com/rootless-containers/rootlesskit/actions/runs/10418395143/job/28854469268?pr=458#step:15:11).0.2.3#53(10.0.2.3) for example.com failed: network unreachable.
;; no servers could be reached

;; UDP setup with 10.0.2.3#53(10.0.2.3) for example.com failed: network unreachable.
;; no servers could be reached


[rootlesskit:child ] error: command [nslookup example.com] exited: exit status 1
[rootlesskit:parent] error: child exited: exit status 1

I saw it :( it works for me on Debian unstable, passt-0.0~git20240814.61c0b0d-1:

$ rootlesskit --net=pasta --copy-up=/etc --copy-up=/run --disable-host-loopback -- nslookup example.com
WARN[0000] [rootlesskit:parent] "pasta" network driver is experimental. Needs very recent version of pasta (see docs/network.md). 
Server:		10.0.2.3
Address:	10.0.2.3#53

Non-authoritative answer:
Name:	example.com
Address: 93.184.215.14
Name:	example.com
Address: 2606:2800:21f:cb07:6820:80da:af6b:8b2c

@sbrivio-rh
Copy link
Contributor Author

CI is failing https://github.com/rootless-containers/rootlesskit/actions/runs/10418395143/job/28854469268?pr=458

I can reproduce this using the rootlesskit:test-integration image. The outcome depends on the timing, sometimes I get:

$ rootlesskit --net=pasta --copy-up=/etc --copy-up=/run --disable-host-loopback -- nslookup example.com
WARN[0000] [rootlesskit:parent] "pasta" network driver is experimental. Needs very recent version of pasta (see docs/network.md). 
;; UDP setup with 10.0.2.3#53(10.0.2.3) for example.com failed: network unreachable.
;; no servers could be reached

;; UDP setup with 10.0.2.3#53(10.0.2.3) for example.com failed: network unreachable.
;; no servers could be reached

;; UDP setup with 10.0.2.3#53(10.0.2.3) for example.com failed: network unreachable.
;; no servers could be reached

sometimes:

$ rootlesskit --net=pasta --copy-up=/etc --copy-up=/run --disable-host-loopback -- nslookup example.com
WARN[0000] [rootlesskit:parent] "pasta" network driver is experimental. Needs very recent version of pasta (see docs/network.md). 
;; UDP setup with 10.0.2.3#53(10.0.2.3) for example.com failed: network unreachable.
;; no servers could be reached

;; UDP setup with 10.0.2.3#53(10.0.2.3) for example.com failed: network unreachable.
;; no servers could be reached

Server:		10.0.2.3
Address:	10.0.2.3#53

Non-authoritative answer:
Name:	example.com
Address: 93.184.215.14
Name:	example.com
Address: 2606:2800:21f:cb07:6820:80da:af6b:8b2c

(second attempt worked here). And sometimes:

$ rootlesskit --net=pasta --copy-up=/etc --copy-up=/run --disable-host-loopback -- nslookup example.com
WARN[0000] [rootlesskit:parent] "pasta" network driver is experimental. Needs very recent version of pasta (see docs/network.md). 
Server:		10.0.2.3
Address:	10.0.2.3#53

Non-authoritative answer:
Name:	example.com
Address: 93.184.215.14
Name:	example.com
Address: 2606:2800:21f:cb07:6820:80da:af6b:8b2c

that's because if we drop the invocation of a number of ip(8) commands, the container will be ready sooner, but pasta isn't necessarily ready to handle traffic.

The problem here is that with --foreground and cmd.Start(), you don't have a synchronisation point: pasta forks when it's ready. But if it doesn't fork, you don't know when.

I would switch this to the same mechanism that Podman uses, that is, exec.Command() without --foreground. Once we're done with exec.Command(), we know we can proceed, and pasta will exit anyway once the namespace is gone, so there's no need for explicit cleanup. Would you see any issue with that?

@AkihiroSuda
Copy link
Member

I would switch this to the same mechanism that Podman uses, that is, exec.Command() without --foreground. Once we're done with exec.Command(), we know we can proceed, and pasta will exit anyway once the namespace is gone, so there's no need for explicit cleanup. Would you see any issue with that?

SGTM, if an abnormal exit status of pasta can be still caught

@sbrivio-rh sbrivio-rh force-pushed the master branch 2 times, most recently from 3e32587 to a3fa42c Compare August 16, 2024 13:42
@sbrivio-rh
Copy link
Contributor Author

I would switch this to the same mechanism that Podman uses, that is, exec.Command() without --foreground. Once we're done with exec.Command(), we know we can proceed, and pasta will exit anyway once the namespace is gone, so there's no need for explicit cleanup. Would you see any issue with that?

SGTM, if an abnormal exit status of pasta can be still caught

Yes, that's done anyway by CombinedOutput(). I just pushed the relevant commit. I still have this open:

Probably ChildDriver should have a method like ChildDriverInfo() , then setupNet does not need to have the driverConfiguresNet arg

but the rest should be fixed.

pasta(1), like many daemons, will fork to background only once it's
ready to operate (handle traffic). On top of that, it doesn't need an
explicit clean-up phase (unless --no-netns-quit is given), because it
will terminate as soon as the target network namespace goes away.

If we make it run in foreground and proceed as soon as we invoked the
command (in background), we'll claim we're done too early, and tests
such as integration-net.sh will occasionally fail, because pasta is
still initialising sockets and interfaces as UDP packets are sent.

Drop --foreground, and use CombinedOutput() instead of the Start()
method of exec.Command(). Drop the explicit cleanup steps, too.

Signed-off-by: Stefano Brivio <[email protected]>
Dockerfile Outdated
@@ -1,4 +1,4 @@
ARG GO_VERSION=1.22
ARG GO_VERSION=1.23
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be a bad rebase.
This is already present in the master branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I just happened to go back a bit with a rebase and SHA changed, fixed now.

@sbrivio-rh
Copy link
Contributor Author

I still have this open:

Probably ChildDriver should have a method like ChildDriverInfo() , then setupNet does not need to have the driverConfiguresNet arg

Added as WIP commit, I'm not quite sure what I'm doing (I use Go very rarely) but it actually works, polishing up now.

pkg/api/api.go Outdated

type ChildDriverInfo struct {
ConfiguresInterface bool `json:"configuresInterface"`
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that this has to be published to the REST API

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I also wanted to avoid that, but I have no idea where else to define this in a way that can be used by all the drivers.

@sbrivio-rh sbrivio-rh changed the title Let pasta configure interface, fix IPv6 outbound connectivity, update comment about AppArmor issue Let pasta configure interface, fix IPv6 outbound connectivity Aug 16, 2024
@@ -24,4 +24,6 @@ type ChildDriver interface {
// netmsg MAY be modified.
// devName is like "tap" or "eth0"
ConfigureNetworkChild(netmsg *messages.ParentInitNetworkDriverCompleted, detachedNetNSPath string) (devName string, err error)

ChildDriverInfo() (*api.ChildDriverInfo, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ChildDriverInfo struct can be defined in this pkg/network, not in pkg/api.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return nil, common.Seq(cleanups), fmt.Errorf("executing %v: %w", cmd, err)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will happen if the background pasta process dies after several minutes/hours/days?
Can RootlessKit detect that death?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can't, but if it happens (even though we haven't received this kind of "stability" bug reports for quite a long time now), we would like users to report an issue, because anyway context will be lost (TCP connections, at least), rather than hide this by restarting the process or similar.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pasta process can be killed by the kernel mostly due to OOM, and in that case RootlessKit should probably abort, or at least print an error message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty much impossible because it doesn't allocate memory dynamically, so I don't think it can actually be picked as a candidate by the OOM killer.

Anyway, the network interface will go away in that case, because we close the tap file descriptor, so something like that could actually be added.

I don't think it's actually needed, because the network interface going away is very obvious, but I guess I can implement it if needed (that will take a bit longer though).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AkihiroSuda I'm trying to implement this with the LinkSubscribe() method from github.com/vishvananda/netlink.

That part looks trivial, but I don't know where to add this kind of monitor in rootlesskit. I thought there was something like that for slirp4netns' --exit-fd, but I can't find it. How would you detect that slirp4netns terminates?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we lack the death detection for slirp4netns too 😅

Maybe we can use pidfd for kernel >= 5.3.
https://man7.org/linux/man-pages/man2/pidfd_open.2.html

A PID file descriptor can be monitored using poll(2), select(2), and epoll(7).

If we want to implement this too for old kernel, probably just watch SIGCHLD and see if the PID of (pasta|slirp4netns) is still valid

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we lack the death detection for slirp4netns too 😅

I feel better now. 😄

I think we should leave out PID tracking, because it keeps the interface to pasta very simple (also in terms of AppArmor and SELinux), and we have a valid, solid alternative (netlink monitor on the network interface). Anyway, this is the simple part.

The complicated part for me, whatever we choose as monitor, is where and how to add this to rootlesskit. I really don't know how it works with... threads? Should we spawn a thread somewhere? Is there something like that already in rootlesskit?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't seem to have any monitoring stuff in the current code base, but it can be probably spawned as a goroutine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tip, I'll give it a try, but with very low priority, so it might take a while.

pkg/api/api.go Outdated
@@ -32,3 +33,7 @@ type PortDriverInfo struct {
Protos []string `json:"protos"`
DisallowLoopbackChildIP bool `json:"disallowLoopbackChildIP,omitempty"` // since API v1.1.1
}

type ChildDriverInfo struct {
ConfiguresInterface bool `json:"configuresInterface"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a comment line to explain what this means

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added (in network.go, now).

pkg/api/api.go Outdated
@@ -32,3 +33,7 @@ type PortDriverInfo struct {
Protos []string `json:"protos"`
DisallowLoopbackChildIP bool `json:"disallowLoopbackChildIP,omitempty"` // since API v1.1.1
}

type ChildDriverInfo struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be part of the REST API and should be moved to pkg/network

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tip, moved there.

…ta do that

As reported in moby/moby#48257, when Docker
rootless uses pasta through rootlesskit for user-mode connectivity,
IPv6 can't be used for outbound connections because no addresses and
no routes are configured in the container.

The reason is that rootlesskit won't configure IPv6 addresses on the
interface, and at the same time it doesn't ask pasta to do so using
the --config-net option.

Add a ChildDriverInfo() method to childDriver, returning a single
piece of information, that is, whether the driver configures the
network interface by itself, which is true only for pasta. If that's
the case, there's no reason to call activateDev() from setupNet().

Further, in the pasta driver, skip the call to PrepareTap(), because
pasta can take care of that as well.

At the same time, ask pasta to do all that: set up the tap device,
and configure IPv4 and IPv6, using --config-net.

While at it, drop options --no-ra and --no-dhcp, as the container
might want to send router solicitations and DHCP requests even if we
permanently configure IPv4 and IPv6 addresses and routes, and
there's no reason to ignore those requests.

Drop --stderr as well: it doesn't do anything anymore, and it has
been obsoleted in pasta for a while (it will always print to stderr
when starting in foreground anyway).

Link: moby/moby#48257
Signed-off-by: Stefano Brivio <[email protected]>
@@ -147,21 +141,18 @@ func (d *parentDriver) ConfigureNetwork(childPID int, stateDir, detachedNetNSPat
// `Couldn't open user namespace /proc/51813/ns/user: Permission denied`
// Possibly related to AppArmor.
cmd := exec.Command(d.binary, opts...)
cmd.Stdout = d.logWriter
cmd.Stderr = d.logWriter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the stderr now ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all in out -- that's what out, err := cmd.CombinedOutput() does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@AkihiroSuda AkihiroSuda merged commit b5b8e22 into rootless-containers:master Aug 17, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Docker rootless and outgoing IPv6
2 participants