WIP: make it so that journald logs are ascribed to containers #247

nalind · 2017-05-10T21:34:29Z

Looking for feedback on this.

Currently, when we log container output to the journal, dockerd does the work of forwarding each piece of output to journald. For every piece of data that journald receives over its native socket, it looks up the unit of the sending process and applies rate-limiting logic to all of the data it receives from that unit. Because dockerd is the sender for every container, all container output is rate-limited as a single group. As a result, one container spewing massive numbers of log messages can effectively cause another container's output to be squelched.
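To make the failure mode concrete, here is a toy model of per-unit rate limiting. It is only a sketch: the fixed-window counter, the burst of 1000, and the unit name "docker.service" are assumptions for illustration, not journald's actual algorithm or configuration.

```go
package main

import "fmt"

// limiter is a toy fixed-window limiter: at most burst messages per key
// within one window. journald's real logic is more involved.
type limiter struct {
	burst int
	seen  map[string]int
}

func newLimiter(burst int) *limiter {
	return &limiter{burst: burst, seen: map[string]int{}}
}

// allow reports whether one more message from key fits in the window.
func (l *limiter) allow(key string) bool {
	if l.seen[key] >= l.burst {
		return false
	}
	l.seen[key]++
	return true
}

// quietDelivered sends `noisy` messages from a noisy container, then one
// message from a quiet container, and reports whether the quiet one got
// through. keyFor maps a container to the unit the limiter attributes it to.
func quietDelivered(keyFor func(container string) string, noisy int) bool {
	l := newLimiter(1000) // hypothetical burst size
	for i := 0; i < noisy; i++ {
		l.allow(keyFor("noisy"))
	}
	return l.allow(keyFor("quiet"))
}

func main() {
	// Everything attributed to the daemon's unit (assumed name).
	shared := func(string) string { return "docker.service" }
	// Each container attributed to its own scope.
	perScope := func(c string) string { return "docker-" + c + ".scope" }

	fmt.Println(quietDelivered(shared, 5000))   // false: quiet container is squelched
	fmt.Println(quietDelivered(perScope, 5000)) // true: limits are independent
}
```

With a shared key the quiet container's message is dropped; with one key per scope it is not, which is the effect this PR is after.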

This PR modifies dockerd's journald logging driver to add a new process to each container's scope. That process's sole job is to receive messages over a pipe and forward them to the journal. The dockerd journald logging driver starts that process along with the container, feeds it each message, and stops it along with the container.

In my limited testing with journalctl -o export after this is applied, the journal is attributing log messages to the container's scope, so journalctl -u docker-$CONTAINERID.scope should also produce the container's logs.

When we start a container, stash a copy of the runtime spec that we generated for it in the container object. Signed-off-by: Nalin Dahyabhai <[email protected]>

Make logger.Context into an object that also carries the cgroup path of the container for which it's logging, and initialize the field when we start a container. Signed-off-by: Nalin Dahyabhai <[email protected]>

nalind · 2017-05-10T21:35:12Z

(5) PoC dockerd logging to journald in the container's cgroup

rhatdan · 2017-05-11T12:33:09Z

@vbatts @runcom @mrunalp PTAL

nalind · 2017-05-11T13:56:06Z

Hmm, we don't have the "unified" controller on RHEL 7, and I suppose it's better to try to continue after failing to join the container's cgroup than to just exit and let the daemon log pipe-closed errors each time it tries to log some output.

nalind · 2017-05-11T14:22:06Z

Factored out the logging of error messages in the helper, and we're now joining the container's cgroup in all available controllers.

rh-atomic-bot · 2017-05-11T14:44:41Z

RHEL system level integration testing for de52adfd9b5dc7d733f0f27e78fc1dcaca3a289e- FAIL

Fedora system level integration testing for de52adfd9b5dc7d733f0f27e78fc1dcaca3a289e- FAIL

Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/27/247-system-level-results.txt

rh-atomic-bot · 2017-05-11T15:44:42Z

RHEL system level integration testing for 927491872c8ff1b2232e9a9349d27b91bbf48b62- FAIL

Fedora system level integration testing for 927491872c8ff1b2232e9a9349d27b91bbf48b62- FAIL

Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/28/247-system-level-results.txt

nalind · 2017-05-11T17:38:17Z

Hmm, for some reason I'm seeing device-busy errors after exiting the container. That suggests the log helper is keeping the container's filesystem busy, which is rather unexpected.

rh-atomic-bot · 2017-05-11T18:05:24Z

RHEL system level integration testing for b5077e115714f5e130678da07cebe459a3bf5eee- FAIL

Fedora system level integration testing for b5077e115714f5e130678da07cebe459a3bf5eee- FAIL

Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/29/247-system-level-results.txt

nalind · 2017-05-11T18:28:02Z

Scratch that, I'm seeing that with json-file as well, so it's probably not a bug that's being introduced here.

Previously, dockerd just relayed messages by itself from containers to the journal, which caused journald to apply rate limiting to messages across all containers as a single group. Here, we add another process to each container's cgroup, and have dockerd forward messages to that process over a pipe. That process, named "journal-logger", receives the messages and sends them on to the journal. As part of the container's cgroup, it's killed when the main process exits, so we only need to close the pipe and read its exit status when closing the logger. Signed-off-by: Nalin Dahyabhai <[email protected]>

rh-atomic-bot · 2017-05-11T19:06:14Z

RHEL system level integration testing for 62f4ebc42ee4bbdfc46a05750d8f515955c4e1ca- FAIL

Fedora system level integration testing for 62f4ebc42ee4bbdfc46a05750d8f515955c4e1ca- FAIL

Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/30/247-system-level-results.txt

mrunalp · 2017-05-17T16:49:20Z

daemon/logger/journald/journald.go

+	// to assume that we know how to compute the same path that runc is
+	// going to use, based on a value of the form "parent:docker:ID", where
+	// the "docker" is literal.
+	parts := strings.Split(scope, ":")


This will only work when cgroupdriver is set to systemd. We also need to handle cgroupfs (which is the upstream default) if we plan to try and upstream this.

You could also look into using the libcontainer cgroups library for this, as it handles both drivers.

nalind · 2017-05-17

Doing the derivation right using libcontainer makes sense, since it's already a dependency. I'll have another look at that. Though that does leave open a question: how often do we see a running journald when systemd isn't managing cgroups?
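For reference, the distinction being discussed can be sketched as below. This is an illustration only, not the PR's code and not libcontainer's API: `scopeUnit` is a hypothetical helper, and it assumes the systemd-style value has exactly the "parent:docker:ID" shape from the diff and maps to a `docker-ID.scope` unit as in the PR description. Anything else is treated as a plain cgroupfs path.

```go
package main

import (
	"fmt"
	"strings"
)

// scopeUnit splits a systemd-style cgroups path of the form
// "parent:prefix:ID" into the parent slice and a "prefix-ID.scope" unit
// name. ok is false for any other shape, for example a cgroupfs-style
// path, which the caller then has to handle separately.
func scopeUnit(cgroupsPath string) (slice, unit string, ok bool) {
	parts := strings.Split(cgroupsPath, ":")
	if len(parts) != 3 {
		return "", "", false
	}
	return parts[0], parts[1] + "-" + parts[2] + ".scope", true
}

func main() {
	for _, p := range []string{"system.slice:docker:abc123", "/docker/abc123"} {
		if slice, unit, ok := scopeUnit(p); ok {
			fmt.Printf("systemd driver: slice=%s unit=%s\n", slice, unit)
		} else {
			fmt.Printf("not a systemd-style value, treat as a plain path: %s\n", p)
		}
	}
}
```

The fallback branch is exactly the case the review comment points out: with the cgroupfs driver there is no colon-separated value to split.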

cgwalters · 2017-05-17T17:13:16Z

I didn't read the code, but the concept of this change sounds good to me.

nalind added 2 commits May 10, 2017 17:20

Keep a copy of the container's runtime spec

668e71a


Have daemon/logger.Context carry cgroup info

b96df77


nalind force-pushed the journal-scopes branch from afc060a to de52adf Compare May 11, 2017 14:21

nalind force-pushed the journal-scopes branch from de52adf to 9274918 Compare May 11, 2017 15:15

nalind force-pushed the journal-scopes branch from 9274918 to b5077e1 Compare May 11, 2017 17:38

nalind force-pushed the journal-scopes branch from b5077e1 to 62f4ebc Compare May 11, 2017 18:39

mrunalp reviewed May 17, 2017
