
Allow multi-networkpolicy to deploy to mlab-oti with canary nodeSelector #913

Merged
nkinkade merged 4 commits into main from sandbox-kinkade on Oct 24, 2024

Conversation

@nkinkade (Contributor) commented Oct 23, 2024

The primary intent of this PR is to allow the multi-networkpolicy DaemonSet to deploy to production, but only to nodes with the special label mlab/run: multi-networkpolicy-canary. The plan is to manually label a subset of nodes, perhaps 10 across various metros, to confirm that it works as intended and does not impact performance in any way.
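For reference, manually labeling a node would look something like this (the node name is a placeholder):

    kubectl label node <node-name> mlab/run=multi-networkpolicy-canary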

Additionally, the PR modifies the memory limits for the container. I discovered that the containers in sandbox use only around 30MB of memory, but in staging they use around 120MB, sometimes spiking higher and getting the container killed for exceeding the old memory limit of 150Mi.

I also made some small modifications to the pod labels to bring them more in line with how we label other pods.



I noticed that some multi-networkpolicy containers were getting OOM-killed. Typical memory usage across all staging nodes hovers right around 125MB, with occasional spikes up to the 150MB limit that get the container OOM-killed. This commit increases the request to 150MB and sets the limit to 250MB, which should be more reasonable settings based on monitoring data.
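As a rough sketch, the corresponding resources stanza in the DaemonSet jsonnet might look like the following; the field layout is assumed, and the sizes above are expressed in Mi notation:

    resources: {
      requests: {
        memory: '150Mi',  // typical staging usage is ~125MB
      },
      limits: {
        memory: '250Mi',  // headroom above the observed spikes
      },
    },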

Additionally, I changed the "name" label on the pod to "workload", which is more consistent with how we label other pods. I removed the "app" label, since we don't use that convention and, I believe, it was unused.
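For illustration, the resulting pod labels might look something like this (exact metadata structure assumed):

    metadata: {
      labels: {
        workload: 'multi-networkpolicy',  // replaces the old "name" label; "app" was dropped
      },
    },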
For sandbox and staging it's fine to deploy multi-networkpolicy everywhere, but
for production we want to do some canaries before deploying widely to be sure
that it works as intended and that it doesn't impact performance in any way.

Also, increase the memory limit to 500Mi. I have noticed that on physical machines in sandbox multi-networkpolicy only uses around 30 or 40MB, but in staging it uses more like 120MB. The only way I can account for this is that staging has more nodes, and perhaps the service keeps track of other nodes somehow.

For now the DaemonSet has a canary nodeSelector, so it will only deploy to nodes that we manually label to match that selector.
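A minimal sketch of what the canary nodeSelector might look like in the pod spec (structure assumed; the label is the one named above):

    spec: {
      nodeSelector: {
        'mlab/run': 'multi-networkpolicy-canary',
      },
    },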
@stephen-soltesz (Contributor) left a comment

:lgtm:

An unrelated request: please tag the measurementlab/multi-networkpolicy-iptables image so we're not running an unversioned image in production.

Reviewed 2 of 2 files at r1.
Reviewable status: 0 of 1 approvals obtained (waiting on @nkinkade)


k8s/daemonsets/core/multi-networkpolicy.jsonnet line 44 at r1 (raw file):

          {
            name: 'multi-networkpolicy',
            image: 'measurementlab/multi-networkpolicy-iptables:latest',

Please additionally tag the version we're using in production.

@nkinkade (Contributor, Author) left a comment

I set up automated builds for this in Docker Hub and released v1.0.0 of our fork of the repo. The DaemonSet's container image now uses the image built against the v1.0.0 release.

Reviewable status: 0 of 1 approvals obtained (waiting on @stephen-soltesz)


k8s/daemonsets/core/multi-networkpolicy.jsonnet line 44 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

Please additionally tag the version we're using in production.

Done.
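For reference, the image line at r1 presumably now pins the release tag, along the lines of (exact tag string assumed from the v1.0.0 release mentioned above):

    image: 'measurementlab/multi-networkpolicy-iptables:v1.0.0',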

@nkinkade merged commit 9358136 into main on Oct 24, 2024 (2 of 3 checks passed)
@nkinkade deleted the sandbox-kinkade branch on October 24, 2024 at 19:40