Figure out the future of the security context feature flag #9067

Closed
dprotaso opened this issue Aug 14, 2020 · 12 comments

@dprotaso
Member

dprotaso commented Aug 14, 2020

This expands on #7924

Given that PR #9060 added a feature flag to unblock folks using EKS, we should determine what we want to do long term.

tl;dr exposing the PodSecurityContext may give users control beyond their own containers (i.e. run user, group, fs group), and this could have unintended side effects on our queue-proxy and, potentially, mesh sidecars.
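For illustration, a minimal hand-written sketch (not code from the PR; the UIDs/GIDs are made up) of how a pod-level security context fans out to every container in the pod, queue-proxy and any injected sidecars included, while fsGroup has no per-container counterpart:

package sketch

import corev1 "k8s.io/api/core/v1"

func int64Ptr(i int64) *int64 { return &i }

// Hypothetical pod spec a user could produce once PodSecurityContext is exposed.
var pod = corev1.PodSpec{
    SecurityContext: &corev1.PodSecurityContext{
        RunAsUser: int64Ptr(1000), // default UID for any container that doesn't set its own RunAsUser
        FSGroup:   int64Ptr(2000), // shared volumes are chowned to this group; there is no container-level equivalent
    },
    Containers: []corev1.Container{
        {Name: "user-container"}, // runs as UID 1000 and sees volumes owned by group 2000
        {Name: "queue-proxy"},    // inherits the same defaults unless its own SecurityContext overrides them
    },
}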

Originally posted by @julz in #9060

So, one thing here feels a little weird to me and seems worth discussing. This PR ends up not just exposing some new fields on the existing userContainer.SecurityContext (that bit is totally fine); it also exposes the pod security context to the user. Previously the user couldn't touch the pod's security spec, only the spec of their own containers.

I see why we need to do this -- the fsGroup is only present at the pod level, because volumes are shared -- but it seems like quite a big new bit of surface area. For example it means the Queue proxy must now run in whatever podSecurityContext the user supplies (modulo fields that it can explicitly override in the container securityContext).

A few potential upshots of that, off the top of my head:

  • this will affect the ownership of the log volumes we mount in to queue proxy (it will arbitrarily chown them, I think..).
  • any additional sidecars that are injected will also inherit these settings, unless overridden (e.g. Istio in the mesh case, I think).
  • I've had a sneaky idea in the back of my head for a while that it'd be pretty nice to have a custom Queue Proxy that uses cgroups to pause/unpause the user container when there are no requests, for a lambda-like UX -- that likely requires giving QP elevated permissions. (etc)

Anyway, this may well be fine and reasonable, but I wanted to flag this up and make sure we explicitly think it through and decide it's ok.

Originally posted by @dprotaso in #9060

I think this discussion is good and surfaces some potential upstream requirements in K8s (i.e. does an fsGroup property on the container security context make sense?). For some existing pod-spec feature flags I don't think we intend to ever change the default from Disabled; this flag may fall into the same class. Thus we should aim to at least document for operators the issues you've surfaced.

@mattmoor
Member

/area API
/kind feature
/assign @dprotaso

@knative-prow-robot knative-prow-robot added area/API API objects and controllers kind/feature Well-understood/specified features, ready for coding. labels Aug 20, 2020
@julz
Member

julz commented Sep 5, 2020

kubernetes/enhancements#1598 is quite interesting: it landed recently upstream and would, I think, remove the use case we had for being able to set fsGroup.

@dprotaso
Member Author

From @mattmoor : #7923 (comment)

cc @julz @evankanderson

I think the trick here is that setting it at the pod level then applies to the queue-proxy, which may(?) not work in all cases. It might be worth thinking those cases through and hardening the queue-proxy against them (with e2e tests) before relaxing things at this level to match the container level. This would also have implications for injected sidecars, like in mesh cases.

I wonder: does the container securityContext take precedence? If so, one way to "harden" the queue-proxy would simply be to have us specify our constraints on that container directly.
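For reference, Kubernetes gives container-level SecurityContext fields precedence over the overlapping pod-level ones (RunAsUser, RunAsGroup, RunAsNonRoot, SELinuxOptions), but fsGroup exists only at the pod level, so volume ownership can't be overridden per container. A rough sketch of what hardening the queue-proxy that way could look like; the UID and field choices here are illustrative assumptions, not what Knative ships:

package sketch

import corev1 "k8s.io/api/core/v1"

func boolPtr(b bool) *bool    { return &b }
func int64Ptr(i int64) *int64 { return &i }

// Pin queue-proxy's constraints on its own container so a user-supplied
// PodSecurityContext cannot change how it runs.
var queueProxy = corev1.Container{
    Name: "queue-proxy",
    SecurityContext: &corev1.SecurityContext{
        RunAsUser:                int64Ptr(1001), // takes precedence over any pod-level RunAsUser
        RunAsNonRoot:             boolPtr(true),
        ReadOnlyRootFilesystem:   boolPtr(true),
        AllowPrivilegeEscalation: boolPtr(false),
    },
}

Even with that in place, ownership of mounted volumes would still follow whatever fsGroup the user sets at the pod level.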

@github-actions

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 17, 2020
@dprotaso
Member Author

/lifecycle frozen

@dprotaso dprotaso reopened this Jan 18, 2021
@knative-prow-robot knative-prow-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 18, 2021
@markusthoemmes
Contributor

I was wondering why we constrain the user container's security context that much. I get why we want to constrain it at the pod level, but not allowing users to constrain their user container as much as they want (i.e. dropping capabilities, etc.) seems quite arbitrary.
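As a concrete (hypothetical) example of what's being asked for here, a user container that only tightens its own sandbox and touches nothing at the pod level:

package sketch

import corev1 "k8s.io/api/core/v1"

func boolPtr(b bool) *bool { return &b }

// Purely restrictive, container-scoped settings that the webhook could arguably
// allow without affecting queue-proxy or any sidecars. The image name is illustrative.
var userContainer = corev1.Container{
    Name:  "user-container",
    Image: "example.com/app",
    SecurityContext: &corev1.SecurityContext{
        Capabilities:             &corev1.Capabilities{Drop: []corev1.Capability{"ALL"}},
        AllowPrivilegeEscalation: boolPtr(false),
        ReadOnlyRootFilesystem:   boolPtr(true),
    },
}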

@dprotaso
Member Author

@markusthoemmes That's why I opened the issue. I'd say when the feature flag was added we were being conservative about which properties to allow.

Whatever we choose to open up, we should make sure it's captured in the API spec.

@evankanderson
Member

/kind api-change

/triage accepted

/assign @evankanderson

@knative-prow-robot knative-prow-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API triage/accepted Issues which should be fixed (post-triage) labels Mar 22, 2021
@dprotaso
Member Author

dprotaso commented Jun 12, 2021

related: #10812, #10814

/assign @dprotaso
/milestone 0.24.x

I'll revisit this for 0.24.x and determine if we should make some follow up issues.

@dprotaso
Member Author

cc @julz @markusthoemmes @evankanderson

As part of the spec work I went over the container security context and figured it was worth adding RunAsGroup to complement RunAsUser.

Here are the allowable fields. Note that Capabilities is masked to only allow dropping capabilities:

out.Capabilities = in.Capabilities
out.ReadOnlyRootFilesystem = in.ReadOnlyRootFilesystem
out.RunAsUser = in.RunAsUser
out.RunAsGroup = in.RunAsGroup

// Can only be set to 'true' - unless the PodSpecSecurityContext feature flag is enabled
if in.RunAsNonRoot != nil && *in.RunAsNonRoot {
    out.RunAsNonRoot = in.RunAsNonRoot
}
if config.FromContextOrDefaults(ctx).Features.PodSpecSecurityContext != config.Disabled {
    // Allow setting to false
    out.RunAsNonRoot = in.RunAsNonRoot
}

// Disallowed
// This list is unnecessary, but added here for clarity
out.Privileged = nil
out.SELinuxOptions = nil
out.AllowPrivilegeEscalation = nil
out.ProcMount = nil
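For readers without the surrounding code handy, a self-contained sketch of the same masking logic; maskUserSecurityContext is a stand-in name, and the boolean parameter stands in for the config lookup on the request context:

package sketch

import corev1 "k8s.io/api/core/v1"

// maskUserSecurityContext copies only the allowed fields and leaves everything
// else (Privileged, SELinuxOptions, AllowPrivilegeEscalation, ProcMount) unset.
func maskUserSecurityContext(in *corev1.SecurityContext, podSpecSecurityContextEnabled bool) *corev1.SecurityContext {
    if in == nil {
        return nil
    }
    out := &corev1.SecurityContext{
        Capabilities:           in.Capabilities,
        ReadOnlyRootFilesystem: in.ReadOnlyRootFilesystem,
        RunAsUser:              in.RunAsUser,
        RunAsGroup:             in.RunAsGroup,
    }
    // RunAsNonRoot may only be set to true unless the PodSpecSecurityContext flag is enabled.
    if in.RunAsNonRoot != nil && *in.RunAsNonRoot {
        out.RunAsNonRoot = in.RunAsNonRoot
    }
    if podSpecSecurityContextEnabled {
        out.RunAsNonRoot = in.RunAsNonRoot
    }
    return out
}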

@dprotaso
Member Author

dprotaso commented Sep 15, 2021

I think I'm content to leave PodSecurityContext disabled by default and close out this issue after the above PR.

Ideally we'd have finer-grained control than Kubernetes currently offers: distinct security contexts between our containers (queue-proxy, etc.) and the user's containers and volumes.

@dprotaso
Member Author

Closing this out as the PR has merged.
