design: improved service account token volumes
1 parent
1f79ecc
commit ed59dd4
contributors/design-proposals/auth/svcacct-token-volume-source.md
# Service Account Token Volumes

Authors:
@smarterclayton
@liggitt
@mikedanese

## Summary

Kubernetes is able to provide pods with unique identity tokens that can prove
the caller is a particular pod to a Kubernetes API server. These tokens are
injected into pods as secrets. This document proposes a new distribution
mechanism with support for [improved service account
tokens](https://github.com/kubernetes/community/pull/1460) and explores how to
migrate from the existing mechanism in a backwards-compatible way.

## Motivation

Many workloads running on Kubernetes need to prove to external parties who they
are in order to participate in a larger application environment. This identity
must be attested to by the orchestration system in a way that allows a third
party to trust that an arbitrary container on the cluster is who it says it is.
In addition, infrastructure running on top of Kubernetes needs a simple
mechanism to communicate with the Kubernetes APIs and to provide more complex
tooling. Finally, a significant set of security challenges is associated with
storing service account tokens as secrets in Kubernetes, and limiting the ways
in which malicious parties can get access to these tokens will reduce the risk
of platform compromise.

As a platform, Kubernetes should evolve to allow identity management systems to
provide more powerful workload identity without breaking existing use cases, and
provide a simple out-of-the-box workload identity that is sufficient to cover
the requirements of bootstrapping low-level infrastructure running on
Kubernetes. We expect other systems to cover the more advanced scenarios, and
see this effort as necessary glue to allow more powerful systems to succeed.

With this feature, we hope to provide a backwards-compatible replacement for
service account tokens that strengthens the security and improves the
scalability of the platform.

## Proposal

Kubernetes should:

1. Allow the automatic service account token injection to be disabled via
   configuration.
1. Allow automatic service account token creation to occur only when a user
   explicitly requests it.
1. Implement a ServiceAccountToken volume projection that maintains a
   service account token requested by the node from the TokenRequest API.

### Token Volume Projection

A new volume projection will be implemented with an API that closely matches the
TokenRequest API.

```go
type ProjectedVolumeSource struct {
	Sources     []VolumeProjection
	DefaultMode *int32
}

type VolumeProjection struct {
	Secret              *SecretProjection
	DownwardAPI         *DownwardAPIProjection
	ConfigMap           *ConfigMapProjection
	ServiceAccountToken *ServiceAccountTokenProjection
}

type ServiceAccountTokenProjection struct {
	// Audiences are the intended audiences of the token.
	Audiences []string
	// ExpirationSeconds is the requested duration of validity of the service
	// account token.
	ExpirationSeconds *int64
	// Path is the relative path of the file to project the token into.
	Path string
}
```

A volume plugin implemented in the kubelet will project a service account token
sourced from the TokenRequest API into volumes created from
ProjectedVolumeSources. As the token approaches expiration, the kubelet volume
plugin will proactively rotate the service account token.

To replace the current service account token secrets, we also need to inject the
cluster's CA certificate bundle. Initially we will deploy the data in a
per-namespace configmap and reference it using a ConfigMapProjection.

This fixes one scalability issue with the current service account token
deployment model, where secret GETs are a large portion of overall apiserver
traffic. However, it doesn't solve the problem of per-namespace replicated CA
data, which needs to be stored in etcd.

To fix that issue we can extend ConfigMapProjection to support cross-namespace
configmap references. Initially, we can hardcode a rule in the NodeAuthorizer
that allows nodes and pods to reference a specific configmap in the kube-public
namespace that contains the root ca.crt data, with the goal of improving the
permission model to support cross-namespace configmap references generally. See
[#4957](https://github.com/kubernetes/kubernetes/issues/4957).
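
A hardcoded rule of this kind might look like the sketch below. The `request`
type and the `allowsCABundleRead` function are hypothetical and are not the
real NodeAuthorizer interface; the configmap name `kube-cabundle` is an
assumption borrowed from the example projection in this proposal.

```go
package main

import "fmt"

// request is a hypothetical, simplified stand-in for the attributes an
// authorizer inspects. It is not the real Kubernetes authorizer interface.
type request struct {
	Verb      string
	Resource  string
	Namespace string
	Name      string
}

// Assumed location of the shared CA bundle.
const (
	caBundleNamespace = "kube-public"
	caBundleName      = "kube-cabundle"
)

// allowsCABundleRead reports whether the request is a read of the single
// well-known CA bundle configmap. Any other request is left to the
// authorizer's normal rules.
func allowsCABundleRead(r request) bool {
	return r.Verb == "get" &&
		r.Resource == "configmaps" &&
		r.Namespace == caBundleNamespace &&
		r.Name == caBundleName
}

func main() {
	fmt.Println(allowsCABundleRead(request{"get", "configmaps", "kube-public", "kube-cabundle"}))
	fmt.Println(allowsCABundleRead(request{"get", "configmaps", "kube-public", "other"}))
}
```

Matching on one exact namespace and name keeps the exception narrow until a
general cross-namespace permission model exists.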

A projected volume source that is equivalent to the current service account
secret:

```yaml
sources:
- serviceAccountToken:
    expirationSeconds: 3153600000 # 100 years
    path: token
- configMap:
    name: kube-cabundle
    namespace: kube-public
    items:
    - key: ca.crt
      path: ca.crt
```

To phase out service account token secrets:

* The ServiceAccount admission controller will migrate to ProjectedVolumeSources
  instead of SecretVolumeSources.
* The TokenController will be deprecated and turned down.

This process will likely take about a year of releases to complete.

### Risks and Mitigations

Reducing the scope of service account tokens by not creating them automatically
on service account creation is technically an API break. This would have to be
opt-in but, like RBAC, it is about reducing the scope of vulnerability. Many
people may opt not to disable automatic creation. For those who do disable it,
we can preserve the existing behavior of being able to create a secret of type
service-account-token (with the annotation that links it to the service
account) and have the controller auto-populate it.

FlexVolume and CSI are the only ways to deliver custom content to nodes today
without persisting it in the API. In a virtual kubelet environment, these
mechanisms may not work the same as on regular kubelets, so third-party identity
integrators may not be able to deliver their custom content. A container volume
or init container might be a sufficient workaround.

### Alternatives

1. Instead of implementing a service account token volume projection, we could
   implement all injection as a flex volume or CSI plugin.
   1. Both flex volume and CSI are alpha and are unlikely to graduate soon.
   1. Virtual kubelets (like Fargate or ACS) may not be able to run flex
      volumes.
   1. Service account tokens are a fundamental part of our API.
1. Remove service accounts and service account tokens completely from core, use
   an alternate mechanism that sits outside the platform.
   1. Other core features need service account integration, leading to all
      users needing to install this extension.
   1. Complicates installation for the majority of users.