add kep for better pruning in apply #810
---
title: Kubectl Apply Prune Enhancement
authors:
- "@Liujingfang1"
owning-sig: sig-cli
participating-sigs:
- sig-cli
- sig-apps
- sig-apimachinery
reviewers:
- "@pwittrock"
- "@seans3"
- "@soltysh"
approvers:
- "@pwittrock"
- "@seans3"
- "@soltysh"
editor:
- "Liujingfang1"
- "crimsonfaith91"
creation-date: 2019-02-04
last-updated: 2019-02-14
status: provisional
see-also:
- n/a
replaces:
- n/a
superseded-by:
- n/a
---

# Kubectl Apply Prune Enhancement

## Table of Contents

* [Table of Contents](#table-of-contents)
* [Summary](#summary)
* [Motivation](#motivation)
  * [Goals](#goals)
  * [Non-Goals](#non-goals)
* [Proposal](#proposal)
  * [Challenges](#challenges)
  * [Limitation of Alpha Prune](#limitation-of-alpha-prune)
  * [A Suggested Solution](#a-suggested-solution)
    * [Apply Config Object](#apply-config-object)
    * [Child Objects](#child-objects)
    * [How It Works](#how-it-works)
  * [Risks and Mitigations](#risks-and-mitigations)
* [Graduation Criteria](#graduation-criteria)
* [Implementation History](#implementation-history)
* [Drawbacks](#drawbacks)
* [Alternatives](#alternatives)

[Tools for generating]: https://github.com/ekalinin/github-markdown-toc

Link to tracking issue: [kubernetes/kubectl#572](https://github.com/kubernetes/kubectl/issues/572)

## Summary

`kubectl apply --prune` has been in Alpha for more than a year. This design proposes a better approach to handling `--prune` and moves the feature to a second round of Alpha. This feature targets the 1.15 release.

## Motivation

`kubectl apply` operates on an aggregation of Kubernetes objects. While `kubectl apply` is widely used in different workflows, users expect it to sync the objects in the cluster with the configuration saved in files, directories, or repos. That is, when `kubectl apply` is run on those configurations, the corresponding objects are expected to be correctly created, updated, and pruned.

Here is an example workflow that demonstrates what syncing objects means. In the directory `dir`, there are files declaring the Kubernetes objects `foo` and `bar`. The first time the directory is applied to the cluster with
```
kubectl apply -f dir
```
the objects `foo` and `bar` are created in the cluster. Later, users may update those resource files; correspondingly, the declared objects may be updated or removed, and new objects may be added. Assume the objects declared in the same directory `dir` become `foo` and `baz`, with `bar` deleted. After this modification, when
```
kubectl apply --prune -f dir
```
is run, the object `foo` in the cluster is expected to be updated to match the state defined in the resource file, the object `bar` in the cluster is expected to be removed, and `baz` is expected to be created.
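
For concreteness, the directory contents before and after the edit might look like the following sketch (file names are hypothetical; any layout kubectl can read behaves the same way):
```
# Before the edit:              # After the edit:
dir/                            dir/
├── foo.yaml                    ├── foo.yaml   # updated
└── bar.yaml                    └── baz.yaml   # new; bar.yaml was deleted
```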

Currently `kubectl apply --prune` is still in Alpha. It is based on label matching, so users need to be very careful when using this command to avoid deleting objects unintentionally. There is a list of [issues](https://github.com/kubernetes/kubectl/issues/572) observed by users. The most noticeable issue happens when changing directories or when directories share the same labels: if two directories share the same label, `kubectl apply --prune` from one directory will delete objects from the other directory.

In this design, we give an overview of the pruning challenges and propose a better pruning solution. This new approach can work with, and be utilized by, generic configuration management tools.

### Goals

Propose a solution for better pruning in `kubectl apply` so that
- it finds the correct set of objects to delete
- it avoids mix-ups when changing directories
- it avoids selecting matching objects across all resources and namespaces

### Non-Goals

- Making `--prune` on by default
- Improvements to garbage collection

## Proposal

### Challenges

The challenges of pruning come mainly from two aspects:
- Finding the correct set of objects to delete. When a directory is applied after some updates, kubectl has lost track of the objects that were previously applied from the same directory.
- Deleting objects at the proper time. Even when the set of objects to be deleted is known, deciding when to delete them is another challenge. Those objects could still be needed while other objects are rolling out. One example is a ConfigMap referenced by a Deployment: during a rolling update, new pods refer to the new ConfigMap while old pods are still running and referring to the old one. The old ConfigMap can't be deleted until the rollout finishes successfully.

In this design, we focus on a solution for the first challenge. For the second challenge, we can blacklist certain types, such as ConfigMaps and Secrets, from deletion. In general, deleting objects at the proper time is better addressed by garbage collection.

> **Reviewer:** I hadn't thought of the second challenge before. This doesn't seem like a very principled way to handle it, and I think it only solves half of the problem. I think you also need to create transient versions of such objects. This would be a large change to the current apply semantic: apply always modifies things in place. The workflow where we make immutable versions of e.g. ConfigMaps and never modify them is a different workflow that hasn't been built out yet. I think that would be interesting, but it goes far beyond just not deleting things sometimes. So, I think for the first draft, you should just call the second item a non-goal and document it.
>
> **Author:** The second item is not a goal in the KEP. I'll document it.

### Limitation of Alpha Prune

The Alpha prune works by label selector:
```
# Apply the configuration in manifest.yaml that matches label app=nginx and
# delete all the other resources that are not in the file and match label app=nginx.
kubectl apply --prune -f manifest.yaml -l app=nginx
```
Internally, `kubectl apply` uses the following logic to perform pruning:
- For one namespace, it queries all types of resources except [some hard-coded ones](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/cmd/apply/apply.go#L632) to find objects that match the label. The number of queries within one namespace is about the number of API resources installed in the cluster.
- Among the matched objects, it skips the ones without the [last-applied-configuration annotation](https://kubernetes.io/docs/concepts/overview/object-management-kubectl/declarative-config/#how-apply-calculates-differences-and-merges-changes).
- Among the remaining objects, it deletes any object that is not present in the new apply set.
- It repeats this for all visited namespaces. Overall, the number of queries is proportional to the number of API resources installed in the cluster.

> **Reviewer:** However, a filter operation is O(N) in the size of the collection. So, this is linear in the number of objects in the cluster, which is not good.
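
As a rough sketch of this query pattern (the resource list shown is illustrative, not the actual implementation or the full set of installed types):
```
# For each visited namespace, alpha prune queries every installed API resource
# type (minus a hard-coded skip list) for label matches; a few types shown here:
for resource in deployments statefulsets services ingresses; do
  kubectl get "$resource" -n my-namespace -l app=nginx -o name
done
```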

The limitations come mainly from three aspects:
- Objects might be deleted unintentionally when different directories use the same labels.
- Some types are hard-coded to avoid being pruned.
- The number of queries is neither constant nor linear in the number of resources being applied/pruned.

### A Suggested Solution

#### Apply Config Object

To solve the first challenge, we propose the concept of an apply-config object. An apply-config object is an object that contains the GKN information (group, kind, and name, plus namespace for namespaced objects) for all the objects that are to be applied. The apply-config object is not generated by kubectl while apply is running. Instead, it should be generated by other tools such as kustomize or Helm. When generating resources to be consumed by `kubectl apply`, those tools are responsible for generating the associated apply-config object in order to utilize better pruning.

> **Reviewer:** GKN?
>
> **Author:** By GKN, I mean group, kind, name (and namespace for namespaced objects). Will update the doc to make it clear.

An object is identified as an apply-config object when it has the special label `kubectl.kubernetes.io/apply-cfg: "true"`. The GKN information for all other objects is stored in annotations. Here is an apply-config object example.

> **Reviewer:** Why not make a CRD for this?
>
> **Author:** We don't choose a CRD for the following reasons:
>
> **Reviewer:** @Liujingfang1 Why does this need to be a hard-coded solution in kubectl? Is there supporting details, issues, or experience reports on this? I ask to understand the underlying user need.

```
apiVersion: v1
kind: ConfigMap
metadata:
  name: name-foo
  labels:
    kubectl.kubernetes.io/apply-cfg: "true"
  annotations:
    kubectl.kubernetes.io/last-applied-objects: |
      namespace1:group1:kind1:somename
      namespace1:group2:kind2:secondname
      namespace3:group3:kind3:thirdname
```

> **Reviewer:** I definitely wouldn't recommend using annotations for this data. Make a CRD with a proper schema; users can create one explicitly or (better) kubectl apply can construct it.

Note that the version is not necessary to find the object: the version of the resource can be selected by the client, and available versions can be discovered dynamically.
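
As an illustration of that dynamic discovery, a client can query the discovery API; from the command line this looks roughly like:
```
kubectl api-versions                    # lists all group/version pairs served by the cluster
kubectl api-resources --api-group=apps  # lists the kinds available in the "apps" group
```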

The apply-config object is not limited to the ConfigMap type. Any type of object can be an apply-config object as long as it has the `apply-cfg` label. That means a custom resource object, such as an [Application](https://github.com/kubernetes-sigs/application) custom resource object, can also be used as an apply-config object when needed.

> **Reviewer:** If you want this property:

The apply-config object could have an optional hash label, `kubectl.kubernetes.io/apply-hash`, computed from the content of the annotation `kubectl.kubernetes.io/last-applied-objects`. This hash label is used for validation in kubectl. More details are in [How It Works](#how-it-works).

> **Reviewer:** This annotation is going away. If you say it "could" have something, it's useful to say why it might be used (validation of what?). Or just defer discussion completely to the section mentioned.
>
> **Author:** Will move it to later parts.
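
The KEP does not pin down the hash algorithm; as an illustrative sketch only, a generating tool could derive the label value like this (sha256 assumed here, truncated to fit the 63-character label-value limit):
```
# Hash the annotation content and truncate it to a valid label value:
objects='namespace1:group1:kind1:somename
namespace1:group2:kind2:secondname'
hash=$(printf '%s' "$objects" | sha256sum | cut -d' ' -f1 | cut -c1-63)
echo "kubectl.kubernetes.io/apply-hash: \"$hash\""
```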

#### Child Objects

The objects passed to apply will have an annotation pointing back to the apply-config object for the same apply:
```
annotations:
  kubectl.kubernetes.io/apply-owner: uuid_of_the_apply_cfg_object
```
The presence of this annotation confirms that a live object is safe to delete when it is about to be pruned. A child object may have been updated or touched by other users or by another apply owner, in which case the annotation will be missing or different; in that scenario, the child object is not deleted.

> **Reviewer:** If you always make this a separate object, you could use the regular GC relationship to represent this.
>
> **Author:** You mean the owner-reference. Yes, we can add it to the owner-reference list instead of using a
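
Putting the pieces together, a child object might look like the following sketch (the name, namespace, and uid are hypothetical):
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: somename
  namespace: namespace1
  annotations:
    # uid of the live apply-config object, set at apply time
    kubectl.kubernetes.io/apply-owner: 6bca1224-30a6-11e9-ab14-d663bd873d93
```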

#### How It Works

Since the apply-config object is not generated by `kubectl`, other tools or users need to generate an apply-config object configuration before piping the resources into `kubectl apply`. For example, in the following workflows, `kustomize` or `helm` needs to generate the apply-config object. Moreover, every time those tools run, they should generate the apply-config object with the same name.

> **Reviewer:** I think you should consider actually generating this object.
>
> **Author:** We considered generating this object in

```
kustomize build . | kubectl apply -f -
# or
helm template mychart | kubectl apply -f -
```

When `kubectl apply` is run, it looks for an apply-config object by the label `kubectl.kubernetes.io/apply-cfg: "true"`. If no apply-config object is found, `kubectl` emits a warning saying so. When an apply-config object is found, the pruning action is triggered. If the label `kubectl.kubernetes.io/apply-hash` is present on the apply-config object, `kubectl` validates this hash against the content stored in the annotation; this validation prevents unintentional modification of the apply-config object's annotations by users. Once the validation passes, the next step is to query the live apply-config object from the cluster: assuming the apply-config object has kind `K` and name `N`, `kubectl` looks for a live object in the cluster with the same kind and name.

> **Reviewer:** I don't understand what the hash is of, what it is compared to, and what failure this protects against.
>
> **Author:** Basically, it is used to make sure the list in the apply-cfg object matches the objects to be applied. Will revise the explanation here to make it clear.
- If a live apply-config object is not found, kubectl will create the apply-config object and the corresponding set of objects.
- If a live apply-config object is found, the set of live objects is obtained by reading the annotation field of the live apply-config object. Kubectl diffs the live set against the new set of objects and takes the following steps:
  - set the apply-config object annotation to the union of the two sets and update the apply-config object
  - create new objects
  - prune old objects
  - remove the pruned objects from the apply-config object's annotation field
  - validate the `apply-hash` in the apply-config object as a signal that pruning has successfully finished
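
To make the diff-and-prune sequence concrete, here is an illustrative walk-through with hypothetical names:
```
# Live set (from the live apply-config annotation):   New set (from this apply):
#   ns1:apps:Deployment:foo                             ns1:apps:Deployment:foo
#   ns1::Service:bar                                    ns1::Service:baz
#
# 1. annotation := union of both sets  -> foo, bar, baz
# 2. create new objects                -> baz
# 3. prune (live minus new)            -> delete bar
# 4. annotation := final set           -> foo, baz
```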

pros:
- The apply-config object can be any type of object.
- There is no need to select matching objects by labels, which is more efficient.
- The number of queries is linear in the number of resources being applied/pruned.

cons:
- Users need to be careful when modifying resource files so that the label `kubectl.kubernetes.io/apply-cfg: "true"` is not removed unintentionally.
- Users/tools need to maintain the correct list of objects in annotations.

#### Risks and Mitigations

Deciding the correct set of objects to change is hard. Alpha `--prune` tried to solve that problem inside kubectl itself. In this new approach, kubectl does not decide that set; instead, it delegates the job to users or tools. Those who need to use `--prune` are responsible for creating an apply-config object as a record of the correct set of objects. Kubectl consumes this apply-config object and triggers the pruning action.

> **Reviewer:** In my view, it is actually kubectl's job to track the set, and it's not a good user experience for them to have to list their objects twice. The problem is maintaining identity between the prior and current invocations. I think the idea of storing a list of which objects were created is part of the right answer. Perhaps instead of making the user write out a list, you can make the user include a placeholder object (and offer to create it)? Really the only thing you need the user to provide is a name which will stay stable between invocations. On every apply, you do something like:
>
> **Author:** So we agree on the idea of storing a list somewhere. Yes, we need a name staying stable. In addition to the name, we also need to know which types users don't want to prune, such as ConfigMap/Secret. Those types have to be listed out in the object storing the apply list. So the minimal information provided by users is an object name and a list of types. If users or tools can provide a list of types that should be excluded from pruning, providing a full list of objects (excluding types such as ConfigMap/Secret) doesn't cause much more work.

This approach requires users/tools to do more work than just specifying a label, which might discourage the use of `--prune`. On the other hand, existing tools such as Kustomize and Helm can utilize this approach by generating an apply-config object when expanding or generating the resources.

## Graduation Criteria

`kubectl apply` with pruning can be moved from Alpha to Beta when

- there is no severe issue with the new approach
- pruning is widely used in CI workflows with Kubernetes

## Implementation History

The major implementation includes the following steps:

- add functions to identify and validate the apply-config object
- add a different code path to handle the case when an apply-config object is present
- add unit tests
- add e2e tests
- update the `kubectl apply` documentation and examples

As a convenient way for users to view the intended effect of an apply, `--dry-run` will show categorized output:
```
$ kubectl apply -f <dir> --dry-run
Resources to add
  pod1...
  pod2...
Resources modified
  pod3...
  pod4...
Resources unmodified
  pod5...
  pod6...
Resources to delete
  pod7...
```

## Drawbacks

- This approach is not backward compatible with Alpha `--prune`.
- The apply-config object can't be auto-pruned; it needs manual cleanup.

## Alternatives

### Using a unique identifier as label

Instead of listing all the child objects in the apply-config object, a label could be set in the apply-config object:
```
apiVersion: v1
kind: ConfigMap
metadata:
  name: predictable-name-foo
  labels:
    kubectl.kubernetes.io/apply-cfg: "true"
    kubectl.kubernetes.io/apply-id: some_uuid
```

All child objects have this label. When `kubectl apply --prune` is run, the live set of child objects is found by label selection and annotation matching:
```
labels:
  kubectl.kubernetes.io/apply-id: some_uuid
annotations:
  kubectl.kubernetes.io/apply-owner: uuid_of_the_apply_config_object
```
Any object in the live set that is not in the new apply set will be deleted.
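
The label-based lookup would behave much like alpha prune's queries; a sketch follows (the type list is illustrative, and as noted in the cons below, the lookup must span all resource types and namespaces):
```
# Find live child objects by label, across namespaces and types:
kubectl get deployments,services --all-namespaces \
  -l kubectl.kubernetes.io/apply-id=some_uuid -o name
```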

The label set in the apply-config object must not collide with resources outside of the aggregation identified by the apply-config object. This requires users or tools to create a unique identifier.
- pros:
  - This extends the idea of Alpha `--prune` with an apply-config object; existing `--prune` users can easily adopt the new approach.
  - It could be used by other kubectl functionality, such as checking whether the applied objects are settled.
  - It can work for the Application CRD or other CRDs with an apply-config object concept.
  - There is no need to maintain a list of child objects and store it in the apply-config object.
- cons:
  - It needs a unique label that doesn't collide with other resources; another tool or mechanism may be needed to generate a unique label.
  - Selecting matching objects across namespaces and types is not efficient.
  - The unique identifier may not be human-readable.

### Handling prune on the server side

Currently, some logic in `kubectl apply` has started to move to the server side. `kubectl apply` performs a 3-way merge to resolve state differences before sending the state to the API server. With the alpha release of [server-side `apply`](https://github.com/kubernetes/enhancements/issues/555), this 3-way merge is going to be replaced. Similarly, the pruning logic in `kubectl apply` could also be moved to the server side.

> **Reviewer:** Cross-object apply was explicitly out of scope. The apiserver doesn't currently do any cross-object operations, so I don't actually see how this can be done (without some radical apiserver scope creep). Additionally, the hard problem is tracking which apply invocation is the successor to which other apply invocation; moving the feature to the control plane doesn't address that problem at all, you still need some solution.
>
> **Author:** Another alternative is to still host the pruning logic on the server side, but outside of

- pros:
  - All other CLIs, such as the Helm client, get the benefits of the `prune` feature on the server side.
  - The server side can assign owner references, which helps with cleanup.
  - With the introduction of server-side apply for single objects, it makes sense to host the logic of applying a batch of objects in the API server as well. The server side would then have a complete story for applying objects.
- cons:
  - There is currently no capability to work on a batch of objects; an API to represent an aggregation of objects would be needed.
  - It has a longer development cycle than pure client-side pruning, due to the necessity of API design and approval, subsequent controller design, implementation, and maintenance, and a possibly convoluted RBAC setup. This may take up to 2 years. By contrast, client-side pruning is achievable within one release cycle (the proposed solution targets 1.15).
  - It will increase the load on the API server.

> **Reviewer:** Looking at the controversy around this functionality and the fact that this is still an alpha feature, I'd strongly recommend we do not make this on by default. The biggest issue at the moment is that we're transferring the responsibility for creating that config file to a user (or tool), yet based on its presence we want to always prune.