
Delete old resources during upgrade when a component no longer includes them #374

Closed
ghost opened this issue Aug 4, 2021 · 5 comments
Labels: kind/feature (Categorizes issue or PR as related to a new feature.)

@ghost commented Aug 4, 2021

Feature request

When a component removes a resource from its release build, it would be great to have that resource removed during the operator's upgrade cycle.

Use case

In Tekton Pipelines 0.19 we introduced a PodDisruptionBudget object to try and support "high availability" out of the box. Later, in 0.22, we removed the PDB because it became apparent that it could block node draining in a user's cluster. Pipeline's release.yaml changed from one release to the next - first including the PDB object and then excluding it.

If a user had installed 0.19 then later upgraded to 0.22 the PDB would stick around in the tekton-pipelines namespace. This is because running kubectl apply -f release.yaml doesn't perform any kind of pruning. There are two options that I'm aware of atm to work around this:

  1. There's an alpha flag, --prune, which might support this? I can't tell precisely but it sounds like what I'm describing here.
  2. There's an approach where the user first runs kubectl delete -f <previous release.yaml> and then runs kubectl apply -f <new release.yaml> (see the sketch below). I guess this results in a short window of downtime while the resources turn down and then come back up, but it certainly seems like it'll be precise when cleaning up.
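For concreteness, here's a minimal sketch of both options, assuming the standard published release manifests and assuming the resources carry the app.kubernetes.io/part-of=tekton-pipelines label that --prune would need as its selector (both the label and the exact URLs are assumptions worth verifying):

```sh
# Option 1: the alpha --prune flag. Objects that match the selector but are
# absent from the applied manifest get deleted. The label is an assumption --
# it must actually be present on the release's resources for pruning to work.
kubectl apply --prune -l app.kubernetes.io/part-of=tekton-pipelines \
  -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.22.0/release.yaml

# Option 2: delete the old release first, then apply the new one. Precise
# cleanup, at the cost of a brief window of downtime between the two steps.
kubectl delete -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.19.0/release.yaml
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.22.0/release.yaml
```

Note that option 2 would also delete the CRDs shipped in release.yaml, and with them any user-created Pipeline/Task instances — the caveat raised in the next comment.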

I'm not sure what the best approach for the Operator to take is - maybe there's a way to diff the new version with the resources in the targetNamespace or something like that?

@ghost added the kind/feature label on Aug 4, 2021
@nikhil-thomas (Member) commented

This is definitely something we need as part of the "seamless upgrades" story.

  1. There's an approach where the user first runs kubectl delete -f <previous release.yaml> and then runs kubectl apply -f <new release.yaml>. I guess this results in a short window of downtime while the resources turn down and then come back up, but it certainly seems like it'll be precise when cleaning up.

This is definitely a feasible approach. However, we will have to make sure that we don't delete CRDs, as we don't want all Pipeline and Triggers CRD instances to be lost. I assume it will be totally safe to remove all the namespaced resources in tekton-pipelines and then recreate them. We could even consider setting up a new namespace with the new version of a component and then deleting the previous namespace, like a node replacement in a Kubernetes upgrade (here, my imagination did go a little wild :D ).

Cleaning up cluster-scoped resources could be handled separately. One approach could be: instead of using the TektonConfig instance as the ownerReference of all the resources created/managed by the operator, we could create a ConfigMap with some data pertaining to an update (version, date, etc.) and use that as the owner reference for all the resources created by the operator. When a new upgrade happens, we create a new ConfigMap and update the ownerReference in all the resources that are still relevant. Then we can delete the old "owner" ConfigMap and all the obsolete resources should be garbage-collected 😇 . (TektonConfig could be the ownerReference for the "owner" ConfigMap.)
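A rough sketch of that idea, with every name, UID, and data key invented for illustration. One wrinkle: Kubernetes garbage collection only lets cluster-scoped dependents reference cluster-scoped owners, so a namespaced ConfigMap could not directly own cluster-scoped resources — the "owner" object would itself need to be cluster-scoped for that part:

```yaml
# Hypothetical "owner" ConfigMap recording the current install.
apiVersion: v1
kind: ConfigMap
metadata:
  name: tekton-owner-v0-22-0
  namespace: tekton-pipelines
data:
  version: "0.22.0"
  installedOn: "2021-08-04"
---
# A namespaced resource managed by the operator, owned by the ConfigMap above.
# On upgrade, the operator creates a new owner ConfigMap, re-points the
# ownerReferences of resources that are still relevant, and deletes the old
# ConfigMap; anything still pointing at it is garbage-collected.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tekton-pipelines-controller
  namespace: tekton-pipelines
  ownerReferences:
    - apiVersion: v1
      kind: ConfigMap
      name: tekton-owner-v0-22-0
      uid: 11111111-2222-3333-4444-555555555555  # must match the ConfigMap's real UID
```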

@nikhil-thomas (Member) commented

cc @sm43

@nikhil-thomas (Member) commented

The following PRs have fixed this:

In short, a new CRD, TektonInstallerSet, is introduced. It decouples the ownerReference of all created resources from top-level CRDs like TektonPipeline and sets the owner references to instances of TektonInstallerSet. For each upgrade, a new TektonInstallerSet is created and the old one is deleted.

In effect, all resources except CRDs (Pipeline, Task, Trigger, ...) will be deleted along with the old TektonInstallerSet instance and recreated from the upgraded manifest by creating a new TektonInstallerSet instance. This ensures that obsolete resources are removed automatically with each upgrade, as the TektonInstallerSet instance owning them gets deleted in the upgrade.
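For illustration, a cluster-scoped resource materialized from the component manifest would end up carrying an ownerReference along these lines (the InstallerSet name and UID are invented; real InstallerSets are generated by the operator):

```yaml
# Owned by the TektonInstallerSet created for this upgrade rather than by
# TektonPipeline directly. Deleting the old InstallerSet garbage-collects
# every resource it owns; a fresh one is created from the upgraded manifest.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tekton-pipelines-controller-cluster-access
  ownerReferences:
    - apiVersion: operator.tekton.dev/v1alpha1
      kind: TektonInstallerSet
      name: pipeline-hx7vn  # generated name, example only
      uid: 99999999-8888-7777-6666-555555555555
rules: []  # actual rules elided
```

Because TektonInstallerSet is itself cluster-scoped, it can validly own both namespaced and cluster-scoped resources.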

Note: This idea is similar to the one mentioned in the comment above. The TektonInstallerSet CRD is more like a value-added "cluster-scoped ConfigMap" with its own controller.

@nikhil-thomas (Member) commented

/close

@tekton-robot (Contributor) commented

@nikhil-thomas: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
