Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run vs Definition "race" and "source of truth": PipelineRun errors out after Pipeline changed #3916

Closed
vdemeester opened this issue May 6, 2021 · 2 comments · Fixed by #3941
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@vdemeester
Copy link
Member

Expected Behavior

Once a PipelineRun starts, it does continue to the end (if success) no matter what happens to the referenced Pipeline (a change, …).

Actual Behavior

If something happens to the Pipeline that is referred by the PipelineRun, the controller might fail the execution because incompatibilities, missing resource/parameter/… or even missing Pipeline (or Task).

Steps to Reproduce the Problem

  1. Create a PipelineRun referencing a Pipeline with several Task (that take enough time for you to mutate the Pipeline)
  2. Update the Pipeline with an additionnal param and an additonal Task that uses that param
  3. See it fail on that new Task because of missing param

An even easier case to reproduce is

  1. Create a PipelineRun referencing a Pipeline with several Task (that take enough time for you to mutate the Pipeline)
  2. Delete the Pipeline
  3. See the PipelineRun fail with CouldntGetPipeline error

I suspect the problem to be true even when using Tekton Bundle, if the tag changed between 2 reconcile, we may fetch the definition again (which is a performance problem, but this is kinda tracked in #3305.

Additional Info

  • Kubernetes version:

Any

  • Tekton Pipeline version:

Any

Possible solution

We are dereferencing the definitions in the status part of PipelineRun (and TaskRun). This could be used as the "source-of-truth" after the first reconciling "loop" and we wouldn't fetch the Pipeline definition anymore.

This means:

  • Making sure we do dereference definition from Tekton Bundles
  • Change the behavior of fetching Pipeline definition to see if it's available in the status. If not, fetch the Pipeline definition, if not, use the definition in the status as the source of truth.

/assign

@vdemeester vdemeester added the kind/bug Categorizes issue or PR as related to a bug. label May 6, 2021
@vdemeester vdemeester added this to the Pipelines v0.25 milestone May 6, 2021
@ghost
Copy link

ghost commented May 7, 2021

I wonder if this should be configuration setting? Some users seem to want less data to be stored in their PipelineRun statuses (e.g. #3140) while others would probably want the reproducibility / performance benefits from the spec being cached in this way.

It sounds like the logic described here would work for both cases: if the user configures tekton not to cache the specs in the run status then it would fall back to querying the cluster for the resource.

@ghost
Copy link

ghost commented May 18, 2021

Question from owners meeting just now: Should we be fetching / caching the Spec when a TaskRun or PipelineRun is created in pending state at the moment that the TR / PR is created? Right now (I think) the spec won't be fetched until we move out of the pending state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant