PipelineRun validation error for params without a type #4258

Closed
skaegi opened this issue Sep 24, 2021 · 3 comments · Fixed by #4269
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@skaegi
Contributor

skaegi commented Sep 24, 2021

After updating from 0.26.0 to 0.27.3 we're seeing the following error...

{
   "level":"error",
   "ts":"2021-09-17T04:42:05.404Z",
   "logger":"tekton-pipelines-controller",
   "caller":"pipelinerun/pipelinerun.go:242",
   "msg":"Reconcile error: invalid value: : finally.params[cluster].type, finally.params[repository].type, finally.params[revision].type, tasks.params[cluster].type, tasks.params[repository].type, tasks.params[revision].type",
   "commit":"e297768",
   "knative.dev/controller":"github.com.tektoncd.pipeline.pkg.reconciler.pipelinerun.Reconciler",
   "knative.dev/kind":"tekton.dev.PipelineRun",
   "knative.dev/traceid":"8020c9bf-b9cb-44b3-9c61-f5639ea7f1ea",
   "knative.dev/key":"pw-b78c8159-8a90-48c5-ad37-e8cb5f16f9f3/pipelinerun-b78c8159-8a90-48c5-ad37-e8cb5f16f9f3",
   "stacktrace":"github.com/tektoncd/pipeline/pkg/reconciler/pipelinerun.(*Reconciler).ReconcileKind\n\tgithub.com/tektoncd/pipeline/pkg/reconciler/pipelinerun/pipelinerun.go:242\ngithub.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1beta1/pipelinerun.(*reconcilerImpl).Reconcile\n\tgithub.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1beta1/pipelinerun/reconciler.go:246\ngithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller.(*Impl).processNextWorkItem\n\tgithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller/controller.go:540\ngithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller.(*Impl).RunContext.func3\n\tgithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller/controller.go:477"
}

* NICELY FORMATTED STACKTRACE *
github.com/tektoncd/pipeline/pkg/reconciler/pipelinerun.(*Reconciler).ReconcileKind
	github.com/tektoncd/pipeline/pkg/reconciler/pipelinerun/pipelinerun.go:242
github.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1beta1/pipelinerun.(*reconcilerImpl).Reconcile
	github.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1beta1/pipelinerun/reconciler.go:246
github.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller.(*Impl).processNextWorkItem
	github.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller/controller.go:540
github.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller.(*Impl).RunContext.func3
	github.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller/controller.go:477

The pipeline in question here declares a "cluster", "repository", and "revision" param but does not specify a "type" for these params as the default type has worked until now. We've found that if we alter the Pipeline definition to specify type: string for these params everything works correctly.
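
As a minimal sketch of the workaround described above, the Pipeline's param declarations would carry an explicit type. Only the three param names come from the report; the rest of the spec is omitted and assumed unchanged.

```yaml
# Fragment of a Pipeline spec (tasks and finally omitted, assumed unchanged).
spec:
  params:
    - name: cluster
      type: string   # set explicitly instead of relying on the defaulted value
    - name: repository
      type: string
    - name: revision
      type: string
```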

We're reporting this now in the hope that someone can think of a relevant difference between 0.26 and 0.27.x, while we try to create a good reproducer.
The problem occurred in a moderately complex pipeline. We tried a very simple hello-world pipeline that did not declare a type on a param, and it worked. That leads us to believe there may be a race condition between when the default type is set and when it is checked. So far we have only seen this on our OpenShift 4.7 (RHEL7) cluster and not on our vanilla Kubernetes cluster, but that may not ultimately be relevant.

skaegi added the kind/bug label Sep 24, 2021
@skaegi
Contributor Author

skaegi commented Sep 29, 2021

Here's the simplest example that fails for me...

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: deploy-with-kubectl
spec:
  params:
    - name: cluster
      description: the cluster
  steps:
    - name: deploy
      image: alpine
      env:
        - name: CLUSTER
          value: $(params.cluster)
      command: ["/bin/sh", "-c"]
      args:
        - set -e -o pipefail;
          echo "Deploying to $CLUSTER";
          sleep 1m;
---
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: deploy-pipeline
spec:
  params:
    - name: cluster
      description: the cluster to deploy to
  tasks:  
    - name: deploy-with-kubectl
      taskRef:
        name: deploy-with-kubectl
      params:
        - name: cluster
          value: $(params.cluster)
---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: pipelinerun-abc
spec:
  pipelineRef:
      name: deploy-pipeline
  params:
  - name: cluster
    value: abc123

What I've found is that in any version after 0.26.0, when I apply these manifests I get...

$ k get pr
NAME              SUCCEEDED   REASON                     STARTTIME   COMPLETIONTIME
pipelinerun-abc   False       PipelineValidationFailed   9s          9s

If I do a get -oyaml I see...

status:
  completionTime: "2021-09-29T02:33:49Z"
  conditions:
  - lastTransitionTime: "2021-09-29T02:33:49Z"
    message: 'Pipeline default/deploy-pipeline can''t be Run; it has an invalid spec:
      invalid value: : finally.params[cluster].type, tasks.params[cluster].type'
    reason: PipelineValidationFailed
    status: "False"
    type: Succeeded

I've now verified that I get the same result with 0.27.0, 0.27.2, 0.27.3, and 0.28.0.

I use different flavors of Kubernetes, all at version 1.20, and am only seeing this with OpenShift 4.7.x (currently 4.7.30, but I saw it with 4.7.21 too). I'm truly at a loss as to what changed between 0.26.0 and 0.27.0 that could cause this.

@afrittoli
Member

I can reproduce the issue.
From the webhook logs, this looks suspicious:

{"level":"error","ts":"2021-09-30T20:58:20.804Z","logger":"tekton-pipelines-webhook.DefaultingWebhook","caller":"controller/controller.go:579","msg":"Reconcile error","commit":"f8c2eea","duration":0.053153009,"error":"failed to update webhook: mutatingwebhookconfigurations.admissionregistration.k8s.io \"webhook.pipeline.tekton.dev\" is forbidden: cannot set an ownerRef on a resource you can't delete: , <nil>","stacktrace":"github.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller.(*Impl).handleErr\n\tgithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller/controller.go:579\ngithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller.(*Impl).processNextWorkItem\n\tgithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller/controller.go:556\ngithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller.(*Impl).RunContext.func3\n\tgithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller/controller.go:491"}

If I create a Task and then fetch it, no type is set on the params, so the defaulting webhook is not doing its work.
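
For comparison, a sketch of what fetching the `deploy-with-kubectl` Task from the reproducer would be expected to show when defaulting does run. This assumes the webhook fills in `string` for params declared without a type; other defaulted fields are left out.

```yaml
# Expected shape of spec.params on the stored Task after defaulting (sketch).
spec:
  params:
    - name: cluster
      description: the cluster
      type: string   # added by the defaulting webhook when the type is omitted
```

With the webhook failing as in the log above, the `type` line is absent from the stored object.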

@afrittoli
Member

With this change in:

/go/src/github.com/tektoncd/pipeline/config (main *)$ git diff
diff --git a/config/200-clusterrole.yaml b/config/200-clusterrole.yaml
index 420411b3a..1926be791 100644
--- a/config/200-clusterrole.yaml
+++ b/config/200-clusterrole.yaml
@@ -87,7 +87,7 @@ rules:
     resourceNames: ["webhook.pipeline.tekton.dev"]
     # When there are changes to the configs or secrets, knative updates the mutatingwebhook config
     # with the updated certificates or the refreshed set of rules.
-    verbs: ["get", "update"]
+    verbs: ["get", "update", "delete"]
   - apiGroups: ["admissionregistration.k8s.io"]
     resources: ["validatingwebhookconfigurations"]
     # validation.webhook.pipeline.tekton.dev performs schema validation when you, for example, create TaskRuns.

I now get a different error:

{"level":"error","ts":"2021-09-30T21:58:56.608Z","logger":"tekton-pipelines-webhook.DefaultingWebhook","caller":"controller/controller.go:579","msg":"Reconcile error","commit":"f8c2eea","duration":0.049220866,"error":"failed to update webhook: mutatingwebhookconfigurations.admissionregistration.k8s.io \"webhook.pipeline.tekton.dev\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>","stacktrace":"github.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller.(*Impl).handleErr\n\tgithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller/controller.go:579\ngithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller.(*Impl).processNextWorkItem\n\tgithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller/controller.go:556\ngithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller.(*Impl).RunContext.func3\n\tgithub.com/tektoncd/pipeline/vendor/knative.dev/pkg/controller/controller.go:491"}
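
The wording of this error appears to match Kubernetes' OwnerReferencesPermissionEnforcement admission check, which requires update permission on the `finalizers` subresource of the owner before `blockOwnerDeletion` can be set. A hedged sketch of an additional ClusterRole rule follows. It assumes the ownerReference points at the `tekton-pipelines` namespace, which the logs above do not confirm, and it is not taken from the actual fix.

```yaml
# Hypothetical extra rule for the webhook's ClusterRole (sketch only).
# Assumption: the owner referenced by the webhook configuration is the
# install namespace, here assumed to be "tekton-pipelines".
rules:
  - apiGroups: [""]
    resources: ["namespaces/finalizers"]
    resourceNames: ["tekton-pipelines"]
    verbs: ["update"]
```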
