SIGSEGV in Reconciler #1075

Closed
abergmeier opened this issue Jul 15, 2019 · 14 comments · Fixed by #1085
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.

Comments

@abergmeier

Expected Behavior

Should run fine

Actual Behavior

Leads to a segmentation violation.

Steps to Reproduce the Problem

  1. Created TaskRun on OpenShift 3.11
apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: git-mirror-run
spec:
  serviceAccount: git-mirror
  taskRef:
    name: git-mirror-task
  inputs:
    params:
      - name: srcRepo
        value: ssh://[email protected]/foo/bar.git
      - name: destRepo
        value: [email protected]:foo/bar_common.git

Additional Info

Logs print:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x134621b]

goroutine 204 [running]:
github.com/tektoncd/pipeline/pkg/reconciler/v1alpha1/taskrun.(*Reconciler).checkTimeout(0xc0004956c0, 0xc00037a280, 0xc000940c30, 0xc0009012f8, 0xc00090f820, 0x0, 0x69e489)
	/workspace/go/src/github.com/tektoncd/pipeline/pkg/reconciler/v1alpha1/taskrun/taskrun.go:536 +0x9b
github.com/tektoncd/pipeline/pkg/reconciler/v1alpha1/taskrun.(*Reconciler).reconcile(0xc0004956c0, 0x19aa620, 0xc0007f1d10, 0xc00037a280, 0xed4be7aee, 0x27a75a0)
	/workspace/go/src/github.com/tektoncd/pipeline/pkg/reconciler/v1alpha1/taskrun/taskrun.go:264 +0x3fd
github.com/tektoncd/pipeline/pkg/reconciler/v1alpha1/taskrun.(*Reconciler).Reconcile(0xc0004956c0, 0x19aa620, 0xc0007f1d10, 0xc000bdc780, 0x1f, 0xc00097e8a8, 0x19aa620)
	/workspace/go/src/github.com/tektoncd/pipeline/pkg/reconciler/v1alpha1/taskrun/taskrun.go:171 +0x58b
github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller.(*Impl).processNextWorkItem(0xc00059d200, 0x175be00)
	/workspace/go/src/github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller/controller.go:330 +0x514
github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller.(*Impl).Run.func1(0xc00089c7f4, 0xc00059d200)
	/workspace/go/src/github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller/controller.go:282 +0x51
created by github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller.(*Impl).Run
	/workspace/go/src/github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller/controller.go:280 +0x1a5
@vdemeester
Member

@abergmeier which version of Tekton are you running?

@abergmeier
Author

abergmeier commented Jul 15, 2019

Tried both 0.5.0 and 0.5.1.
So the problem should be in tr.Spec.Timeout.Duration, right? I neither use taskSpec nor specify a Timeout anywhere.

@vdemeester
Member

Interesting… 🤔 Was that TaskRun created before 0.5.x or after?

Somehow you are running into a case that shouldn't happen, as Timeout.Duration should never be nil at that point. The only cases where it can happen are:

  • The TaskRun is older, so the field is not present … I feel we will need to fix that (and backport it 😅)
  • The webhook didn't do its job (which I really doubt)

@abergmeier
Author

abergmeier commented Jul 15, 2019

TaskRun was created with 0.5.0.

The webhook didn't do its job (which I really doubt)

Care to elaborate how a webhook should help here?

@vdemeester
Member

@abergmeier the webhook sets default values, including the configured default timeout for all TaskRuns. So that Timeout field should never be nil post 0.5.0.

Did you have pre-existing TaskRuns before deploying 0.5.0? (i.e. did you update the controller with already existing TaskRuns)

@abergmeier
Author

@abergmeier the webhook sets default values, including the configured default timeout for all TaskRuns. So that Timeout field should never be nil post 0.5.0.

Without knowing the technical details - this sounds brittle to me.

Did you have pre-existing TaskRuns before deploying 0.5.0? (i.e. did you update the controller with already existing TaskRuns)

Not at all - this is a fresh cluster :)

@vdemeester
Member

/kind bug
/assign

@tekton-robot tekton-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jul 15, 2019
@vdemeester vdemeester added this to the Pipelines 0.5 🐱 milestone Jul 15, 2019
@carltonmason

FYI, I ran into the exact same issue in the same place an hour ago with a TaskRun when following https://github.com/tektoncd/pipeline/blob/master/docs/tutorial.md.

@vdemeester vdemeester added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Jul 16, 2019
@vdemeester
Member

@carltonmason thanks 👼 which version of Kubernetes (and/or OpenShift) are you using?

@carltonmason @abergmeier can you show what kubectl get mutatingwebhookconfiguration prints?

@abergmeier
Author

can you show what kubectl get mutatingwebhookconfiguration prints?

$ oc get mutatingwebhookconfiguration
NAME                 CREATED AT
webhook.tekton.dev   2019-07-12T12:50:53Z

@vdemeester
Member

@abergmeier @carltonmason are you using Task or ClusterTask for those ?

@abergmeier
Author

are you using Task or ClusterTask for those ?

Task

@carltonmason

I am using the Tekton 0.5.1 release on a Mac, and I am using a Task.

ckmason@carltons-mbp:tekton-learn$ minishift version
minishift v1.33.0+ba29431
ckmason@carltons-mbp:tekton-learn$ oc version
oc v3.11.0+0cbc58b
kubernetes v1.11.0+d4cacc0
features: Basic-Auth

Server https://192.168.64.4:8443
kubernetes v1.11.0+d4cacc0

oc get mutatingwebhookconfiguration
NAME                 CREATED AT
webhook.tekton.dev   2019-07-15T21:20:34Z

@abergmeier
Author

Thanks for the quick fix.
