Skip to content

Commit

Permalink
Emit events for all TaskRun lifecycle events
Browse files Browse the repository at this point in the history
Emit events for additional TaskRun lifecyle events:
- taskrun started
- taskrun running
- taskrun timeout

Fix the logic in events.go to compare semantic equality as
opposed to raw pointer equality.
Fix broken EmitEvents unit tests and extend them to cover new
functionality.

Extend reconcile test to verify new events are sent. To do so,
get the event recorder from the context when creating the
controller - if avaialble. This allows using the fake recorder
for testing instead of having to look for event related actions
in the fake client go action list.

Add documentation on events.

Fixes #2328
Work towards #2082
  • Loading branch information
afrittoli authored and tekton-robot committed May 1, 2020
1 parent 2cb262a commit c76a6ad
Show file tree
Hide file tree
Showing 9 changed files with 339 additions and 86 deletions.
39 changes: 39 additions & 0 deletions docs/events.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
<!--
---
linkTitle: "Events"
weight: 2
---
-->
# Events

Tekton runtime resources, specifically `TaskRuns` and `PipelineRuns`,
emit events when they are executed, so that users can monitor their lifecycle
and react to it. Tekton emits [kubernetes events](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#event-v1-core), that can be retrieve from the resource via
`kubectl describe [resource]`.

No events are emitted for `Conditions` today (https://github.com/tektoncd/pipeline/issues/2461).

## TaskRuns

`TaskRun` events are generated for the following `Reasons`:
- `Started`: this is triggered the first time the `TaskRun` is picked by the
reconciler from its work queue, so it only happens if web-hook validation was
successful. Note that this event does not imply that a step started executing,
as several conditions must be met first:
- task and bound resource validation must be successful
- attached conditions must run successfully
- the `Pod` associated to the `TaskRun` must be successfully scheduled
- `Succeeded`: this is triggered once all steps in the `TaskRun` are executed
successfully, including post-steps injected by Tekton.
- `Failed`: this is triggered if the `TaskRun` is completed, but not successfully.
Causes of failure may be: one the steps failed, the `TaskRun` was cancelled or
the `TaskRun` timed out.

## PipelineRuns

`PipelineRun` events are generated for the following `Reasons`:
- `Succeeded`: this is triggered once all `Tasks` reachable via the DAG are
executed successfully.
- `Failed`: this is triggered if the `PipelineRun` is completed, but not
successfully. Causes of failure may be: one the `Tasks` failed or the
`PipelineRun` was cancelled.
2 changes: 2 additions & 0 deletions docs/pipelineruns.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,8 @@ weight: 4
- [Specifying `LimitRange` values](#specifying-limitrange-values)
- [Configuring a failure timeout](#configuring-a-failure-timeout)
- [Cancelling a `PipelineRun`](#cancelling-a-pipelinerun)
- [Events](events.md#pipelineruns)



## Overview
Expand Down
1 change: 1 addition & 0 deletions docs/taskruns.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ weight: 2
- [Monitoring `Steps`](#monitoring-steps)
- [Monitoring `Results`](#monitoring-results)
- [Cancelling a `TaskRun`](#cancelling-a-taskrun)
- [Events](events.md#taskruns)
- [Code examples](#code-examples)
- [Example `TaskRun` with a referenced `Task`](#example-taskrun-with-a-referenced-task)
- [Example `TaskRun` with an embedded `Task`](#example-taskrun-with-an-embedded-task)
Expand Down
31 changes: 25 additions & 6 deletions pkg/reconciler/event.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ package reconciler

import (
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/equality"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/client-go/tools/record"
"knative.dev/pkg/apis"
Expand All @@ -27,20 +28,38 @@ const (
EventReasonSucceded = "Succeeded"
// EventReasonFailed is the reason set for events about unsuccessful completion of TaskRuns / PipelineRuns
EventReasonFailed = "Failed"
// EventReasonStarted is the reason set for events about the start of TaskRuns / PipelineRuns
EventReasonStarted = "Started"
)

// EmitEvent emits success or failed event for object
// if afterCondition is different from beforeCondition
// EmitEvent emits an event for object if afterCondition is different from beforeCondition
//
// Status "ConditionUnknown":
// beforeCondition == nil, emit EventReasonStarted
// beforeCondition != nil, emit afterCondition.Reason
//
// Status "ConditionTrue": emit EventReasonSucceded
// Status "ConditionFalse": emit EventReasonFailed
//
func EmitEvent(c record.EventRecorder, beforeCondition *apis.Condition, afterCondition *apis.Condition, object runtime.Object) {
if beforeCondition != afterCondition && afterCondition != nil {
// Create events when the obj result is in.
if !equality.Semantic.DeepEqual(beforeCondition, afterCondition) && afterCondition != nil {
// If the condition changed, and the target condition is not empty, we send an event
switch afterCondition.Status {
case corev1.ConditionTrue:
c.Event(object, corev1.EventTypeNormal, EventReasonSucceded, afterCondition.Message)
case corev1.ConditionUnknown:
c.Event(object, corev1.EventTypeNormal, afterCondition.Reason, afterCondition.Message)
case corev1.ConditionFalse:
c.Event(object, corev1.EventTypeWarning, EventReasonFailed, afterCondition.Message)
case corev1.ConditionUnknown:
if beforeCondition == nil {
// If the condition changed, the status is "unknown", and there was no condition before,
// we emit the "Started event". We ignore further updates of the "unknown" status.
c.Event(object, corev1.EventTypeNormal, EventReasonStarted, "")
} else {
// If the condition changed, the status is "unknown", and there was a condition before,
// we emit an event that matches the reason and message of the condition.
// This is used for instance to signal the transition from "started" to "running"
c.Event(object, corev1.EventTypeNormal, afterCondition.Reason, afterCondition.Message)
}
}
}
}
107 changes: 87 additions & 20 deletions pkg/reconciler/event_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,38 +21,45 @@ import (
"time"

corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/tools/record"
"knative.dev/pkg/apis"
)

func TestEmitEvent(t *testing.T) {
testcases := []struct {
name string
before *apis.Condition
after *apis.Condition
expectEvent bool
name string
before *apis.Condition
after *apis.Condition
expectEvent bool
expectedEvent string
}{{
name: "unknown to true",
name: "unknown to true with message",
before: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionUnknown,
},
after: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionTrue,
Type: apis.ConditionSucceeded,
Status: corev1.ConditionTrue,
Message: "all done",
},
expectEvent: true,
expectEvent: true,
expectedEvent: "Normal Succeeded all done",
}, {
name: "true to true",
before: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionTrue,
Type: apis.ConditionSucceeded,
Status: corev1.ConditionTrue,
LastTransitionTime: apis.VolatileTime{Inner: metav1.NewTime(time.Now())},
},
after: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionTrue,
Type: apis.ConditionSucceeded,
Status: corev1.ConditionTrue,
LastTransitionTime: apis.VolatileTime{Inner: metav1.NewTime(time.Now().Add(5 * time.Minute))},
},
expectEvent: false,
expectEvent: false,
expectedEvent: "",
}, {
name: "false to false",
before: &apis.Condition{
Expand All @@ -63,23 +70,74 @@ func TestEmitEvent(t *testing.T) {
Type: apis.ConditionSucceeded,
Status: corev1.ConditionFalse,
},
expectEvent: false,
expectEvent: false,
expectedEvent: "",
}, {
name: "unknown to unknown",
before: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionUnknown,
Reason: "",
Message: "",
},
after: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionUnknown,
Reason: "foo",
Message: "bar",
},
expectEvent: true,
expectedEvent: "Normal foo bar",
}, {
name: "true to nil",
after: nil,
before: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionTrue,
},
expectEvent: true,
expectEvent: false,
expectedEvent: "",
}, {
name: "nil to true",
before: nil,
after: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionTrue,
},
expectEvent: true,
expectEvent: true,
expectedEvent: "Normal Succeeded ",
}, {
name: "nil to unknown with message",
before: nil,
after: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionUnknown,
Message: "just starting",
},
expectEvent: true,
expectedEvent: "Normal Started ",
}, {
name: "unknown to false with message",
before: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionUnknown,
},
after: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionFalse,
Message: "really bad",
},
expectEvent: true,
expectedEvent: "Warning Failed really bad",
}, {
name: "nil to false",
before: nil,
after: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionFalse,
},
expectEvent: true,
expectedEvent: "Warning Failed ",
}}

for _, ts := range testcases {
Expand All @@ -90,12 +148,21 @@ func TestEmitEvent(t *testing.T) {

select {
case event := <-fr.Events:
if ts.expectEvent && event == "" {
t.Errorf("Expected event but got empty for %s", ts.name)
if event == "" {
// The fake recorder reported empty, it should not happen
t.Fatalf("Expected event but got empty for %s", ts.name)
}
case <-timer.C:
if !ts.expectEvent {
t.Errorf("Unexpected event but got for %s", ts.name)
// The fake recorder reported an event which we did not expect
t.Errorf("Unxpected event \"%s\" but got one for %s", event, ts.name)
}
if !(event == ts.expectedEvent) {
t.Errorf("Expected event \"%s\" but got \"%s\" instead for %s", ts.expectedEvent, event, ts.name)
}
case <-timer.C:
if ts.expectEvent {
// The fake recorder did not report, the timer timeout expired
t.Errorf("Expected event but got none for %s", ts.name)
}
}
}
Expand Down
2 changes: 2 additions & 0 deletions pkg/reconciler/taskrun/controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ const (
resyncPeriod = 10 * time.Hour
)

// NewController instantiates a new controller.Impl from knative.dev/pkg/controller
func NewController(images pipeline.Images) func(context.Context, configmap.Watcher) *controller.Impl {
return func(ctx context.Context, cmw configmap.Watcher) *controller.Impl {
logger := logging.FromContext(ctx)
Expand All @@ -66,6 +67,7 @@ func NewController(images pipeline.Images) func(context.Context, configmap.Watch
ConfigMapWatcher: cmw,
ResyncPeriod: resyncPeriod,
Logger: logger,
Recorder: controller.GetEventRecorder(ctx),
}

entrypointCache, err := pod.NewEntrypointCache(kubeclientset)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,7 @@ func cloudEventDeliveryFromTargets(targets []string) []v1alpha1.CloudEventDelive
}

// SendCloudEvents is used by the TaskRun controller to send cloud events once
// the TaskRun is complete. `tr` is used to obtain the list of targets but also
// to construct the body of the
// the TaskRun is complete. `tr` is used to obtain the list of targets
func SendCloudEvents(tr *v1alpha1.TaskRun, ceclient CEClient, logger *zap.SugaredLogger) error {
logger = logger.With(zap.String("taskrun", tr.Name))

Expand Down
20 changes: 13 additions & 7 deletions pkg/reconciler/taskrun/taskrun.go
Original file line number Diff line number Diff line change
Expand Up @@ -103,11 +103,16 @@ func (c *Reconciler) Reconcile(ctx context.Context, key string) error {

// If the TaskRun is just starting, this will also set the starttime,
// from which the timeout will immediately begin counting down.
tr.Status.InitializeConditions()
// In case node time was not synchronized, when controller has been scheduled to other nodes.
if tr.Status.StartTime.Sub(tr.CreationTimestamp.Time) < 0 {
c.Logger.Warnf("TaskRun %s createTimestamp %s is after the taskRun started %s", tr.GetRunKey(), tr.CreationTimestamp, tr.Status.StartTime)
tr.Status.StartTime = &tr.CreationTimestamp
if !tr.HasStarted() {
tr.Status.InitializeConditions()
// In case node time was not synchronized, when controller has been scheduled to other nodes.
if tr.Status.StartTime.Sub(tr.CreationTimestamp.Time) < 0 {
c.Logger.Warnf("TaskRun %s createTimestamp %s is after the taskRun started %s", tr.GetRunKey(), tr.CreationTimestamp, tr.Status.StartTime)
tr.Status.StartTime = &tr.CreationTimestamp
}
// Emit events
afterCondition := tr.Status.GetCondition(apis.ConditionSucceeded)
reconciler.EmitEvent(c.Recorder, nil, afterCondition, tr)
}

// If the TaskRun is complete, run some post run fixtures when applicable
Expand Down Expand Up @@ -200,7 +205,6 @@ func (c *Reconciler) Reconcile(ctx context.Context, key string) error {
if err = c.reconcile(ctx, tr, taskSpec, rtr); err != nil {
c.Logger.Errorf("Reconcile error: %v", err.Error())
}

// Emit events (only when ConditionSucceeded was changed)
after := tr.Status.GetCondition(apis.ConditionSucceeded)
reconciler.EmitEvent(c.Recorder, before, after, tr)
Expand Down Expand Up @@ -337,6 +341,9 @@ func (c *Reconciler) reconcile(ctx context.Context, tr *v1alpha1.TaskRun,
if k8serrors.IsNotFound(err) {
// Keep going, this will result in the Pod being created below.
} else if err != nil {
// This is considered a transient error, so we return error, do not update
// the task run condition, and return an error which will cause this key to
// be requeued for reconcile.
c.Logger.Errorf("Error getting pod %q: %v", tr.Status.PodName, err)
return err
}
Expand Down Expand Up @@ -401,7 +408,6 @@ func (c *Reconciler) reconcile(ctx context.Context, tr *v1alpha1.TaskRun,
}

c.Logger.Infof("Successfully reconciled taskrun %s/%s with status: %#v", tr.Name, tr.Namespace, tr.Status.GetCondition(apis.ConditionSucceeded))

return nil
}

Expand Down
Loading

0 comments on commit c76a6ad

Please sign in to comment.