Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit events when we fail to update the taskrun #2526

Merged
merged 1 commit into from
May 13, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions pkg/reconciler/event.go
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@ const (
EventReasonFailed = "Failed"
// EventReasonStarted is the reason set for events about the start of TaskRuns / PipelineRuns
EventReasonStarted = "Started"
// EventReasonError is the reason set for events related to TaskRuns / PipelineRuns reconcile errors
EventReasonError = "Error"
)

// EmitEvent emits an event for object if afterCondition is different from beforeCondition
Expand Down Expand Up @@ -63,3 +65,10 @@ func EmitEvent(c record.EventRecorder, beforeCondition *apis.Condition, afterCon
}
}
}

// EmitErrorEvent emits a failure associated to an error
func EmitErrorEvent(c record.EventRecorder, err error, object runtime.Object) {
if err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels a bit odd to have this function take an error as an argument vs checking the error in the place where it originates - is this a pattern we do other places as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's a bit different because in this case the error is the event that we want to send.

On knative they went a bit further, they have now an event object that also implements the error interface.
They have a generated Reconcile that invokes a ReconcileKind which returns an event that may be an error - at least that's how I understood it - so this goes a bit in that direction.

I could check if the err is nil at the source, but it felt safer to do here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My plan for upcoming patches was to centralize the event logic in event.go i.e. to receive all requests events / error notifications from the taskrun/pipelinerun controller in there, and to decide there when to emit what, k8s event and/or cloudevent.

c.Event(object, corev1.EventTypeWarning, EventReasonError, err.Error())
}
}
107 changes: 65 additions & 42 deletions pkg/reconciler/event_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,9 @@ limitations under the License.
package reconciler

import (
"errors"
"fmt"
"strings"
"testing"
"time"

Expand All @@ -28,11 +31,10 @@ import (

func TestEmitEvent(t *testing.T) {
testcases := []struct {
name string
before *apis.Condition
after *apis.Condition
expectEvent bool
expectedEvent string
name string
before *apis.Condition
after *apis.Condition
wantEvent string
}{{
name: "unknown to true with message",
before: &apis.Condition{
Expand All @@ -44,8 +46,7 @@ func TestEmitEvent(t *testing.T) {
Status: corev1.ConditionTrue,
Message: "all done",
},
expectEvent: true,
expectedEvent: "Normal Succeeded all done",
wantEvent: "Normal Succeeded all done",
}, {
name: "true to true",
before: &apis.Condition{
Expand All @@ -58,8 +59,7 @@ func TestEmitEvent(t *testing.T) {
Status: corev1.ConditionTrue,
LastTransitionTime: apis.VolatileTime{Inner: metav1.NewTime(time.Now().Add(5 * time.Minute))},
},
expectEvent: false,
expectedEvent: "",
wantEvent: "",
}, {
name: "false to false",
before: &apis.Condition{
Expand All @@ -70,8 +70,7 @@ func TestEmitEvent(t *testing.T) {
Type: apis.ConditionSucceeded,
Status: corev1.ConditionFalse,
},
expectEvent: false,
expectedEvent: "",
wantEvent: "",
}, {
name: "unknown to unknown",
before: &apis.Condition{
Expand All @@ -86,26 +85,23 @@ func TestEmitEvent(t *testing.T) {
Reason: "foo",
Message: "bar",
},
expectEvent: true,
expectedEvent: "Normal foo bar",
wantEvent: "Normal foo bar",
}, {
name: "true to nil",
after: nil,
before: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionTrue,
},
expectEvent: false,
expectedEvent: "",
wantEvent: "",
}, {
name: "nil to true",
before: nil,
after: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionTrue,
},
expectEvent: true,
expectedEvent: "Normal Succeeded ",
wantEvent: "Normal Succeeded ",
}, {
name: "nil to unknown with message",
before: nil,
Expand All @@ -114,8 +110,7 @@ func TestEmitEvent(t *testing.T) {
Status: corev1.ConditionUnknown,
Message: "just starting",
},
expectEvent: true,
expectedEvent: "Normal Started ",
wantEvent: "Normal Started ",
}, {
name: "unknown to false with message",
before: &apis.Condition{
Expand All @@ -127,43 +122,71 @@ func TestEmitEvent(t *testing.T) {
Status: corev1.ConditionFalse,
Message: "really bad",
},
expectEvent: true,
expectedEvent: "Warning Failed really bad",
wantEvent: "Warning Failed really bad",
}, {
name: "nil to false",
before: nil,
after: &apis.Condition{
Type: apis.ConditionSucceeded,
Status: corev1.ConditionFalse,
},
expectEvent: true,
expectedEvent: "Warning Failed ",
wantEvent: "Warning Failed ",
}}

for _, ts := range testcases {
fr := record.NewFakeRecorder(1)
tr := &corev1.Pod{}
EmitEvent(fr, ts.before, ts.after, tr)
timer := time.NewTimer(1 * time.Second)

select {
case event := <-fr.Events:
if event == "" {
// The fake recorder reported empty, it should not happen
t.Fatalf("Expected event but got empty for %s", ts.name)
}
if !ts.expectEvent {
// The fake recorder reported an event which we did not expect
t.Errorf("Unxpected event \"%s\" but got one for %s", event, ts.name)
}
if !(event == ts.expectedEvent) {
t.Errorf("Expected event \"%s\" but got \"%s\" instead for %s", ts.expectedEvent, event, ts.name)
}
case <-timer.C:
if ts.expectEvent {
// The fake recorder did not report, the timer timeout expired
t.Errorf("Expected event but got none for %s", ts.name)
}
err := checkEvents(t, fr, ts.name, ts.wantEvent)
if err != nil {
t.Errorf(err.Error())
}
}
}

func TestEmitErrorEvent(t *testing.T) {
afrittoli marked this conversation as resolved.
Show resolved Hide resolved
testcases := []struct {
name string
err error
wantEvent string
}{{
name: "with error",
err: errors.New("something went wrong"),
wantEvent: "Warning Error something went wrong",
}, {
name: "without error",
err: nil,
wantEvent: "",
}}

for _, ts := range testcases {
fr := record.NewFakeRecorder(1)
tr := &corev1.Pod{}
EmitErrorEvent(fr, ts.err, tr)

err := checkEvents(t, fr, ts.name, ts.wantEvent)
if err != nil {
t.Errorf(err.Error())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be nice to add some context, e.g. "expected error when blah blah"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context is added in the error by checkEvents already.
I could add unexpected error when checking for events: %#v", err.Error() but I'm not sure it would add much.

}
}
}

func checkEvents(t *testing.T, fr *record.FakeRecorder, testName string, wantEvent string) error {
t.Helper()
timer := time.NewTimer(1 * time.Second)
select {
case event := <-fr.Events:
if wantEvent == "" {
return fmt.Errorf("received event \"%s\" for %s but none expected", event, testName)
}
if !(strings.HasPrefix(event, wantEvent)) {
return fmt.Errorf("expected event \"%s\" but got \"%s\" instead for %s", wantEvent, event, testName)
}
case <-timer.C:
if wantEvent != "" {
return fmt.Errorf("received no events for %s but %s expected", testName, wantEvent)
}
}
return nil
}
afrittoli marked this conversation as resolved.
Show resolved Hide resolved
23 changes: 11 additions & 12 deletions pkg/reconciler/taskrun/taskrun.go
Original file line number Diff line number Diff line change
Expand Up @@ -172,9 +172,7 @@ func (c *Reconciler) Reconcile(ctx context.Context, key string) error {
before := tr.Status.GetCondition(apis.ConditionSucceeded)
message := fmt.Sprintf("TaskRun %q was cancelled", tr.Name)
err := c.failTaskRun(tr, v1beta1.TaskRunReasonCancelled, message)
after := tr.Status.GetCondition(apis.ConditionSucceeded)
reconciler.EmitEvent(c.Recorder, before, after, tr)
return multierror.Append(err, c.updateStatusLabelsAndAnnotations(tr, original)).ErrorOrNil()
return c.finishReconcileUpdateEmitEvents(tr, original, before, err)
}

// Check if the TaskRun has timed out; if it is, this will set its status
Expand All @@ -183,9 +181,7 @@ func (c *Reconciler) Reconcile(ctx context.Context, key string) error {
before := tr.Status.GetCondition(apis.ConditionSucceeded)
message := fmt.Sprintf("TaskRun %q failed to finish within %q", tr.Name, tr.GetTimeout())
err := c.failTaskRun(tr, podconvert.ReasonTimedOut, message)
after := tr.Status.GetCondition(apis.ConditionSucceeded)
reconciler.EmitEvent(c.Recorder, before, after, tr)
return multierror.Append(err, c.updateStatusLabelsAndAnnotations(tr, original)).ErrorOrNil()
return c.finishReconcileUpdateEmitEvents(tr, original, before, err)
}

// prepare fetches all required resources, validates them together with the
Expand All @@ -194,11 +190,9 @@ func (c *Reconciler) Reconcile(ctx context.Context, key string) error {
taskSpec, rtr, err := c.prepare(ctx, tr)
if err != nil {
c.Logger.Errorf("TaskRun prepare error: %v", err.Error())
after := tr.Status.GetCondition(apis.ConditionSucceeded)
reconciler.EmitEvent(c.Recorder, nil, after, tr)
// We only return an error if update failed, otherwise we don't want to
// reconcile an invalid TaskRun anymore
return c.updateStatusLabelsAndAnnotations(tr, original)
return c.finishReconcileUpdateEmitEvents(tr, original, nil, nil)
}

// Store the condition before reconcile
Expand All @@ -210,10 +204,15 @@ func (c *Reconciler) Reconcile(ctx context.Context, key string) error {
c.Logger.Errorf("Reconcile error: %v", err.Error())
}
// Emit events (only when ConditionSucceeded was changed)
after := tr.Status.GetCondition(apis.ConditionSucceeded)
reconciler.EmitEvent(c.Recorder, before, after, tr)
return c.finishReconcileUpdateEmitEvents(tr, original, before, err)
}

return multierror.Append(err, c.updateStatusLabelsAndAnnotations(tr, original)).ErrorOrNil()
func (c *Reconciler) finishReconcileUpdateEmitEvents(tr, original *v1beta1.TaskRun, beforeCondition *apis.Condition, previousError error) error {
afterCondition := tr.Status.GetCondition(apis.ConditionSucceeded)
reconciler.EmitEvent(c.Recorder, beforeCondition, afterCondition, tr)
err := c.updateStatusLabelsAndAnnotations(tr, original)
reconciler.EmitErrorEvent(c.Recorder, err, tr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personally i think id rather see an explicit "if err" here vs having EmitErrorEvent handle it, just cuz it doesnt seem consistent with how we usually do error handling - maybe you have a bigger plan around this tho that im not seeing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discovered https://github.com/knative/pkg/tree/master/injection#generated-reconciler-responsibilities after I wrote this patch. Maybe I should just try and adopt that :P

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to centralize the logic about what event to send when of what type in events.go, so I included there the check for nil too

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I havent seen https://github.com/knative/pkg/tree/master/injection#generated-reconciler-responsibilities before! Glancing at it I'm not 100% sure what implications it has for this PR, im really glad to see some docs for pkg tho.

I'm trying to centralize the logic about what event to send when of what type in events.go, so I included there the check for nil too

okay, maybe it will make more sense to me as it evolves

return multierror.Append(previousError, err).ErrorOrNil()
}

func (c *Reconciler) getTaskResolver(tr *v1beta1.TaskRun) (*resources.LocalTaskRefResolver, v1beta1.TaskKind) {
Expand Down
11 changes: 10 additions & 1 deletion test/cancel_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ package test

import (
"encoding/json"
"fmt"
"sync"
"testing"

Expand Down Expand Up @@ -171,14 +172,22 @@ func TestTaskRunPipelineRunCancel(t *testing.T) {
wg.Wait()

matchKinds := map[string][]string{"PipelineRun": {"pear"}, "TaskRun": trName}
// Expected failure events: 1 for the pipelinerun cancel, 1 for each TaskRun
expectedNumberOfEvents := 1 + len(trName)
t.Logf("Making sure %d events were created from pipelinerun with kinds %v", expectedNumberOfEvents, matchKinds)
events, err := collectMatchingEvents(c.KubeClient, namespace, matchKinds, "Failed")
if err != nil {
t.Fatalf("Failed to collect matching events: %q", err)
}
if len(events) != expectedNumberOfEvents {
t.Fatalf("Expected %d number of successful events from pipelinerun and taskrun but got %d; list of received events : %#v", expectedNumberOfEvents, len(events), events)
collectedEvents := ""
for i, event := range events {
collectedEvents += fmt.Sprintf("%#v", event)
if i < (len(events) - 1) {
collectedEvents += ", "
}
}
t.Fatalf("Expected %d number of successful events from pipelinerun and taskrun but got %d; list of received events : %#v", expectedNumberOfEvents, len(events), collectedEvents)
afrittoli marked this conversation as resolved.
Show resolved Hide resolved
}
})
}
Expand Down