-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Conditions in the BuildRun resource #515
Add support for Conditions in the BuildRun resource #515
Conversation
Add test cases for the three main status condition: - unknown (ramp-up and early phases of buildrun) - true (buildrun runs through fine) - false (error scenarios like timeout or misconfiguration) Signed-off-by: Enrique Eduardo Encalada Olivas <[email protected]>
Add function to translate the taskrun condition into a buildrun one. Signed-off-by: Enrique Eduardo Encalada Olivas <[email protected]>
Externalize logic to get the effective timeout from build and builrun tuple into a separate function. Signed-off-by: Enrique Eduardo Encalada Olivas <[email protected]>
3bac41b
to
c0516bd
Compare
case v1beta1.TaskRunReasonTimedOut: | ||
reason = "BuildRunTimeout" | ||
message = fmt.Sprintf("BuildRun %s failed to finish within %s", | ||
taskRun.GetLabels()[buildv1alpha1.LabelBuildRun], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - we probably don't need the name since in the message will be inside the BuildRun object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a user that is looking at the Conditions directly from the BuildRun
I agree, but for the scenarios where someone is building a workflow on top(e.g. Shipwright/CLI), this will help to make the message very explicit. Therefore I would opt to not change this.
) | ||
|
||
case v1beta1.TaskRunReasonFailed: | ||
if taskRun.Status.CompletionTime != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the TaskRun is not complete? Would we expect this to be a time out error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory no. If the TaskRun Condition.Reason is Failed
, this means that something failed in the Pod and Completion will be set always after the Failed
is set in the TaskRun Condition.Reason. You can dive in the code to verify that part.
Also, Tekton have this nice table that illustrates the above, see table from tekton. You will find there that whevener the Reason
is Failed
, the CompletionTime is set.
Therefore, to answer your question, a TaskRun
cannot be Failed
and "not complete" .
} | ||
|
||
if failedContainer != nil { | ||
message = fmt.Sprintf("buildrun step failed in pod %s, for detailed information: kubectl --namespace %s logs %s --container=%s", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we bubble up terminated state message from the pod here, rather than a "look at the logs" message? https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#containerstateterminated-v1-core
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adambkaplan interesting, let me play with that tomorrow, and then I can give you a proper answer. Note: This constructed message is very similar to what Tekton does on a failed Taskrun, it provides you some metadata + the kubectl
cmd to execute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adambkaplan I took a look on this, this will not work, while Tekton does not exposes internal results. They override the message with InternalTektonResult
, this will look like:
state:
terminated:
containerID: containerd://fd0627262107d2339abc75261afd4ca66a91f1b6a3f19e2a3ae68a9ff817138e
exitCode: 1
finishedAt: "2020-12-14T17:06:19Z"
message: '[{"key":"StartedAt","value":"2020-12-14T17:06:19.706Z","type":"InternalTektonResult"}]'
reason: Error
startedAt: "2020-12-14T17:06:13Z"
I think the current approach is good for us, also it will help users to move into the right direction, which is "getting the logs of the failed step".
531be8a
to
7d9b581
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sry for the late feedback. Looks good. Just a few suggestions.
7d9b581
to
f735d89
Compare
Change `Succeeded` column to use the Succeeded condition Status field. Change `Reason` column to use the Succeeded condition Ready field. Add `Message` column based on the Succeeded condition Message field.
Update `buildrun.md` documentation with a new example that shows the new columns.
Small refactoring to ensure we are asserting the fields of the Conditions object, rather than the old Status fields Signed-off-by: Matthias Diester <[email protected]>
f735d89
to
c0ffc16
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: SaschaSchwarze0 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Sry @adambkaplan, this was not desired. Wanted to wait for your feedback. I still do not understand why the approved tag is added for me when I did the GitHub approve vs this did not happen for Jordan. If my |
Don't know if you got the answer to your question @SaschaSchwarze0 but a lgtm from an approver in the OWNERS file adds both lgtm and approved labels.... your ID is listed as an approver in https://github.com/shipwright-io/build/blob/master/OWNERS for example, at least with the latest version I don't know the precise github wiring that makes this happen but it is the same in the openshift repos. By comparison, @zhangtbj 's review approval via #515 (review) is not the same as an PR lgtm or approve wrt the github wiring I am referring to. |
I've added discussing any post merge actions to the Jan 11 community agenda ^^ |
Duh, forgot the automatic approve process provides a link to the process and how things work. The lgtm from owner adding approve quirk is discussed at https://github.com/kubernetes/community/blob/master/contributors/guide/owners.md#quirks-of-the-process This is all part of the k8s prow infrastructure used by k8s and openshift, which has been adopted for our repo.
|
Interesting. I am not sure if I operated something wrong before. As I know, there are many reviewers for a PR. and if I review this PR or be asked to review this PR and after it looks good to me. I will or what I thought:
But seems the last action is done automatically if approver approve the PR before? But for sure, let us discuss and confirm the correct process in next meeting. Thanks! |
Yeah, probably, for somebody who worked on the OSS space for some time, this is normal. For somebody new it is imo not intuitive: there are the Anyway, joking now, software engineers often tend to try to type as less as possible while still doing as much as possible. I personal am capable and have the time to write both commands. But anyway, software engineers also need to be capable of learning non-logical things as they exist everywhere. Cheers. |
Fixes #307
In a nutshell this PR adds support for Conditions, but keep the old fields
Status.Reason
andStatus.Succeeded
for backwards compatibility.(we should remove them at some point in the future).Based on each commit message(6 commits), this PR is doing the following changes:
Add condition test cases to integration test. Adds test cases for the three main status condition:
Introduce conditions for buildruns
Refactor build and buildrun timeout logic
Change CRD printer columns to use conditions
Succeeded
column to use the Succeeded condition Status field.Reason
column to use the Succeeded condition Ready field.Message
column based on the Succeeded condition Message field.Update docs with new printer output
buildrun.md
documentation with a new example that shows the new columns.Update e2e tests with BuildRun Conditions
If you are wondering how this looks:
For Succeeded BuildRuns:
For Failed BuildRuns:
On this scenario, we introduced a new field, that will be only populated on failures. This is useful for third-parties that wanna get the exact container that failed on the pod, by consuming the values of this new field(status.failedAt)
For BuildRuns that timedOut: