[WIP] Link git resources from previous task #255
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: shashwathi. If they are not already assigned, you can assign the PR to them. The full list of commands accepted by this bot can be found here; the pull request process is described here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
Nice! The design approach sounds reasonable to me at first glance; I'll dig into the code later to make sure I understand it. =) I assume this general approach could be used for the sort of thing I was asking about in Slack earlier, i.e., passing a file or set of files from one Task to another? (Like, say, building a war file in the build task and then passing it on to the test task to do stuff with.) I know that's not something that feels particularly k8s-native, but I am still thinking about how to use the Pipeline CRD as the guts of a general-purpose pipeline execution engine that runs on k8s but isn't necessarily deploying to k8s, if that makes any sense.
/test pull-knative-build-pipeline-integration-tests

@shashwathi - done! #259
I think this approach makes a lot of sense @shashwathi! I need to look at the code in more detail as well, but one question at the outset about fan out and fan in: I can see that you have an example that has a Task getting a git resource from two previous Tasks.
What happens if both
- TODO: refactor, tests
- start adding unit test
The following is the coverage report on pkg/.
@shashwathi: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@shashwathi I like the design approach; using a PVC seems like the most reasonable option, but I do think that once the build code is migrated to Task, the code can be made more straightforward. @bobcatfish I don't think this is a scenario that should be allowed: two tasks updating the same git resource when running in parallel, and a third task depending on both of them. We might need to have some kind of pipeline consistency checker/validator for invalid scenarios.
Design makes sense for me too (well, I'm late to the party to say that 😛)

@pivotal-nader-ziada agreeing on that; validating that, we could "enforce" that people use differently named resources in those cases (or no parallel execution).
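The consistency check discussed above could be sketched as follows. This is an illustrative sketch, not code from this PR: `validateProviders` and its map layout are hypothetical names, standing in for a validator that rejects a resource being written by multiple parallel tasks that a downstream task depends on.

```go
package main

import "fmt"

// validateProviders maps each resource name to the tasks that write
// (provide) it. A resource provided by more than one task is ambiguous
// for a downstream consumer when those providers run in parallel, so
// the validator rejects it.
func validateProviders(providers map[string][]string) error {
	for resource, tasks := range providers {
		if len(tasks) > 1 {
			return fmt.Errorf("resource %q is provided by multiple tasks %v; use differently named resources or serialize the tasks", resource, tasks)
		}
	}
	return nil
}

func main() {
	ok := map[string][]string{"repo-a": {"task1"}, "repo-b": {"task2"}}
	bad := map[string][]string{"repo": {"task1", "task2"}}
	fmt.Println(validateProviders(ok))
	fmt.Println(validateProviders(bad))
}
```

This only catches the write/write ambiguity; a fuller validator would also walk the task graph for cycles and missing providers.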
func (pr *PipelineRun) GetPVC() *corev1.PersistentVolumeClaim {
	var pvcSizeBytes int64
	pvcSizeBytes = 5 * 1024 * 1024 * 1024 // 5 GBs
most likely in a follow-up but we may want to make this configurable
}

newSteps = append(newSteps, []corev1.Container{{
	Name:  fmt.Sprintf("source-copy-make-%s", source.Name),
	Image: "ubuntu",
We may want to use a smaller image like busybox (as knative/build does, if I'm correct).
Yeah, agreed. No need for such a big image for copying purposes.
}, {
	Name:    fmt.Sprintf("source-copy-%s", source.Name),
	Image:   "ubuntu",
	Command: []string{"/bin/cp"},
Genuine naive question: cp or rsync? 😋
cp. I did consider rsync, but from my understanding it is the better tool when parts of files are updated, rather than a complete set of files.
Closing this in favor of #270.
In this PR I am proposing a solution for passing constraints between tasks.
Considered solutions:

External blob storage like s3, gcs
Pros: Operator can configure the bucket at the controller level or at the pipeline level. Easy to create directories with pipelinerun/taskrun names so they never clash.
Cons: It is an extra burden on the operator to set this up (main pain point). Additionally, the bucket becomes an external entity outside the k8s ecosystem. If the controller crashes or objects get deleted, it is possible buckets might be orphaned.

External server like s3, backed by a PVC
Pros: Operator does not have to set up anything. Only one PVC, possibly backing a web server which takes requests to write to the file system and fetch from it.
Cons: I would be re-implementing a bad s3 or gcs version :D I would have to handle concurrent read/write requests, and make the requests efficient by compressing file systems, so average pipelinerun time would become higher. Lots of new code to write, which is never a great idea.
In this PR I am choosing the route of using a PVC within the lifetime of the pipelinerun.
Pros: Operator doesn't have to worry about any external setup. The PVC is created and deleted within the lifetime of the pipelinerun.
Cons: In a few IaaSes, creating a PVC is time consuming. Operators could possibly make this faster by preallocating disks ahead of time, or by reusing from a pool (I have not thought of a foolproof plan to solve this yet).
Compared to the other solutions I am leaning towards this one, as it does not require any new setup from the operator or user. The objects created (PVC) are all in the same k8s ecosystem and are cleaned up when the pipelinerun completes.
Now onto implementation details
A PVC named after the pipelinerun is created as the pipelinerun starts, and this information is passed on to the taskrun objects created, to construct build steps that copy sources via cp -r source destination containers. In the case of prebuildsteps, the build source will already have created the path, so this step overrides the existing source folder with the altered previous source. In the case of postbuildsteps, the build step first makes the dir on the PVC and then copies the resource folder onto the PVC (hence 2 steps; they could be merged as an improvement).

TODO:
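The copy steps described above could be sketched with simplified stand-in types. The `container` struct, the busybox image choice (suggested in review over ubuntu), and the paths are illustrative assumptions, not the PR's exact values:

```go
package main

import "fmt"

// container is a simplified stand-in for corev1.Container, keeping only
// the fields relevant to the copy steps.
type container struct {
	Name    string
	Image   string
	Command []string
	Args    []string
}

// postBuildCopySteps sketches the two steps appended after a build:
// first make the target dir on the PVC, then copy the resource into it.
func postBuildCopySteps(resourceName, pvcDir string) []container {
	return []container{{
		Name:    fmt.Sprintf("source-copy-make-%s", resourceName),
		Image:   "busybox",
		Command: []string{"mkdir"},
		Args:    []string{"-p", pvcDir},
	}, {
		Name:    fmt.Sprintf("source-copy-%s", resourceName),
		Image:   "busybox",
		Command: []string{"cp"},
		Args:    []string{"-r", "/workspace/" + resourceName, pvcDir},
	}}
}

func main() {
	for _, s := range postBuildCopySteps("git-repo", "/pvc/task1/git-repo") {
		fmt.Println(s.Name, s.Command, s.Args)
	}
}
```

The prebuildstep variant would be a single cp step, since the build source has already created the target path.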
You might be wondering: with no tests, how do I know this PR works?

I have added 2 pipelinerun samples under examples/test-pvc and examples/test-pvc-2, which implement 2 different pipeline styles:

examples/test-pvc - the vanilla use case of a github repo being passed from task 1 -> task 2
examples/test-pvc-2 - a slightly more complex use case in which task 1 and task 2 alter a git resource, and task 3 checks that the changes from tasks 1 and 2 are present

What more changes does this PR include?
Resources are not always dumped into /workspace; this PR would be too easy in that case. When we add support for target path later it would be too complex to figure out, hence I have included that complexity now to support future and existing use cases. examples/test-pvc includes a test with targetpath. I am also trying to show that the proposed solution is not fragile.

What will not be included?
This PR only includes support for git resources. Support for images involves some design of metadata, and I would like to follow up with another PR for that.

I would like to hear your thoughts and concerns about the approach and implementation. I will add the required tests when I have a 👍.
cc @bobcatfish @pivotal-nader-ziada @imjasonh @abayer