Optimization of copying output resources to PVC(or intermediate storage location) #398

Closed
shashwathi opened this issue Jan 16, 2019 · 5 comments
Labels
design This task is about creating and discussing a design help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. meaty-juicy-coding-work This task is mostly about implementation!!! And docs and tests of course but that's a given

Comments

@shashwathi
Contributor

shashwathi commented Jan 16, 2019

Expected Behavior

Copy output resources to the PVC (or intermediate storage location) only if a later task in the pipelinerun uses them.

Actual Behavior

TaskRun copies all output resources to the PVC, regardless of whether a later taskrun in the pipelinerun uses them.

Steps to Reproduce the Problem

  1. Create a pipelinerun with a task
  2. Check the pod spec: the last 2 steps create a directory in the PVC and copy the contents into it

Additional Info

We could use the DAG functionality to determine whether an output resource is used by later tasks, and copy it only in that case.
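A minimal sketch of that check, written in Python for brevity rather than as Tekton's actual Go implementation. The `PipelineTask` and `Input` types and the `from_tasks` field are hypothetical stand-ins for however a pipeline records that an input is fed by another task's output. An output needs copying only if some task's input links back to it.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Input:
    resource: str  # pipeline-level resource name
    # Hypothetical link: names of tasks whose output feeds this input.
    from_tasks: List[str] = field(default_factory=list)


@dataclass
class PipelineTask:
    name: str
    inputs: List[Input] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)  # pipeline-level resource names


def outputs_needing_copy(tasks: List[PipelineTask]) -> Set[Tuple[str, str]]:
    """Return the (producer task, resource) pairs that another task consumes."""
    produced = {(t.name, r) for t in tasks for r in t.outputs}
    consumed = {
        (producer, inp.resource)
        for t in tasks
        for inp in t.inputs
        for producer in inp.from_tasks
    }
    # Only outputs that are both produced and linked from an input need the PVC.
    return produced & consumed


tasks = [
    PipelineTask(
        "build",
        inputs=[Input("source-repo")],
        outputs=["web-image", "ci-artifact"],
    ),
    PipelineTask("deploy", inputs=[Input("web-image", from_tasks=["build"])]),
]
needed = outputs_needing_copy(tasks)
```

In this example only `web-image` is linked from a later task, so `ci-artifact` would not need to be copied. If the returned set is empty, no copy steps are needed at all.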

@afrittoli
Member

Output/logs will still be on the Task's own PVC, right? So this would only change whether they are copied to the pipeline PVC?

@bobcatfish
Collaborator

hm interesting, I wonder how we can make the TaskRun aware of this without making it too aware of or coupled to Pipeline runs 🤔

@shashwathi is this mostly to save time? or save storage space? something else?

Output/logs will still be on the Task's own PVC, right? So this would only change whether they are copied to the pipeline PVC?

That's right @afrittoli but just note that for #387 it's pretty likely we'll end up getting rid of that log PVC!

@bobcatfish bobcatfish added help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. design This task is about creating and discussing a design meaty-juicy-coding-work This task is mostly about implementation!!! And docs and tests of course but that's a given labels Jan 22, 2019
@shashwathi
Contributor Author

is this mostly to save time?

Yes, mostly to avoid adding steps that add no value. It also reduces the execution time of the taskrun.

Logs are not in the scope of this issue. It is about handling output resources intelligently instead of copying them blindly.

@cccfeng
Contributor

cccfeng commented Jul 1, 2019

Hi everyone, I totally agree with what @shashwathi says.

In scenarios where a pipeline contains only one task, or where a task's output resource is not referenced by any subsequent task, Tekton may not need to copy output resources into a PVC or remote bucket. Inferring this is clever, but it may not be appropriate as the default behavior.

I think it's better to parameterize this action and let the user configure it explicitly, rather than deduce from the DAG whether to copy an output resource. Users would then be aware of what happens: for an artifact produced by a CI task, for example, they can specify that it is only copied to a remote bucket for storage, or that it is copied to the PVC created by the pipeline for subsequent ls/cat actions or other purposes.

Proposal

Could we extend pipeline.Spec.tasks[n].resources.outputs[n] (PipelineTaskOutputResource) with a bool attribute such as CustomUploadOnly (or something more expressive)?

  • With customUploadOnly at its default value of false, the output resource is copied into the PVC/bucket, retaining the existing behavior
  • With customUploadOnly set to true, the system does not copy the output resource into the PVC/bucket. Users can store the artifact with GCS or any other extended pipeline resource, such as OSS
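The two cases above could be sketched as follows. This is an illustration in Python, not Tekton's Go code: the `custom_upload_only` field mirrors the proposed attribute, which does not exist in the current API, and the helper name `copy_to_intermediate_storage` is made up for this example.

```python
from dataclasses import dataclass


@dataclass
class PipelineTaskOutputResource:
    name: str
    resource: str
    # Hypothetical field from the proposal; False keeps the existing behavior.
    custom_upload_only: bool = False


def copy_to_intermediate_storage(output: PipelineTaskOutputResource) -> bool:
    """Return True if the system should copy this output into the PVC/bucket."""
    return not output.custom_upload_only


default_output = PipelineTaskOutputResource(name="builtImage", resource="ci-artifact")
opted_out = PipelineTaskOutputResource(
    name="builtImage", resource="ci-artifact", custom_upload_only=True
)
```

Because the field defaults to false, pipelines that do not set it would keep being copied as before; only outputs that opt out are skipped.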

Yaml

kind: Pipeline
metadata:
  name: demo-pipeline
spec:
  resources:
  - name: source-repo
    type: git
  - name: ci-artifact
    type: storage
  tasks:
  - name: xxxx-ci-pt
    taskRef:
      name: xxx-ci-task
    resources:
      inputs:
      - name: workspace
        resource: source-repo
      outputs:
      - name: builtImage
        resource: ci-artifact
        customUploadOnly: true  # proposed field, not part of the current API

I look forward to your reply, thanks.

@ghost

ghost commented Apr 9, 2020

I believe this is done now. A PVC is only created and copied to if an input links from an output. Feel free to reopen if there's more to do here!

@ghost ghost closed this as completed Apr 9, 2020
piyush-garg pushed a commit to piyush-garg/pipeline that referenced this issue May 22, 2020
This issue was closed.