Support artifact outputs #92

Merged: 13 commits merged into kubeflow:master from the output-artifact branch on Apr 21, 2020

Conversation

@Tomcli Tomcli (Member) commented Apr 10, 2020

Which issue is resolved by this Pull Request:
Related #64

Description of your changes:
After several experiments and the discussion in #64, I decided to use the minio/mc image to copy artifacts to the object storage. Since the artifact feature requires the KFP setup, I made it optional for now until we are running Tekton pipelines using KFP's API.

Also, since several Argo variables are not supported in Tekton, we have to work around this by getting the pipelinerun and pod names from the metadata and passing them as environment variables. We also move the Argo variable parsing to right before we export the YAML format, to avoid substitution conflicts.
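For illustration, a minimal sketch (not the exact compiler output; the env var names and label key are assumptions) of how a step can receive these names through the Kubernetes downward API, since Argo's {{pod.name}}-style variables are not available in Tekton:

copy_artifacts_env = [
    # Tekton labels each pod with its owning PipelineRun; expose it as an env var.
    {'name': 'PIPELINERUN', 'valueFrom': {'fieldRef': {
        'fieldPath': "metadata.labels['tekton.dev/pipelineRun']"}}},
    # The pod name itself comes straight from the pod metadata.
    {'name': 'PODNAME', 'valueFrom': {'fieldRef': {
        'fieldPath': 'metadata.name'}}},
]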

If we agree with this approach, I will also implement artifact inputs the same way in a follow-up PR.

Environment tested:

  • Python Version (use python --version): 3.6.4
  • Tekton Version (use tkn version): 0.11
  • Kubernetes Version (use kubectl version): 1.15
  • OS (e.g. from /etc/os-release):

@kubeflow-bot commented: This change is Reviewable

@ckadner ckadner (Member) left a comment

Hi @Tomcli, when I run the pipeline, I get the following output:

tkn pipelinerun logs artifact-location-pipeline-run-56pcj:

task foo has failed: Missing or invalid Task tekton-pipelines/foo: Pod "artifact-location-pipeline-run-56pcj-foo-wqbwh-pod-ktjck" is invalid: [spec.volumes[6].name: Invalid value: "out_art": a DNS-1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name',  or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?'), spec.containers[0].volumeMounts[0].name: Not found: "out_art", spec.containers[1].volumeMounts[0].name: Not found: "out_art"]
pod for taskrun artifact-location-pipeline-run-56pcj-foo-wqbwh not available yet
TaskRun artifact-location-pipeline-run-56pcj-foo-wqbwh has failed

When I change the volume name out_art in the pipeline.yaml to output-artifact, I do not get the error, but the pipeline never terminates.
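For reference, a minimal standalone sketch (not code from this PR) of the kind of sanitization that turns a name like out_art into a valid DNS-1123 label:

import re

def sanitize_k8s_name(name):
    # Lower-case, replace invalid characters with '-', collapse repeats,
    # and trim leading/trailing '-' so the result is a valid DNS-1123 label.
    name = re.sub('[^-a-z0-9]+', '-', name.lower())
    return re.sub('-+', '-', name).strip('-')

print(sanitize_k8s_name('out_art'))  # -> 'out-art'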

Review threads on sdk/README.md and sdk/python/tests/compiler/testdata/artifact_location.py were marked as outdated/resolved.
@ckadner ckadner mentioned this pull request Apr 15, 2020
@Tomcli Tomcli (Member, Author) commented Apr 15, 2020

> Hi @Tomcli, when I run the pipeline, I get the following output: [... Pod invalid: volume name "out_art" is not a valid DNS-1123 label ...] When I change the volume name out_art in the pipeline.yaml to output-artifact I do not get the error but the pipeline never terminates.

Thanks, I updated the test to use a real example, since most of the KFP unit tests don't run as a pipeline. The goal for this PR is to behave like Argo, where all the output parameters are treated as artifacts and stored in object storage.
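As a rough illustration of that behavior, the generated copy step could look something like the sketch below (the minio/mc image matches this PR, but the endpoint, env var names, and paths are assumptions based on the log output in this thread, not the compiler's verbatim output):

copy_artifacts_step = {
    'name': 'copy-artifacts',
    'image': 'minio/mc',
    'script': '#!/usr/bin/env sh\n'
              # Register the object storage endpoint under the alias "storage".
              'mc config host add storage http://minio-service.kubeflow:9000 '
              '$AWS_ACCESS_KEY_ID $AWS_SECRET_ACCESS_KEY\n'
              # Copy the Tekton result file into the mlpipeline bucket.
              'mc cp /tekton/results/output '
              'storage/mlpipeline/runs/$PIPELINERUN/$PODNAME/output.txt\n',
}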

@Tomcli Tomcli (Member, Author) commented Apr 15, 2020

With this example, you should be able to see that Minio's mlpipeline bucket is storing all the output parameters.

@@ -45,37 +49,40 @@ spec:
          valueFrom:
            secretKeyRef:
              key: accesskey
-             name: minio
+             name: $(inputs.params.secret_name)
@ckadner ckadner (Member) commented Apr 15, 2020

Hi Tommy,

this does not seem to work out of the box. The pipeline does not show any progress, there are no logs, and nothing is stored in Minio.

$ tkn pipeline start custom-artifact-location-pipeline --showlog:

? Value for param `secret_name` of type `string`? (Default is `mlpipeline-minio-artifact`) mlpipeline-minio-artifact
? Value for param `tag` of type `string`? (Default is `1.31.0`) 1.31.0
? Value for param `namespace` of type `string`? (Default is `kubeflow`) kubeflow
? Value for param `bucket` of type `string`? (Default is `mlpipeline`) mlpipeline
Pipelinerun started: custom-artifact-location-pipeline-run-6mcxt
Waiting for logs to be available...
^C

... and later ...
$ tkn pipeline logs custom-artifact-location-pipeline --last:

Pipeline still running ...
No logs found

But when I change it back to

    - name: AWS_ACCESS_KEY_ID
      value: minio
    - name: AWS_SECRET_ACCESS_KEY
      value: minio123

Then it works and I get this output:

? Value for param `secret_name` of type `string`? (Default is `mlpipeline-minio-artifact`) mlpipeline-minio-artifact
? Value for param `tag` of type `string`? (Default is `1.31.0`) 1.31.0
? Value for param `namespace` of type `string`? (Default is `kubeflow`) kubeflow
? Value for param `bucket` of type `string`? (Default is `mlpipeline`) mlpipeline
Pipelinerun started: custom-artifact-location-pipeline-run-d2x6l
Waiting for logs to be available...

[foo : copy-artifacts] Added `storage` successfully.
[foo : copy-artifacts] `/tekton/results/output` -> `storage/mlpipeline/runs/custom-artifact-location-pipeline-run-d2x6l/custom-artifact-location-pipeline-run-d2x6l-foo-2v5gf-pod-86ll6/output.txt`
[foo : copy-artifacts] Total: 0 B, Transferred: 6 B, Speed: 565 B/s

...along with the file showing up in Minio.

Member Author

For this pipeline, you have to run in the same namespace as KFP, so by default you should run it in the kubeflow namespace. If you want to run in a different namespace, you have to create a secret with the S3 access key and secret key, using the same secret_name as in the parameter.

Member

I am thinking of a general integration test suite; it would be great if we could run this (and all other) pipelines as is. In order to use Tekton I am using a different default namespace. Can we just use the minio and minio123 credentials, since they are not really secret?

@Tomcli Tomcli (Member, Author) commented Apr 16, 2020

The DSL is designed to take the credentials from a secret; even the default setting reads from a secret. Is it possible for you to create/copy the mlpipeline-minio-artifact secret from kubeflow to your default namespace?
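If copying the secret is an option, here is a minimal sketch (assuming the kubernetes Python client and a working kubeconfig; the target namespace "default" is just an example) of copying mlpipeline-minio-artifact from kubeflow into another namespace:

from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Read the secret from the kubeflow namespace and re-create it elsewhere,
# resetting the metadata so server-assigned fields are not carried over.
secret = v1.read_namespaced_secret("mlpipeline-minio-artifact", "kubeflow")
secret.metadata = client.V1ObjectMeta(name="mlpipeline-minio-artifact",
                                      namespace="default")
v1.create_namespaced_secret("default", secret)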

Member

For automation this would not be practical. Maybe we can run all pipelines in the kubeflow namespace and clean up afterwards.

Member Author

A lot of CI tests in kubeflow assume the Kubernetes namespace is set up with all the necessary dependencies, such as the GCP secret, image pull secret, service account, etc. We should probably create some sort of CI setup script for Tekton as well. Many of the design decisions I made are meant to align with how Argo behaves, so KFP can reuse the same setup for Tekton.

Review threads on sdk/python/tests/compiler/testdata/artifact_location.py (outdated) and sdk/README.md were marked as resolved.
secret_access_key = storage_location.get("secretKeySecret", {"name": "mlpipeline-minio-artifact", "key": "secretkey"})
bucket = storage_location.get("bucket", "mlpipeline")
copy_artifacts_step = {
'image': 'minio/mc',
Member

This block is growing really large; should we move it into a separate function/method? Maybe we could also make copy_artifacts_step a Python template string. In the future we could keep all the highly descriptive code snippets (YAML with a few parameters) in a separate folder, as opposed to mixing them with the Python code. We could track that in a separate issue :-)
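One possible shape for that suggestion, as a hedged sketch (the function name, parameters, and YAML contents are illustrative, not the actual refactoring): keep the step as a YAML template string and load it with PyYAML after filling in the parameters.

import textwrap
import yaml  # PyYAML

COPY_ARTIFACTS_TEMPLATE = textwrap.dedent('''\
    name: copy-artifacts
    image: minio/mc
    script: |
      mc config host add storage {endpoint} $AWS_ACCESS_KEY_ID $AWS_SECRET_ACCESS_KEY
      mc cp {source} storage/{bucket}/{destination}
    ''')

def copy_artifacts_step(endpoint, bucket, source, destination):
    # Render the YAML snippet with concrete values and return the step dict
    # that could be appended to the task's list of steps.
    return yaml.safe_load(COPY_ARTIFACTS_TEMPLATE.format(
        endpoint=endpoint, bucket=bucket, source=source, destination=destination))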

Member Author

I agree, there are big chunks of code templates introduced for ResourceOp and artifacts because we need an extra step/task to support these additional features. There is still some merging happening for the volume and volume snapshot ops, along with some new changes to the resource op that Vincent did yesterday. I can open an issue to refactor this file and compiler.py.


@dsl.pipeline(
    name="custom_artifact_location_pipeline",
    description="""A pipeline to demonstrate how to configure the artifact
    location for all the ops in the pipeline.""",
Member

Maybe we can add another comment here noting that this assumes the pipeline will be deployed/run in the kubeflow namespace?
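For illustration only (the wording and the simplified signature are assumptions, not from this PR), the note could be folded into the pipeline's description like this:

from kfp import dsl

@dsl.pipeline(
    name="custom_artifact_location_pipeline",
    description="""A pipeline to demonstrate how to configure the artifact
    location for all the ops in the pipeline. Note: this example assumes it is
    deployed and run in the kubeflow namespace, where the
    mlpipeline-minio-artifact secret already exists.""",
)
def custom_artifact_location_pipeline():
    pass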

@ckadner ckadner (Member) left a comment

GREAT! Thanks @Tomcli.

/lgtm
/assign @animeshsingh

@ckadner ckadner removed their assignment Apr 16, 2020
@Tomcli Tomcli (Member, Author) commented Apr 20, 2020

@ckadner @animeshsingh
This PR is ready to go. The reason I picked the minio image is that the gsutil image can't work with all types of object storage; the details are in this issue: #64 (comment)

Let me know if you have any questions about this PR.

@Tomcli Tomcli mentioned this pull request Apr 21, 2020
@animeshsingh (Collaborator)

/approve

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: animeshsingh


@animeshsingh (Collaborator)

/lgtm

@k8s-ci-robot k8s-ci-robot merged commit 3d57d61 into kubeflow:master Apr 21, 2020
@Tomcli Tomcli deleted the output-artifact branch May 20, 2020 17:20
HumairAK referenced this pull request in red-hat-data-services/data-science-pipelines-tekton May 16, 2023