Support artifact outputs #92
Conversation
Hi @Tomcli, when I run the pipeline, I get the following output:
tkn pipelinerun logs artifact-location-pipeline-run-56pcj
:
task foo has failed: Missing or invalid Task tekton-pipelines/foo: Pod "artifact-location-pipeline-run-56pcj-foo-wqbwh-pod-ktjck" is invalid: [spec.volumes[6].name: Invalid value: "out_art": a DNS-1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name', or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?'), spec.containers[0].volumeMounts[0].name: Not found: "out_art", spec.containers[1].volumeMounts[0].name: Not found: "out_art"]
pod for taskrun artifact-location-pipeline-run-56pcj-foo-wqbwh not available yet
TaskRun artifact-location-pipeline-run-56pcj-foo-wqbwh has failed
When I change the volume name out_art in the pipeline.yaml to output-artifact, I do not get the error, but the pipeline never terminates.
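It looks like the compiler needs to sanitize output names into valid DNS-1123 labels before using them as volume names. A rough sketch of the kind of helper that would do it (the function name and regex are illustrative, not the actual kfp code):

```python
import re

def sanitize_k8s_name(name: str) -> str:
    # DNS-1123 labels allow only lowercase alphanumerics and '-',
    # and must start and end with an alphanumeric character.
    name = re.sub(r"[^a-z0-9-]", "-", name.lower())
    return name.strip("-")

assert sanitize_k8s_name("out_art") == "out-art"
```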
Thanks, I updated the test to use a real example since most of the kfp unit tests don't run as a pipeline. The goal for this PR is to behave like Argo, where all the output parameters are treated as artifacts and stored in object storage. With this example, you should be able to see the artifacts show up in Minio.
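For reference, a minimal pipeline of the shape this example exercises might look like the sketch below (the op name, image, and output path are placeholders, not the exact test pipeline):

```python
import kfp.dsl as dsl

@dsl.pipeline(
    name="artifact-output-demo",
    description="An op whose file output is stored in object storage as an artifact.",
)
def artifact_pipeline():
    # each file_outputs entry is an output parameter that the compiler
    # also uploads to object storage, mirroring Argo's behavior
    dsl.ContainerOp(
        name="foo",
        image="busybox",
        command=["sh", "-c", "echo hello > /tmp/output.txt"],
        file_outputs={"output": "/tmp/output.txt"},
    )
```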
@@ -45,37 +49,40 @@ spec:
        valueFrom:
          secretKeyRef:
            key: accesskey
-           name: minio
+           name: $(inputs.params.secret_name)
Hi Tommy, this seems to not work out-of-the-box. The pipeline does not show any progress, there are no logs, and nothing is stored in Minio.
$ tkn pipeline start custom-artifact-location-pipeline --showlog
:
? Value for param `secret_name` of type `string`? (Default is `mlpipeline-minio-artifact`) mlpipeline-minio-artifact
? Value for param `tag` of type `string`? (Default is `1.31.0`) 1.31.0
? Value for param `namespace` of type `string`? (Default is `kubeflow`) kubeflow
? Value for param `bucket` of type `string`? (Default is `mlpipeline`) mlpipeline
Pipelinerun started: custom-artifact-location-pipeline-run-6mcxt
Waiting for logs to be available...
^C
... and later ...
$ tkn pipeline logs custom-artifact-location-pipeline --last
:
Pipeline still running ...
No logs found
But when I change it back to:
- name: AWS_ACCESS_KEY_ID
value: minio
- name: AWS_SECRET_ACCESS_KEY
value: minio123
Then it works and I get this output:
? Value for param `secret_name` of type `string`? (Default is `mlpipeline-minio-artifact`) mlpipeline-minio-artifact
? Value for param `tag` of type `string`? (Default is `1.31.0`) 1.31.0
? Value for param `namespace` of type `string`? (Default is `kubeflow`) kubeflow
? Value for param `bucket` of type `string`? (Default is `mlpipeline`) mlpipeline
Pipelinerun started: custom-artifact-location-pipeline-run-d2x6l
Waiting for logs to be available...
[foo : copy-artifacts] Added `storage` successfully.
[foo : copy-artifacts] `/tekton/results/output` -> `storage/mlpipeline/runs/custom-artifact-location-pipeline-run-d2x6l/custom-artifact-location-pipeline-run-d2x6l-foo-2v5gf-pod-86ll6/output.txt`
[foo : copy-artifacts] Total: 0 B, Transferred: 6 B, Speed: 565 B/s
...along with the file showing up in Minio.
For this pipeline, you have to run in the same namespace as kfp, so by default you should run it in the kubeflow namespace. If you want to run in a different namespace, you have to create a secret with the S3 access key and secret key, using the same secret_name as in the pipeline parameter.
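For example, creating that secret with the official Kubernetes Python client could look like the sketch below (the namespace and credential values are just the defaults from this thread; adjust them for your cluster):

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# The secret name must match the pipeline's `secret_name` parameter,
# and the keys must match what the compiled pipeline reads.
secret = client.V1Secret(
    metadata=client.V1ObjectMeta(name="mlpipeline-minio-artifact"),
    string_data={"accesskey": "minio", "secretkey": "minio123"},
)
v1.create_namespaced_secret(namespace="default", body=secret)
```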
I am thinking of a general integration test suite; it would be great if we could run this (and all other) pipelines as-is. In order to use Tekton I am using a different default namespace. Can we just use the minio and minio123 credentials, since they are not really secret?
The DSL is designed to take the credentials from a secret; even the default setting reads from a secret. Is it possible for you to create/copy the mlpipeline-minio-artifact secret from kubeflow to your default namespace?
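Copying it over is a short operation with the Kubernetes Python client; a sketch, assuming the target namespace is default:

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Read the secret from kubeflow and re-create it in the target namespace.
# Building fresh metadata drops resourceVersion/uid so the create call succeeds.
src = v1.read_namespaced_secret("mlpipeline-minio-artifact", "kubeflow")
src.metadata = client.V1ObjectMeta(name="mlpipeline-minio-artifact", namespace="default")
v1.create_namespaced_secret(namespace="default", body=src)
```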
For automation this would not be practical. Maybe we can run all pipelines in the kubeflow namespace and clean up afterwards.
A lot of CI tests in kubeflow assume the Kubernetes namespace is set up with all the necessary dependencies, such as the gcp secret, image pull secret, service account, etc. Probably we should create some sort of CI setup script for Tekton as well. Many of the design decisions I made are meant to align with how Argo behaves, so kfp can reuse the same setup for Tekton.
secret_access_key = storage_location.get("secretKeySecret", {"name": "mlpipeline-minio-artifact", "key": "secretkey"})
bucket = storage_location.get("bucket", "mlpipeline")
copy_artifacts_step = {
    'image': 'minio/mc',
This block is growing really large; should we move it into a separate function/method? Maybe we could make copy_artifacts_step into a Python template string. In the future we could keep all the highly descriptive code snippets (YAML with a few parameters) in a separate folder, as opposed to mixing them with the Python code. We could track that in a separate issue :-)
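A rough sketch of the template-string idea (the step fields and mc commands here are illustrative placeholders, not the actual compiler output):

```python
import string
import yaml

COPY_ARTIFACTS_TEMPLATE = string.Template("""
name: copy-artifacts
image: minio/mc
script: |
  mc config host add storage http://minio-service.$namespace:9000 $access_key $secret_key
  mc cp /tekton/results/$output storage/$bucket/runs/$run_name/$pod_name/$output_file
""")

def build_copy_artifacts_step(**params):
    # substitute() raises KeyError on a missing parameter, which
    # catches template/caller drift early
    return yaml.safe_load(COPY_ARTIFACTS_TEMPLATE.substitute(**params))
```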
I agree; there are big chunks of code templates introduced for resourceOp and artifacts because we need an extra step/task to support these additional features. There's still some merging happening for the volume and volume snapshot ops, along with some new changes to resource op that Vincent made yesterday. I can open an issue to refactor this file and compiler.py.
@dsl.pipeline(
    name="custom_artifact_location_pipeline",
    description="""A pipeline to demonstrate how to configure the artifact
    location for all the ops in the pipeline.""",
Maybe we can add another comment here that this assumes the pipeline will be deployed/run in the kubeflow namespace?
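For example (the docstring wording is just a suggestion):

```python
import kfp.dsl as dsl

@dsl.pipeline(
    name="custom_artifact_location_pipeline",
    description="""A pipeline to demonstrate how to configure the artifact
    location for all the ops in the pipeline. Note: this assumes the
    pipeline is deployed/run in the kubeflow namespace, or in a namespace
    that has a copy of the mlpipeline-minio-artifact secret.""",
)
def custom_artifact_location_pipeline():
    ...
```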
GREAT! Thanks @Tomcli.
/lgtm
/assign @animeshsingh
@ckadner @animeshsingh Let me know if you have any questions on this PR.
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: animeshsingh

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
/lgtm
Which issue is resolved by this Pull Request:
Related #64
Description of your changes:
After several experiments and discussion in #64, I decided to use the minio/mc image to copy artifacts to the object storage. Since the artifact feature requires the kfp setup, I made it optional for now, until we are running Tekton pipelines through kfp's API.
Also, since several Argo variables are not supported in Tekton, we have to work around this by getting the pipelinerun and pod names from the metadata and passing them as environment variables. We also moved the Argo variable parsing to right before we export the YAML, to avoid substitution conflicts.
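As a sketch of that workaround, the env entries attached to the generated step can read pod metadata through the Kubernetes downward API instead of Argo's workflow variables (the label key below is Tekton's standard pipelinerun pod label; the env var names are illustrative):

```python
# Env entries for the copy-artifacts step: values are resolved by the
# kubelet from pod metadata, replacing Argo's {{workflow.name}}-style variables.
ARTIFACT_ENV = [
    {
        "name": "PIPELINERUN",
        "valueFrom": {
            "fieldRef": {"fieldPath": "metadata.labels['tekton.dev/pipelineRun']"}
        },
    },
    {
        "name": "PODNAME",
        "valueFrom": {"fieldRef": {"fieldPath": "metadata.name"}},
    },
]
```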
If we agree with this approach, I will also implement artifact inputs the same way in a follow-up PR.
Environment tested:
- Python version (python --version): 3.6.4
- Tekton version (tkn version): 0.11
- Kubernetes version (kubectl version): 1.15
- OS (/etc/os-release):