Skip to content
This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

pytorch plugin implementation #85

Merged
merged 6 commits into from
May 28, 2020
Merged

Conversation

igorvalko
Copy link
Contributor

@igorvalko igorvalko commented May 8, 2020

TL;DR

Pytorch plugin for Flyte that leverages https://github.com/kubeflow/pytorch-operator to run distributed pytorch jobs on k8s.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

Pytorch plugin for Flyte that leverages https://github.com/kubeflow/pytorch-operator to run distributed pytorch jobs on k8s.
Related PRs:
flyteorg/flyteidl#61
flyteorg/flytekit#112

Tracking Issue

NA

Follow-up issue

NA

go.mod Show resolved Hide resolved
@kumare3
Copy link
Contributor

kumare3 commented May 18, 2020

This is great stuff @igorvalko, once you change the flyteidl, you have a +1 from me

workers := pytorchTaskExtraArgs.GetWorkers()

jobSpec := ptOp.PyTorchJobSpec{
TTLSecondsAfterFinished: nil, //60, // cleanup in 60 seconds after the job completion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya you dont need to cleanup, we will automatically

"github.com/lyft/flyteplugins/go/tasks/logs"
logUtils "github.com/lyft/flyteidl/clients/go/coreutils/logs"

//commonOp "github.com/kubeflow/common/pkg/apis/common/v1" // switch to real 'common' once https://github.com/kubeflow/pytorch-operator/issues/263 resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting call out. I think after this for all kubeflow operators, we can just have a flyte-kubeflow-commons where all status checking can be done inline

@kumare3 kumare3 self-requested a review May 28, 2020 20:45
@codecov-commenter
Copy link

Codecov Report

Merging #85 into master will increase coverage by 0.94%.
The diff coverage is 77.68%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #85      +/-   ##
==========================================
+ Coverage   54.97%   55.92%   +0.94%     
==========================================
  Files          91       93       +2     
  Lines        4660     4787     +127     
==========================================
+ Hits         2562     2677     +115     
- Misses       1821     1827       +6     
- Partials      277      283       +6     
Impacted Files Coverage Δ
...o/tasks/plugins/k8s/kfoperators/pytorch/pytorch.go 77.68% <77.68%> (ø)
go/tasks/plugins/hive/config/config.go 27.27% <0.00%> (ø)
go/tasks/pluginmachinery/utils/error_collection.go 100.00% <0.00%> (ø)
go/tasks/plugins/hive/execution_state.go 70.61% <0.00%> (+0.12%) ⬆️
...tasks/pluginmachinery/flytek8s/container_helper.go 83.33% <0.00%> (+0.28%) ⬆️
go/tasks/plugins/hive/config/config_flags.go 41.66% <0.00%> (+2.53%) ⬆️
go/tasks/pluginmachinery/utils/template.go 71.25% <0.00%> (+9.05%) ⬆️
go/tasks/pluginmachinery/utils/transformers.go 100.00% <0.00%> (+50.00%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d46c654...fef8dcd. Read the comment docs.

Copy link
Contributor

@kumare3 kumare3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kumare3 kumare3 merged commit 74715c3 into flyteorg:master May 28, 2020
eapolinario pushed a commit that referenced this pull request Sep 6, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants