Skip to content
This repository has been archived by the owner on Oct 23, 2023. It is now read-only.

Cache eviction for single execution in datacatalog and flyteadmin #318

Merged
merged 6 commits into from
Nov 2, 2022

Conversation

MorpheusXAUT
Copy link
Contributor

@MorpheusXAUT MorpheusXAUT commented Sep 12, 2022

TL;DR

This PR prepares datacatalog and flyteadmin endpoints for cache eviction changes, overwriting existing artifacts for a single execution.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

The new UpdateArtifact RPC endpoint for the datacatalog service allows for an existing Artifact to be updated. At the moment, this update can only contain a (complete) replacement of the associated ArtifactData.
Artifacts can be identified using their ID or tag.

The new skip_cache flag in flyteadmin's ExecutionSpec, LaunchPlanSpec and WorkflowExecutionConfig are used to implement the cache eviction override for a single execution.

Tracking Issue

flyteorg/flyte#2867

Follow-up issue

NA

@codecov
Copy link

codecov bot commented Sep 27, 2022

Codecov Report

Merging #318 (52483c7) into master (cdbd93e) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #318   +/-   ##
=======================================
  Coverage   75.46%   75.46%           
=======================================
  Files          18       18           
  Lines        1174     1174           
=======================================
  Hits          886      886           
  Misses        237      237           
  Partials       51       51           
Flag Coverage Δ
unittests 75.11% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
clients/go/admin/authtype_enumer.go 27.27% <ø> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@MorpheusXAUT MorpheusXAUT changed the title Added datacatalog endpoint for updating artifacts Cache eviction for single execution in datacatalog and flyteadmin Sep 28, 2022
@MorpheusXAUT
Copy link
Contributor Author

Pushed additional IDL changes for flyteadmin, title/initial description have been adapted accordingly.

@MorpheusXAUT MorpheusXAUT force-pushed the datacatalog-update-artifact branch from 8da6991 to 7bb56c0 Compare October 3, 2022 07:37
@katrogan
Copy link
Contributor

katrogan commented Oct 4, 2022

hey @MorpheusXAUT this change looks good but I'm curious if you're planning on implementing the back-end changes to also include cache eviction and skipping the cache in the same PR per component. maybe it would make more sense to split these changes out individually (eviction & skipping)

also, before merging this would it be possible to have the other back-end PRs prepared and tested end to end to reduce churn in updating flyteidl (especially when it comes to potentially having to deprecate proto fields if the message structure ends up changing over the course of your testing)

@MorpheusXAUT
Copy link
Contributor Author

Hi @katrogan , thanks for taking a look!

hey @MorpheusXAUT this change looks good but I'm curious if you're planning on implementing the back-end changes to also include cache eviction and skipping the cache in the same PR per component. maybe it would make more sense to split these changes out individually (eviction & skipping)

Yeah, I was planning on making a PR about skipping cache only (and everything that's required for that) and open another one for the second part of the implementing (evicting cached data).

also, before merging this would it be possible to have the other back-end PRs prepared and tested end to end to reduce churn in updating flyteidl (especially when it comes to potentially having to deprecate proto fields if the message structure ends up changing over the course of your testing)

Sure thing! I'm mostly done with the other PRs, currently finishing up flytectl and flyteconsole client implementations. The rest of the backend implementation is done, I'll open PRs for those tomorrow (using replace to our forks in go.mod to it'll build until this PR is merged).

I'll mention this PR in the others as well so you'll have a quick overview of all changes once they're open.

@MorpheusXAUT
Copy link
Contributor Author

@katrogan All other PRs have been submitted, created as drafts for now since they depend on this PR/each other.

Pushed some minor changes to protos as well today as they were required for the flytectl changes.

Copy link
Contributor

@hamersaw hamersaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments on the flytepropeller PR - Basically, I'm concerned about using skipCache for the naming of this feature. Really we are skipping the cache lookup but overwriting the existing cache values - would overwriteCache be a better indicator of the functionality? I think leaving CACHE_SKIPPED status during the lookup is correct and the rest of this PR lgtm. Open for discussion on this.

Nick Müller added 6 commits November 2, 2022 14:08
Existing artifacts can have their associated ArtifactData overwritten

Signed-off-by: Nick Müller <[email protected]>
@MorpheusXAUT MorpheusXAUT force-pushed the datacatalog-update-artifact branch from 1172311 to 52483c7 Compare November 2, 2022 15:12
@hamersaw hamersaw merged commit 446b575 into flyteorg:master Nov 2, 2022
@MorpheusXAUT MorpheusXAUT deleted the datacatalog-update-artifact branch November 2, 2022 22:02
bstadlbauer pushed a commit to bstadlbauer/flyteidl that referenced this pull request Nov 14, 2022
…yteorg#318)

* Added datacatalog endpoint for updating artifacts
Existing artifacts can have their associated ArtifactData overwritten

Signed-off-by: Nick Müller <[email protected]>

* datacatalog.UpdateArtifact returns ArtifactID

Signed-off-by: Nick Müller <[email protected]>

* Added skip_cache override to ExecutionSpec, LaunchPlanSpec and WorkflowExecutionConfig

Signed-off-by: Nick Müller <[email protected]>

* Added CatalogCacheStatus for skipped cache lookups

Signed-off-by: Nick Müller <[email protected]>

* Added skip_cache flag to ExecutionRelaunchRequest

Signed-off-by: Nick Müller <[email protected]>

* Renamed skip_cache flag to overwrite_cache

Signed-off-by: Nick Müller <[email protected]>

Signed-off-by: Nick Müller <[email protected]>
Signed-off-by: Bernhard <[email protected]>
eapolinario pushed a commit that referenced this pull request Sep 8, 2023
* Added datacatalog endpoint for updating artifacts
Existing artifacts can have their associated ArtifactData overwritten

Signed-off-by: Nick Müller <[email protected]>

* datacatalog.UpdateArtifact returns ArtifactID

Signed-off-by: Nick Müller <[email protected]>

* Added skip_cache override to ExecutionSpec, LaunchPlanSpec and WorkflowExecutionConfig

Signed-off-by: Nick Müller <[email protected]>

* Added CatalogCacheStatus for skipped cache lookups

Signed-off-by: Nick Müller <[email protected]>

* Added skip_cache flag to ExecutionRelaunchRequest

Signed-off-by: Nick Müller <[email protected]>

* Renamed skip_cache flag to overwrite_cache

Signed-off-by: Nick Müller <[email protected]>

Signed-off-by: Nick Müller <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants