[Proposal]: Flyte System Tags and metadata #3320

kumare3 · 2023-02-07T01:16:31Z

This RFC proposes a way to add execution tags and a description.
This would help the user in multiple ways. Please read the RFC for more context.

Signed-off-by: Ketan Umare <[email protected]>

goyalankit

very useful feature!

goyalankit · 2023-02-07T22:58:10Z

rfc/system/0001-flyte-execution-tags.md

+#### Approach 2: Certain label keys are treated special
+ - “group” will group everything
+ - “experiment” will also group everything with higher priority. 
+ - “name” will override the execution id with the name?


Are users allowed to change the labels? If they are, then overriding might be an issue, since you might have already fired async events to external systems. So I think it might be useful to maintain executionID as an identifier that can't be modified once the execution has been created.

Alternatively, this could be an alias to the execution ID rather than overriding?

executionID cannot be changed: it is immutable and unique per project/domain.
name is just an alias. I will update the doc to reflect this.

But I do like the idea of immutable labels as well: once added, you cannot change them.

So we'll support both mutable and immutable labels?

flixr

Nice! Mostly sounds good to me.
Just that I would go with the name tags on the flyte (cli) level.
Querying could still be done on k8s labels as well...

flixr · 2023-02-13T16:59:37Z

rfc/system/0001-flyte-execution-tags.md

+A workflow or task can be executed using
+
+```bash
+pyflyte run --remote --labels k:v --labels k1:v1 test.py wf --input1=10


I would probably call the arg here --tag to not confuse this with kubernetes labels.
And then probably assign the tags to k8s annotations.

Great point, but the RPC field is sadly already called label, and these will become k8s labels.

fg91 · 2023-03-30T17:35:03Z

rfc/system/0001-flyte-execution-tags.md

+
+
+## 7 Potential Impact and Dependencies
+We this this is one of the most requested features in Flyte and will solve


Suggested change

-We this this is one of the most requested features in Flyte and will solve
+This is one of the most requested features in Flyte and will solve

fg91 · 2023-03-30T17:37:25Z

rfc/system/0001-flyte-execution-tags.md

+available on each execution. The users are allowed to filter an exection simply
+by clicking on a label and then all executions are filtered by that label. 
+
+#### Approach 2: Certain label keys are treated special


If this approach is chosen, I wonder whether it would be nicer for the user to do

`pyflyte run --remote --group foo --experiment bar ...`

instead of

`pyflyte run --remote --labels group:foo --labels experiment:bar ...`

This doesn't mean that under the hood the labels mechanism couldn't be used.

I'm somewhat opposed to this as it could be confusing to users as to what is a label vs what is a keyword cli argument 🤔

Or do we wanna create group and experiment as CLI arguments and introduce them as a concept? 🤔

I personally prefer option 1: treat all labels the same way. Users might not want to follow the categories we deem sensible. Experiment tracking servers like Mlflow or Wandb, which also have such a tagging mechanism, simply allow users to assign arbitrary tags. I would argue that ML engineers are used to this and we should provide the same UX without imposing special naming conventions.

Only exception: execution name
I find it really helpful to have the pod names include customizable identifiers.
We have a registration script, similar to pyflyte run, which has an --execution_name arg. The user-provided value is appended with a random uuid, as is currently already chosen for the execution ids, and the result is checked against the execution name regex again and then passed to FlyteRemote.execute(execution_name=...) (already supported, see here). So I wouldn't handle the execution name with a pod label but with the pod's metadata.name.

This comment is another argument for not treating execution names with labels but instead metadata.name since I agree that tags need to be mutable.
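The execution-name scheme described in the comment above could be sketched roughly as follows. This is only an illustration, not the registration script itself: `build_execution_name`, the 8-character suffix, the 63-character cap, and the regex are all assumptions made for the example (the pattern is a DNS-label-style stand-in and not necessarily the one Flyte enforces).

```python
import re
import uuid

# Assumption: a DNS-label-style pattern standing in for the execution name regex.
NAME_PATTERN = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")
# Assumption: an upper bound commonly used for Kubernetes object names.
MAX_LENGTH = 63


def build_execution_name(prefix: str, suffix_length: int = 8) -> str:
    """Append a random suffix to a user-provided prefix and validate the result."""
    suffix = uuid.uuid4().hex[:suffix_length]  # lowercase hex characters
    name = f"{prefix}-{suffix}"
    # Re-check the combined name, since the suffix makes it longer than the prefix.
    if len(name) > MAX_LENGTH or NAME_PATTERN.fullmatch(name) is None:
        raise ValueError(f"invalid execution name: {name!r}")
    return name
```

The returned value is what would then be passed as `execution_name` to `FlyteRemote.execute`, as mentioned in the comment.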

@bstadlbauer @fg91 @elibixby @flixr @goyalankit Some questions for you

Do you prefer key-value pair tags or tags that only have a key?

Should we add tags to Kubernetes labels?

Currently, the execution spec (with labels) is serialized to bytes and stored in the execution table, so it's impossible to add, delete, or update tags. And if we use the k8s client to filter flyteworkflow (CRD) objects by labels, we cannot search for an execution after the CR is deleted.

I have a PR that adds a tags table. It allows us to easily add, update, and delete tags, and even attach tags to a task, workflow, or project. However, it doesn't use key-value pair tags for now. If we decide to use key-value pair tags, I just need to add a new column to the tags table and update the query. I'd like to know your thoughts first.

Btw, the current implementation works with both MySQL and Postgres.
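To make the tags-table idea concrete, here is a minimal sketch of key-only tags stored in their own table. It is not the schema from the PR: the table name `execution_tags`, its columns, and the helper functions are hypothetical, and SQLite is used only so the example is self-contained (the comment above mentions MySQL and Postgres).

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE execution_tags (
        execution_id TEXT NOT NULL,
        tag          TEXT NOT NULL,
        PRIMARY KEY (execution_id, tag)
    )
    """
)


def add_tag(execution_id: str, tag: str) -> None:
    # INSERT OR IGNORE is SQLite syntax; it makes re-adding a tag a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO execution_tags (execution_id, tag) VALUES (?, ?)",
        (execution_id, tag),
    )


def remove_tag(execution_id: str, tag: str) -> None:
    conn.execute(
        "DELETE FROM execution_tags WHERE execution_id = ? AND tag = ?",
        (execution_id, tag),
    )


def executions_with_tag(tag: str) -> list:
    rows = conn.execute(
        "SELECT execution_id FROM execution_tags WHERE tag = ? ORDER BY execution_id",
        (tag,),
    )
    return [row[0] for row in rows]
```

Because tags live outside the serialized execution spec, they can be added or removed at any point in an execution's lifetime, and the lookup keeps working after the Kubernetes resource is gone.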

My $0.02 is let's keep it simple and support what you call key-only tags.

A person can 'hack' this to resemble key-value if needed (ie 'costcenter-12'), but we don't need to manage that complexity on the back end or in the UI when we get to figuring out how to let folks use tags to sort/group things.
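The 'hack' mentioned above could look like this on the client side. It is a sketch of a naming convention only: `split_tag` is a hypothetical helper, and splitting at the last hyphen is an assumption chosen to fit the 'costcenter-12' example.

```python
from typing import Optional, Tuple


def split_tag(tag: str, sep: str = "-") -> Tuple[str, Optional[str]]:
    """Interpret a key-only tag as a (key, value) pair by convention."""
    key, found, value = tag.rpartition(sep)
    if not found:
        # No separator present: treat the whole tag as a plain key.
        return tag, None
    return key, value
```

The back end still only sees plain strings; any key-value meaning lives entirely in how users choose to name and read their tags.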

In my opinion key-only tags are perfectly fine and what ML engineers are used to from experiment tracking servers

Should we add tags to Kubernetes label

I think being able to add/delete/update tags after the execution has already started or ended is an important feature. User story: an experiment is training/trained really well and I want to mark it for later. This is something that is not known when starting the execution. But updating/deleting/adding tags when the execution is already running would mean that the k8s labels are not in sync with what is stored in the tags table. I'd therefore say that I wouldn't apply the tags as labels to k8s.

Sorry for the late response here but agreed with what's been said above. Key only tags would also solve all our usecases 👍

I think being able to add/delete/update tags after the execution has already started or ended is an important feature
+1 to this and the reasoning of not applying those to k8s

fg91

I like this proposal very much. Currently we use decks to link to Wandb runs corresponding to Flyte executions.
We then use tags in wandb to do the grouping by experiments, tags, ... that you describe in the RFC.
Would love to do this directly in Flyte.

davidmirror-ops · 2023-03-30T18:49:37Z

03-30-2023 Meeting notes:
KU: launchplan name could be the default grouping tag
Tim Sheiner: that's not very prominent right now in the UI
KU: we could add all to system tags but could be an overload
TS: make sure that this proposal is not redundant to the fact users are already naming workflows and tasks
TS: instead of treating this proposal as tags, treat it as a separate...
GG: their use case is running workflows from notebooks (using Quarto/Jupyter for reports)

tsheiner · 2023-03-30T18:54:44Z

Slightly perpendicular to this proposal but related to the notion of making executions easier to identify:

I note that Prefect uses a system of nonsense ids for executions which is much friendlier looking and far easier to remember than Flyte's alphanumeric ids. For example ‘enigmatic-waxbill’ or ‘massive-antelope.’ Would something like this be possible for Flyte?

davidmirror-ops · 2023-04-13T18:18:19Z

04-13-2023 notes: no updates

bstadlbauer

Looks good to me!

There is an ongoing slack conversation related to MySQL tag storage here - not 100% sure what that is about? cc @kumare3 @ByronHsu

fg91 · 2023-05-11T17:11:05Z

For visibility: @kasimiraula proposed that users should be able to create notes for executions, similar to what is currently possible when aborting an execution #3646

kumare3 · 2023-05-25T18:17:30Z

Slightly perpendicular to this proposal but related to the notion of making executions easier to identify:

I note that Prefect uses a system of nonsense ids for executions which is much friendlier looking and far easier to remember than Flyte's alphanumeric ids. For example ‘enigmatic-waxbill’ or ‘massive-antelope.’ Would something like this be possible for Flyte?

I think this is possible and I have thought about it. We should weigh the entropy and the cost of doing this, and then I'm all for it 👍🏽
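The entropy concern can be made concrete with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the word-list sizes, the 36-character alphabet, and the 20-character id length are hypothetical numbers chosen for the example, not Flyte's or Prefect's actual parameters.

```python
import math


def name_space_bits(num_adjectives: int, num_nouns: int) -> float:
    """Entropy in bits of a uniformly random 'adjective-noun' name."""
    return math.log2(num_adjectives * num_nouns)


def id_space_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random fixed-length alphanumeric id."""
    return length * math.log2(alphabet_size)


def collision_probability(bits: float, num_executions: int) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2.0 ** bits
    exponent = -num_executions * (num_executions - 1) / (2.0 * space)
    return -math.expm1(exponent)


# Hypothetical sizes: 1024 adjectives x 1024 nouns vs. a 20-character base-36 id.
friendly_bits = name_space_bits(1024, 1024)
alphanumeric_bits = id_space_bits(36, 20)
friendly_risk = collision_probability(friendly_bits, 1000)
```

With these made-up sizes, a two-word name carries 20 bits against roughly 100 bits for the alphanumeric id, and 1,000 executions in one project/domain would already give a collision chance of more than a third. Friendly names would therefore likely need larger word lists, an extra word, or a short random suffix.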

…ions (#3727)

Signed-off-by: Kevin Su <[email protected]>

fg91

Thanks for incorporating what was discussed in the contributors' syncs. I like that now:

- Tags will not be attached to k8s objects as labels but instead saved in the database, so that they can be modified/deleted during/after the execution.
- All tags are equal and Flyte doesn't impose any special names.
- We have a clear distinction between tags and notes.

Please consider all comments below as nit-picks that you can just resolve in case you don't agree.

fg91 · 2023-07-11T17:11:56Z

rfc/system/0001-flyte-execution-tags.md

+A workflow or task can be executed using
+
+```bash
+pyflyte run --remote --tags '["hello", "world"]' test.py wf --input1=10


Could we do --tag hello --tag world instead of providing a list of tags in string format?

maybe we can support both?

`pyflyte run --remote --tags '["hello", "world"]'`
and
`pyflyte run --remote --tag hello --tag hello`
and
`pyflyte run --remote --tag hello --tags '["key1", "key2"]'`

My opinion about this is not so strong that I'd say we need to support two options in case others prefer --tags '["hello", "world"]' . I personally find lists or jsons in string representation as cli args a bit cumbersome.
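For illustration, accepting both spellings could look like the sketch below. It uses the standard library's `argparse` purely to stay self-contained; `parse_tags` and the `run-sketch` program name are hypothetical, and pyflyte's actual option handling may differ.

```python
import argparse
import json
from typing import List, Sequence


def parse_tags(argv: Sequence[str]) -> List[str]:
    """Collect tags from repeated --tag flags and/or one JSON-list --tags flag."""
    parser = argparse.ArgumentParser(prog="run-sketch")
    # action="append" lets the flag be given several times: --tag a --tag b
    parser.add_argument("--tag", action="append", default=None)
    # type=json.loads turns '["a", "b"]' into a Python list
    parser.add_argument("--tags", type=json.loads, default=None)
    args = parser.parse_args(argv)

    single = args.tag or []
    listed = args.tags or []
    if not isinstance(listed, list) or not all(isinstance(t, str) for t in listed):
        parser.error("--tags must be a JSON list of strings")

    # Merge both sources, dropping duplicates while preserving order.
    return list(dict.fromkeys([*single, *listed]))
```

Supporting both forms costs little in parsing, but it does mean documenting how the two flags combine, which is part of the trade-off raised in this thread.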

Co-authored-by: Fabio M. Graetz, Ph.D. <[email protected]> Signed-off-by: Kevin Su <[email protected]>

Signed-off-by: Kevin Su <[email protected]>

fg91

LG!

kumare3 and others added 2 commits February 6, 2023 17:15

[Proposal]: Flyte System Tags and metadata

5742671

Signed-off-by: Ketan Umare <[email protected]>

added images

bc087ed

Signed-off-by: Ketan Umare <[email protected]>

goyalankit reviewed Feb 7, 2023

flixr reviewed Feb 13, 2023

davidmirror-ops added the rfc label Mar 29, 2023

fg91 reviewed Mar 30, 2023

fg91 previously approved these changes Mar 30, 2023

davidmirror-ops requested review from bstadlbauer and cosmicBboy March 31, 2023 15:44

bstadlbauer previously approved these changes May 11, 2023

fg91 reviewed May 11, 2023

fg91 reviewed May 11, 2023

elibixby reviewed May 24, 2023

Adapt execution tags RFC with results from contributor's sync discussions (#3727)

a5f6886

kumare3 dismissed stale reviews from bstadlbauer and fg91 via a5f6886 May 26, 2023 13:09

This was referenced Jun 1, 2023

Flyte Execution tags flyteorg/flyteadmin#571

Merged

Add tags to execution spec flyteorg/flyteidl#414

Merged

pingsutw mentioned this pull request Jul 6, 2023

Add tags to execution flyteorg/flytekit#1723

Merged

pingsutw added 3 commits July 11, 2023 02:48

update doc

cfdba4f

Signed-off-by: Kevin Su <[email protected]>

Merge branch 'master' of github.com:flyteorg/flyte into flyte-tags

1d049e3

update doc

dd27c18

Signed-off-by: Kevin Su <[email protected]>

pingsutw requested review from flixr and goyalankit July 11, 2023 16:40

pingsutw requested review from bstadlbauer, ggydush, fg91, pingsutw and elibixby July 11, 2023 16:40

fg91 previously approved these changes Jul 11, 2023

pingsutw dismissed fg91’s stale review via dbb14fd July 12, 2023 03:52

pingsutw and others added 5 commits July 11, 2023 20:52

Update rfc/system/0001-flyte-execution-tags.md

dbb14fd

Co-authored-by: Fabio M. Graetz, Ph.D. <[email protected]> Signed-off-by: Kevin Su <[email protected]>

Update rfc/system/0001-flyte-execution-tags.md

fca779e

Co-authored-by: Fabio M. Graetz, Ph.D. <[email protected]> Signed-off-by: Kevin Su <[email protected]>

Update rfc/system/0001-flyte-execution-tags.md

ba8efc9

Co-authored-by: Fabio M. Graetz, Ph.D. <[email protected]> Signed-off-by: Kevin Su <[email protected]>

Update rfc/system/0001-flyte-execution-tags.md

b90586f

Co-authored-by: Fabio M. Graetz, Ph.D. <[email protected]> Signed-off-by: Kevin Su <[email protected]>

update doc

9eb5e22

Signed-off-by: Kevin Su <[email protected]>

pingsutw requested a review from fg91 July 12, 2023 15:24

fg91 approved these changes Jul 13, 2023

eapolinario approved these changes Jul 19, 2023

eapolinario merged commit 845d0f5 into master Jul 20, 2023

eapolinario deleted the flyte-tags branch July 20, 2023 17:06




