Resolver pattern #404
Signed-off-by: wild-endeavor <[email protected]>
@@ -158,11 +156,6 @@ def serialize_all(
)
old_style_entities.append(o)

# PythonInstanceTasks will not be picked up by the above, so we need to reiterate
for o, v in load_module_object_for_type(pkgs, PythonInstanceTask, additional_path=local_source_root).items():
just to double check, we don't need this for #391?
nope, getting rid of it, using the instance tracking metaclass now.
Co-authored-by: Katrina Rogan <[email protected]>
flyteorg/flyte#814 end to end tests.
@@ -5,6 +5,7 @@
import pathlib
import random as _random
import traceback as _traceback
from typing import List
So hypothetically, if we were to use the TaskResolver to execute fast-registration, we would probably need some way of cascading the resolver down the dynamic chain, right?
I understand that by definition, at the moment, a TaskResolver is bound to a Task. But in the case of a dynamic task, would it automatically cascade downstream?
I need to think about this more. Can I do this as a separate PR? I'll write up an issue... also because fast-register doesn't work right now with dynamic tasks.
"--inputs",
"{{.input}}",
"--output-prefix",
"{{.outputPrefix}}",
"--raw-output-data-prefix",
"{{.rawOutputDataPrefix}}",
"--resolver",
this implies that the resolver has to be the last arg only?
why not use multi-valued args?
--resolver-arg "x=y" --resolver-arg "..."
not sure what you mean. the resolver can be specified anywhere, but the resolver args have to be at the end.
I think @kumare3 means an alternative syntax is
--resolver ... --resolver-arg "task-module=<TASK_MODULE>" --resolver-arg "task-name=<TASK_NAME>"
where --resolver-arg is a click option instead of an argument.
Will the resolver always have 2 args, task-module and task-name?
just throwing out another idea:
--resolver ... --resolver-task-module ... --resolver-task-name ...
"--inputs",
"{{.input}}",
"--output-prefix",
"{{.outputPrefix}}",
"--raw-output-data-prefix",
"{{.rawOutputDataPrefix}}",
"--resolver",
self.task_resolver.location,
"--",
i think i really prefer --resolver-arg type of usage, why do you want this raw command line?
Spent half an hour getting it to work with click and gave up?
This looks great to me, couple nits

task_module = importlib.import_module(task_module)
task_def = getattr(task_module, task_name)
return task_def
The task_def and task_module vars are only used once; consider return getattr(importlib.import_module(task_module), task_name)
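The module-and-name lookup discussed in this nit can be tried in isolation. The following is a minimal sketch, not flytekit's actual resolver: the function name `load_task` and the two-element `loader_args` list are assumptions made for illustration, and a standard-library module stands in for a real task module.

```python
import importlib
from typing import Any, List


def load_task(loader_args: List[str]) -> Any:
    """Hypothetical helper: resolve [module_name, attribute_name] to an object."""
    task_module, task_name = loader_args
    # Import the module by its dotted path, then fetch the attribute by name.
    return getattr(importlib.import_module(task_module), task_name)


# Stand-in example: "math" plays the role of a task module, "sqrt" of a task.
sqrt = load_task(["math", "sqrt"])
print(sqrt(16.0))  # prints 4.0
```

A missing module raises ModuleNotFoundError and a missing attribute raises AttributeError, so a caller gets a reasonably specific failure in either case.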
"""
return func.__code__.co_flags & inspect.CO_NESTED != 0

container_args = [
can do
return [
"pyflyte-execute",
...
"--",
*self.task_resolver.loader_args(settings, self),
]
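The suggestion above relies on star-unpacking inside a list literal, which lets the command be returned directly without an intermediate variable. A self-contained sketch follows; `DemoResolver`, its `location` value, the module path `my_pkg.tasks`, and the simplified `loader_args(task_name)` signature are all hypothetical stand-ins and do not reflect flytekit's real classes.

```python
from typing import List


class DemoResolver:
    """Hypothetical stand-in for a task resolver (not flytekit's implementation)."""

    # Assumed import path at which the resolver instance could be found.
    location = "my_pkg.resolvers.demo_resolver"

    def loader_args(self, task_name: str) -> List[str]:
        # Assumed shape: alternating keys and values identifying the task.
        return ["task-module", "my_pkg.tasks", "task-name", task_name]


def get_command(resolver: DemoResolver, task_name: str) -> List[str]:
    # Build and return the list in one expression; the resolver's own
    # arguments are spliced in after the "--" separator.
    return [
        "pyflyte-execute",
        "--inputs",
        "{{.input}}",
        "--output-prefix",
        "{{.outputPrefix}}",
        "--raw-output-data-prefix",
        "{{.rawOutputDataPrefix}}",
        "--resolver",
        resolver.location,
        "--",
        *resolver.loader_args(task_name),
    ]


command = get_command(DemoResolver(), "my_task")
print(command[-4:])
```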
@@ -244,20 +144,23 @@ def execute(self, **kwargs) -> Any:
return self.dynamic_execute(self._task_function, **kwargs)

def get_command(self, settings: SerializationSettings) -> List[str]:
return [
container_args = [
)


class TaskResolverMixin(object):
python 3+: object is the default parent class
Suggested change:
- class TaskResolverMixin(object):
+ class TaskResolverMixin:
i like keeping it though, makes it explicit and it's everywhere else.
Co-authored-by: Niels Bantilan <[email protected]>
Just to answer the above: the resolver arg is not an option, it's a click argument. The number of arguments is unknown, except that we require it to be >= 1. It could be just one, or someone may write a new resolver a year from now that takes a hundred args.
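The convention described in this reply (named options first, then a `--` separator, then one or more free-form resolver arguments) can be illustrated without click. The sketch below uses plain list handling only; it is not the parsing code in entrypoint.py, and the function name `split_resolver_args` and the sample values are hypothetical.

```python
from typing import List, Tuple


def split_resolver_args(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Split argv at the first "--" into (entrypoint args, resolver args)."""
    if "--" not in argv:
        raise ValueError("expected '--' followed by at least one resolver argument")
    idx = argv.index("--")
    head, tail = argv[:idx], argv[idx + 1:]
    if not tail:
        # Mirrors the ">= 1" requirement: an empty tail is rejected.
        raise ValueError("at least one resolver argument is required after '--'")
    return head, tail


# Hypothetical invocation; the resolver path and task names are made up.
argv = [
    "--inputs", "s3://bucket/inputs.pb",
    "--resolver", "my_pkg.resolvers.demo_resolver",
    "--",
    "task-module", "my_pkg.tasks", "task-name", "my_task",
]
entrypoint_args, resolver_args = split_resolver_args(argv)
print(resolver_args)
```

Because the tail is passed through untouched, a resolver is free to define its own argument layout, whether that is one value or many.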
Would it be useful to add a warning about
TL;DR
This changes the execution side of flytekit. Previously, tasks were loaded using only a module and a key, looked up through an importlib call. This introduces a "task resolver" construct, which is responsible for two things:

Type

Are all requirements met?

Complete description
TaskResolverMixin.
The entrypoint.py click command has been updated to be able to handle that command.
As part of this change, it was found that it was useful to be able to keep track of the variable that a task resolver was assigned to. Because of this, we've added back (duplicated, rather) the instance-tracking metaclass mechanism from the old API (the stuff in registerable.py). See the new tests under the tracking/ folder for a non-Flyte-based example of how it works.
The entrypoint.py execute function has been separated out.
PythonAutoContainerTask has been moved into a separate file.
Because PythonAutoContainerTasks are now a TrackedInstance, we no longer need the InstanceVar construct.

Tracking Issue
https://github.com/lyft/flyte/issues/

Follow-up issue
NA
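
The instance-tracking idea mentioned in the description (remembering where an object was created so that the variable it was assigned to can be found later) can be sketched in a few lines. This is an illustrative assumption about how such a mechanism might work, not the code in registerable.py or the tracking/ tests; the names `InstanceTrackingMeta`, `Tracked`, `instantiated_in`, and `find_lhs` are hypothetical here.

```python
import inspect
from typing import Optional


class InstanceTrackingMeta(type):
    """Hypothetical metaclass that records where each instance is created."""

    def __call__(cls, *args, **kwargs):
        obj = super().__call__(*args, **kwargs)
        # The current frame is this __call__; f_back is the code that
        # instantiated the class.
        caller = inspect.currentframe().f_back
        obj.instantiated_in = caller.f_globals.get("__name__")
        # Keep a reference to the caller's module globals for later lookup.
        obj._creation_globals = caller.f_globals
        return obj


class Tracked(metaclass=InstanceTrackingMeta):
    def find_lhs(self) -> Optional[str]:
        """Return the module-level name bound to this instance, if any."""
        for name, value in list(self._creation_globals.items()):
            if value is self:
                return name
        return None


resolver = Tracked()
print(resolver.instantiated_in, resolver.find_lhs())
```

The lookup only succeeds for module-level assignments, and only after the assignment has completed, which is why it is done lazily in `find_lhs` rather than at construction time.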