SageMaker custom types for ParameterRangeOneOf and HyperparameterConfig. TunableParams extraction #189

Merged: 42 commits merged into master on Oct 16, 2020

Collaborator

EngHabu commented Sep 30, 2020

TL;DR

  • Creates custom types for ParameterRangeOneOf and HyperparameterJobConfig.
  • Uses generics to serialize/deserialize protos, which makes UX visualization easier.
  • Moves tunable hyperparameters out of HyperparameterJobConfig so they are easier to bind (and consistent with the Custom Container Training Job).

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Example:

simple_xgboost_hpo_job_task = hpo_job_task.SdkSimpleHyperparameterTuningJobTask(
    training_job=builtin_algorithm_training_job_task2,
    max_number_of_training_jobs=10,
    max_parallel_training_jobs=5,
    cache_version="1",
    retries=2,
    cacheable=True,
    tunable_parameters=["num_round", "max_depth", "gamma"],
)

    a = simple_xgboost_hpo_job_task(
        train=train_dataset,
        validation=validation_dataset,
        static_hyperparameters=static_hyperparameters,
        hyperparameter_tuning_job_config=hyperparameter_tuning_job_config,
        num_round=IntegerParameterRange(min_value=2, max_value=8, scaling_type=HyperparameterScalingType.LINEAR),
        max_depth=IntegerParameterRange(min_value=5, max_value=7, scaling_type=HyperparameterScalingType.LINEAR),
        gamma=ContinuousParameterRange(min_value=0.0, max_value=0.3, scaling_type=HyperparameterScalingType.LINEAR),
    )

Tracking Issue

flyteorg/flyte#455

EngHabu changed the title from "Various SageMaker types cleanup" to "SageMaker custom types for ParameterRangeOneOf and HyperparameterConfig. TunableParams extraction" on Oct 5, 2020
EngHabu marked this pull request as ready for review on October 5, 2020 05:36
EngHabu requested a review from bnsblue on October 5, 2020 05:36
EngHabu requested a review from wild-endeavor on October 8, 2020 14:59
EngHabu requested a review from bnsblue on October 10, 2020 09:40
Collaborator Author

EngHabu commented Oct 13, 2020

Ping @wild-endeavor @bnsblue

validation=validation_dataset,
static_hyperparameters=static_hyperparameters,
hyperparameter_tuning_job_config=hyperparameter_tuning_job_config,
num_round=ParameterRangeOneOf(
Contributor

bnsblue commented Oct 14, 2020

Why is it ParameterRangeOneOf rather than ParameterRange (which is defined in flytekit.common.tasks.sagemaker.types: https://github.com/lyft/flytekit/pull/189/files#diff-0a303f3854a09b767e67d5f6eb857cbc49a6714f8df987f65646b98c5ef4d194R8)?

Contributor

What's the purpose of flytekit.common.tasks.sagemaker.types.ParameterRange?

Collaborator Author

That (in common.tasks.sagemaker.types) is just the "type" definition; this is the value.
It is just like using Types.String to define the type of an input and then passing "hello world" as the value.

Contributor

ah I see

validation=validation_dataset,
static_hyperparameters=static_hyperparameters,
hyperparameter_tuning_job_config=hyperparameter_tuning_job_config,
num_round=ParameterRangeOneOf(
Contributor

Also, is it possible to make this even simpler by getting rid of the ParameterRangeOneOf() part? For example, could it be as simple as num_round=IntegerParameterRange(...)?

Collaborator Author

I would love to. Any ideas? @wild-endeavor might have some too.

Contributor

pinging @wild-endeavor ^ please take a look

validation=validation_dataset,
static_hyperparameters=static_hyperparameters,
hyperparameter_tuning_job_config=hyperparameter_tuning_job_config,
num_round=ParameterRangeOneOf(
Contributor

Sorry, I don't think I understand the code, so these might just be stupid questions. But when should the user use ParameterRangeOneOf, and when should they use ParameterRange?

Collaborator Author

ParameterRange is the type. You use it when you declare the interface of a task, workflow, etc.
ParameterRangeOneOf is the value you can set for that type.

We could give them the same name but keep them in two different packages (the flytekit way, as far as I understand).

@wild-endeavor might be able to shed some light on how to do this properly.
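
The type/value split described in this thread can be illustrated with a minimal sketch in plain Python. It does not use flytekit: `RangeType`, `IntegerRangeValue`, and `check_bindings` are hypothetical stand-ins, with `RangeType` playing the role of the declared type (ParameterRange) and `IntegerRangeValue` the role of a value bound to it (ParameterRangeOneOf).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegerRangeValue:
    """Hypothetical stand-in for a value, in the role of ParameterRangeOneOf."""

    min_value: int
    max_value: int


class RangeType:
    """Hypothetical stand-in for a type declaration, in the role of ParameterRange."""

    @staticmethod
    def accepts(value) -> bool:
        # The type decides which values may be bound to an input.
        return isinstance(value, IntegerRangeValue)


def check_bindings(interface, bindings) -> bool:
    """Return True when every bound value is accepted by its declared type."""
    return all(interface[name].accepts(value) for name, value in bindings.items())


# The interface is declared with types ...
interface = {"num_round": RangeType}

# ... and a call binds values to the declared inputs.
bindings = {"num_round": IntegerRangeValue(min_value=2, max_value=8)}
```

The declaration and the bound value are separate objects, which is why two names (or one name in two packages) are needed.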

Contributor

bnsblue commented Oct 14, 2020

I left some questions and comments. I don't think I fully understand the code, so those questions might be stupid.

Contributor

bnsblue commented Oct 15, 2020

I don't have further comments, except that I would really like the user to be able to write
num_round=IntegerParameterRange(...) instead of num_round=ParameterRangeOneOf(IntegerParameterRange(...))
(see #189 (comment)).
Unfortunately I am not able to come up with a way to solve it, so I am going to approve the PR. Feel free to merge it if you don't think this is possible.

bnsblue previously approved these changes Oct 15, 2020

Collaborator Author

EngHabu commented Oct 15, 2020

Let me try something

Contributor

bnsblue commented Oct 15, 2020

Sure. I was hoping that creating a ParameterRangeBase and having ParameterRangeOneOf, IntegerParameterRange, etc. all inherit from it would solve the problem, but I couldn't get it to work.
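
The base-class idea can be sketched in isolation as follows. This is only an illustration under stated assumptions, not the approach the PR ended up using: every class here is a simplified stand-in that reuses the names from the discussion, `ParameterRangeBase` and `to_one_of` are hypothetical, and nothing in it touches flytekit's type engine, which is where the attempt described above ran into trouble.

```python
from dataclasses import dataclass
from typing import Union


class ParameterRangeBase:
    """Hypothetical common base class; not part of flytekit."""


@dataclass(frozen=True)
class IntegerParameterRange(ParameterRangeBase):
    """Simplified stand-in for the real integer range model."""

    min_value: int
    max_value: int


@dataclass(frozen=True)
class ContinuousParameterRange(ParameterRangeBase):
    """Simplified stand-in for the real continuous range model."""

    min_value: float
    max_value: float


@dataclass(frozen=True)
class ParameterRangeOneOf(ParameterRangeBase):
    """Simplified stand-in for the wrapper that holds exactly one range."""

    param: Union[IntegerParameterRange, ContinuousParameterRange]


def to_one_of(value: ParameterRangeBase) -> ParameterRangeOneOf:
    """Wrap a concrete range in ParameterRangeOneOf unless it already is one."""
    if isinstance(value, ParameterRangeOneOf):
        return value
    if isinstance(value, ParameterRangeBase):
        return ParameterRangeOneOf(param=value)
    raise TypeError(f"expected a parameter range, got {type(value).__name__}")
```

With a coercion step like `to_one_of` applied where inputs are bound, a caller could pass `IntegerParameterRange(...)` directly and never spell out the wrapper.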

Contributor

bnsblue commented Oct 15, 2020

@EngHabu Are you done with what you wanted to try? Is that part ready for review?


- ParameterRange = _sdk_types.Types.GenericProto(_pb2_parameter_ranges.ParameterRangeOneOf)
+ ParameterRange = _sdk_types.Types.GenericProto(_parameter_range_models.ParameterRangeOneOf)
Contributor

If we are wrapping a model instead, should we still call it Generic"Proto"?

Collaborator Author

Models should be thought of as "glorified" protos (or Pythonic protos), so the same assumptions about them should apply.
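
As a rough illustration of a model acting as a Pythonic wrapper around a proto, here is a self-contained sketch. `RangeMessage` is a hypothetical stand-in for a generated protobuf message, and `IntegerRangeModel`, `to_message`, and `from_message` are names invented for this example; flytekit's actual model classes and method names may differ.

```python
from dataclasses import dataclass


@dataclass
class RangeMessage:
    """Hypothetical stand-in for a generated protobuf message."""

    min_value: int = 0
    max_value: int = 0


class IntegerRangeModel:
    """Hypothetical model: a Python-friendly wrapper around RangeMessage."""

    def __init__(self, min_value: int, max_value: int):
        # A model can add validation that the raw message does not enforce.
        if min_value > max_value:
            raise ValueError("min_value must not exceed max_value")
        self.min_value = min_value
        self.max_value = max_value

    def to_message(self) -> RangeMessage:
        """Convert the model to its underlying message form."""
        return RangeMessage(min_value=self.min_value, max_value=self.max_value)

    @classmethod
    def from_message(cls, message: RangeMessage) -> "IntegerRangeModel":
        """Rebuild the model from a message."""
        return cls(min_value=message.min_value, max_value=message.max_value)
```

Because the model converts losslessly to and from the message, code that handles messages generically can treat the model the same way.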

Contributor

makes sense

Contributor

bnsblue commented Oct 15, 2020

just a nit. LGTM

@EngHabu EngHabu merged commit 17ca24f into master Oct 16, 2020
max-hoffman pushed a commit to dolthub/flytekit that referenced this pull request May 11, 2021
…ig. TunableParams extraction (flyteorg#189)

* Typing

* ParameterRangeOneOf model

* cleanup

* lint

* lint

* unittests

* unit

* unit

* isort

* isort

* py 3.5

* lint

* lint

* re-add generic types to protos

* lint

* Remove generic

* remove T

* reformat

* fix import

* PR Comments

* lint

* lint

* PR Comments

* PR Comments

* lint

* unittest

* PR Comments

* lint

* remove deprecated fields

* Support converting raw protos through Types.*Proto classes

* lint

* revert notebook.py

* lint