TypeTransformers for PyTorch Tensor, Module, and Checkpoint #1032
Merged
Changes from 15 commits
19 commits
4974353 TypeTransformers for PyTorch Tensor and Module (samhita-alla)
fa5e3ae add torch to requirements (samhita-alla)
6d8f977 add module as a native type and PyTorchCheckpoint (samhita-alla)
3560adf resolve merge conflict (samhita-alla)
bf61ef7 update requirements (samhita-alla)
f4b8704 resolve merge conflict (samhita-alla)
7b9218f procedural to OOP approach (samhita-alla)
6332364 nit (samhita-alla)
9f098ba verify device conversion (samhita-alla)
b8111a6 verify device conversion (samhita-alla)
13fd0f7 hyperparameters can be None (samhita-alla)
7b299f4 device conversion (samhita-alla)
4ef601c device conversion (samhita-alla)
c8d5478 checkpoint code cleanup (samhita-alla)
b644a2e resolve merge conflict (samhita-alla)
88a6ce1 move pytorch from types to extra; resolve merge conflict (samhita-alla)
6915913 fix pytorch api reference; resolve merge conflict (samhita-alla)
6ce6e19 fix pytorch import (samhita-alla)
a48472a fix merge conflict (samhita-alla)
@@ -11,3 +11,4 @@ codespell
 google-cloud-bigquery
 google-cloud-bigquery-storage
 IPython
+torch
@@ -33,3 +33,4 @@ papermill # papermill
 jupyter # papermill
 pyspark # spark
 sqlalchemy # sqlalchemy
+torch # pytorch
@@ -0,0 +1,4 @@
.. automodule:: flytekit.types.pytorch
    :no-members:
    :no-inherited-members:
    :no-special-members:
@@ -1,13 +1 @@
-"""
-Flytekit Numpy
-==============
-.. currentmodule:: flytekit.types.numpy
-
-.. autosummary::
-   :template: custom.rst
-   :toctree: generated/
-
-   NumpyArrayTransformer
-"""
-
 from .ndarray import NumpyArrayTransformer
@@ -0,0 +1,20 @@
"""
Flytekit PyTorch
=========================================
.. currentmodule:: flytekit.types.pytorch

.. autosummary::
   :template: custom.rst
   :toctree: generated/

   PyTorchCheckpoint
"""
from flytekit.loggers import logger

try:
    from .checkpoint import PyTorchCheckpoint, PyTorchCheckpointTransformer
    from .native import PyTorchModuleTransformer, PyTorchTensorTransformer
except ImportError:
    logger.info(
        "We won't register PyTorchCheckpointTransformer, PyTorchTensorTransformer, and PyTorchModuleTransformer because torch is not installed."
    )
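The guarded import in the hunk above is an optional-dependency pattern: the transformers are only imported (and so only registered) when torch is available, and a missing torch is logged rather than raised. A minimal standalone sketch of the same idea follows. The helper `try_register` and the module names passed to it are hypothetical and not part of flytekit, which uses a plain try/except around relative imports.

```python
import importlib
import logging

logger = logging.getLogger("optional_imports_sketch")


def try_register(module_name: str) -> bool:
    """Import `module_name` if it is installed; log and carry on otherwise.

    Hypothetical helper for illustration only. Importing the module is what
    would trigger any registration code at its top level.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        logger.info("Skipping %s because it is not installed.", module_name)
        return False
    return True


if __name__ == "__main__":
    # A stdlib module is always importable; the second name is assumed absent.
    print(try_register("json"))
    print(try_register("surely_not_an_installed_module_xyz"))
```

Logging at info level keeps the package importable for users who never touch PyTorch types, at the cost of a quieter failure when torch was expected to be present.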
@@ -0,0 +1,141 @@
import pathlib
import typing
from dataclasses import asdict, dataclass, fields, is_dataclass
from typing import Any, Callable, Dict, NamedTuple, Optional, Type, Union

import torch
from dataclasses_json import dataclass_json

from flytekit.core.context_manager import FlyteContext
from flytekit.core.type_engine import TypeEngine, TypeTransformer, TypeTransformerFailedError
from flytekit.models.core import types as _core_types
from flytekit.models.literals import Blob, BlobMetadata, Literal, Scalar
from flytekit.models.types import LiteralType

try:
    from typing import Protocol
except ImportError:
    from typing_extensions import Protocol


class IsDataclass(Protocol):
    __dataclass_fields__: Dict
    __dataclass_params__: Dict
    __post_init__: Optional[Callable]


@dataclass_json
@dataclass
class PyTorchCheckpoint:
    """
    This class is helpful to save a checkpoint.
    """

    module: Optional[torch.nn.Module] = None
    hyperparameters: Optional[Union[Dict[str, Any], NamedTuple, IsDataclass]] = None
    optimizer: Optional[torch.optim.Optimizer] = None

    def __post_init__(self):
        if not (
            isinstance(self.hyperparameters, dict)
            or (is_dataclass(self.hyperparameters) and not isinstance(self.hyperparameters, type))
            or (isinstance(self.hyperparameters, tuple) and hasattr(self.hyperparameters, "_fields"))
            or (self.hyperparameters is None)
        ):
            raise TypeTransformerFailedError(
                f"hyperparameters must be a dict, dataclass, or NamedTuple. Got {type(self.hyperparameters)}"
            )

        if not (self.module or self.hyperparameters or self.optimizer):
            raise TypeTransformerFailedError("Must have at least one of module, hyperparameters, or optimizer")


class PyTorchCheckpointTransformer(TypeTransformer[PyTorchCheckpoint]):
    """
    TypeTransformer that supports serializing and deserializing checkpoint.
    """

    PYTORCH_CHECKPOINT_FORMAT = "PyTorchCheckpoint"

    def __init__(self):
        super().__init__(name="PyTorch Checkpoint", t=PyTorchCheckpoint)

    def get_literal_type(self, t: Type[PyTorchCheckpoint]) -> LiteralType:
        return LiteralType(
            blob=_core_types.BlobType(
                format=self.PYTORCH_CHECKPOINT_FORMAT, dimensionality=_core_types.BlobType.BlobDimensionality.SINGLE
            )
        )

    def to_literal(
        self,
        ctx: FlyteContext,
        python_val: PyTorchCheckpoint,
        python_type: Type[PyTorchCheckpoint],
        expected: LiteralType,
    ) -> Literal:
        meta = BlobMetadata(
            type=_core_types.BlobType(
                format=self.PYTORCH_CHECKPOINT_FORMAT, dimensionality=_core_types.BlobType.BlobDimensionality.SINGLE
            )
        )

        local_path = ctx.file_access.get_random_local_path() + ".pt"
        pathlib.Path(local_path).parent.mkdir(parents=True, exist_ok=True)

        to_save = {}
        for field in fields(python_val):
            value = getattr(python_val, field.name)

            if value and field.name in ["module", "optimizer"]:
                to_save[field.name + "_state_dict"] = getattr(value, "state_dict")()
            elif value and field.name == "hyperparameters":
                if isinstance(value, dict):
                    to_save.update(value)
                elif isinstance(value, tuple):
                    to_save.update(value._asdict())
                elif is_dataclass(value):
                    to_save.update(asdict(value))

        if not to_save:
            raise TypeTransformerFailedError(f"Cannot save empty {python_val}")

        # save checkpoint to a file
        torch.save(to_save, local_path)

        remote_path = ctx.file_access.get_random_remote_path(local_path)
        ctx.file_access.put_data(local_path, remote_path, is_multipart=False)
        return Literal(scalar=Scalar(blob=Blob(metadata=meta, uri=remote_path)))

    def to_python_value(
        self, ctx: FlyteContext, lv: Literal, expected_python_type: Type[PyTorchCheckpoint]
    ) -> PyTorchCheckpoint:
        try:
            uri = lv.scalar.blob.uri
        except AttributeError:
            raise TypeTransformerFailedError(f"Cannot convert from {lv} to {expected_python_type}")

        local_path = ctx.file_access.get_random_local_path()
        ctx.file_access.get_data(uri, local_path, is_multipart=False)

        # cpu <-> gpu conversion
        if torch.cuda.is_available():
            map_location = "cuda:0"
        else:
            map_location = torch.device("cpu")

        # load checkpoint from a file
        return typing.cast(PyTorchCheckpoint, torch.load(local_path, map_location=map_location))

    def guess_python_type(self, literal_type: LiteralType) -> Type[PyTorchCheckpoint]:
        if (
            literal_type.blob is not None
            and literal_type.blob.dimensionality == _core_types.BlobType.BlobDimensionality.SINGLE
            and literal_type.blob.format == self.PYTORCH_CHECKPOINT_FORMAT
        ):
            return PyTorchCheckpoint

        raise ValueError(f"Transformer {self} cannot reverse {literal_type}")


TypeEngine.register(PyTorchCheckpointTransformer())
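To make the `hyperparameters` handling concrete: `__post_init__` accepts a dict, a dataclass instance, or a NamedTuple, and `to_literal` merges whichever one it receives into the top level of the dict passed to `torch.save`. The sketch below reproduces only that check-and-flatten logic using the standard library. `flatten_hyperparameters`, `Hyper`, and `HyperDC` are hypothetical names introduced for illustration, and no torch or flytekit call is involved.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any, Dict, NamedTuple


def flatten_hyperparameters(value: Any) -> Dict[str, Any]:
    """Return hyperparameters as a plain dict, mirroring the branches in to_literal."""
    if value is None:
        return {}
    if isinstance(value, dict):
        return dict(value)
    # NamedTuple instances are tuples that carry a `_fields` attribute.
    if isinstance(value, tuple) and hasattr(value, "_fields"):
        return dict(value._asdict())
    # is_dataclass is also True for dataclass types, so classes are excluded.
    if is_dataclass(value) and not isinstance(value, type):
        return asdict(value)
    raise TypeError(f"hyperparameters must be a dict, dataclass, or NamedTuple. Got {type(value)}")


class Hyper(NamedTuple):
    epochs: int
    lr: float


@dataclass
class HyperDC:
    epochs: int
    lr: float


if __name__ == "__main__":
    print(flatten_hyperparameters({"epochs": 5, "lr": 0.01}))
    print(flatten_hyperparameters(Hyper(epochs=5, lr=0.01)))
    print(flatten_hyperparameters(HyperDC(epochs=5, lr=0.01)))
```

Because the fields are merged at the top level, a hyperparameter named `module_state_dict` or `optimizer_state_dict` would collide with the keys that `to_literal` writes for the module and the optimizer.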
is removing this from docs intentional?
Yeah. I don't think we'd want the Transformer in the API reference, because the methods within the TypeTransformer class remain the same.