[BUG] [flytekit] Serialization fails for unknown types that should fall back to pickle #2823

Closed
2 tasks done
rahul-theorem opened this issue Aug 30, 2022 · 0 comments
Labels
bug Something isn't working flytekit FlyteKit Python related issue

@rahul-theorem
Contributor

Describe the bug

We encountered a couple of different failure modes when serializing workflows with the following types:

  1. Unknown type in a Union:

In general, serializing Union[T, R] fails when T is a known type (e.g. np.ndarray or pd.DataFrame) and R is an unknown type. We encountered failures with the following types:

Union[np.ndarray, pd.DataFrame, scipy.sparse.spmatrix]
Union[np.ndarray, pd.DataFrame, lgbm.Dataset]
Union[np.ndarray, pd.DataFrame, pd.Series]

Example error:

  File "/data/.cache/d6b3232365b53688514105c765196062/execroot/alpha/bazel-out/k8-fastbuild/bin/thm/flyte/workflows/model_training/platform_rate_pipeline.register.runfiles/prod_flytekit/flytekit/core/type_engine.py", line 989, in get_literal_type
ValueError: Type of Generic Union type is not supported, Type <class 'scipy.sparse._base.spmatrix'> not supported currently in Flytekit. Please register a new transformer

Interestingly, this fails even when a transformer is registered.
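To make the expected fallback concrete, here is a minimal sketch of resolving each variant of a Union and defaulting to pickle when no transformer is known. It uses only the standard library and is not flytekit's implementation: `KNOWN_TYPES`, `transport_for`, and `union_transports` are hypothetical names, and `Fraction` stands in for an unregistered type such as scipy.sparse.spmatrix.

```python
import pickle
from fractions import Fraction
from typing import Any, Union, get_args, get_origin

# Hypothetical registry of types that have a dedicated transformer.
# This is an illustration only, not flytekit's TypeEngine registry.
KNOWN_TYPES = {int: "integer", str: "string"}


def transport_for(variant: type) -> str:
    """Return the transport name for one type, defaulting to pickle."""
    return KNOWN_TYPES.get(variant, "pickle")


def union_transports(annotation: Any) -> dict:
    """Map each variant of a Union annotation to its transport."""
    if get_origin(annotation) is not Union:
        raise TypeError(f"{annotation!r} is not a Union")
    return {variant: transport_for(variant) for variant in get_args(annotation)}


# Fraction has no entry in KNOWN_TYPES, so it plays the "unknown" variant.
transports = union_transports(Union[int, str, Fraction])

# The pickle fallback round-trips the unknown value as opaque bytes.
restored = pickle.loads(pickle.dumps(Fraction(3, 4)))
```

The point of the sketch is that an unknown variant does not have to invalidate the whole Union: each variant can be resolved independently, and only the unknown one needs the pickle path.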

  2. Types from collections.abc:

Any usage of types from collections.abc appears to lead to a type error in the code path to handle generics. Tested with the following types:

collections.abc.Sequence
collections.abc.Iterable
collections.abc.Callable

Example error:

ValueError: Generic Type <class 'collections.abc.Sequence'> not supported currently in Flytekit.
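For context, the sketch below shows what the standard library reports for a parameterized collections.abc annotation (Python 3.9+). The origin is the abstract class itself, which is the class named in the error above. How flytekit dispatches on that origin internally is an assumption on my part; `describe` is a hypothetical helper.

```python
import collections.abc
from typing import get_args, get_origin


def describe(annotation):
    """Return (origin, args) for a parameterized generic annotation."""
    return get_origin(annotation), get_args(annotation)


# Subscripting collections.abc classes requires Python 3.9 or newer.
seq_origin, seq_args = describe(collections.abc.Sequence[int])
iter_origin, iter_args = describe(collections.abc.Iterable[str])

# A concrete list satisfies the abstract Sequence origin, so an engine that
# already handles list[int] could in principle reuse that handling here.
list_is_sequence = issubclass(list, seq_origin)
```

Because the origin is an abstract base class rather than a concrete container such as list or dict, a lookup keyed on concrete container types would miss it, which may be why the generic code path rejects these annotations.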

Expected behavior

  1. Flyte can serialize types from collections.abc using pickle transport
  2. Unknown types in a Union can fall back to pickle transport

Additional context to reproduce

The following tasks reproduce the issue:

from typing import Union
from collections.abc import Sequence

import flytekit
from scipy import sparse
import numpy as np
import pandas as pd

# Fails: sparse.spmatrix is an unknown type inside the Union.
@flytekit.task()
def test_task(data: Union[np.ndarray, pd.DataFrame, sparse.spmatrix]) -> None:
    print(data)


# Fails: collections.abc.Sequence is rejected as an unsupported generic type.
@flytekit.task()
def test_task_2(input_list: Sequence[int]) -> Sequence[int]:
    return input_list

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@rahul-theorem rahul-theorem added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Aug 30, 2022
@pingsutw pingsutw self-assigned this Aug 30, 2022
@pingsutw pingsutw added this to the 1.2.0 milestone Aug 30, 2022
@pingsutw pingsutw added flytekit FlyteKit Python related issue and removed untriaged This issues has not yet been looked at by the Maintainers labels Aug 30, 2022
3 participants