Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allowing dataclasses which require list protobuf deserialization #2614

Closed
wants to merge 5 commits into from

Conversation

JackUrb
Copy link
Contributor

@JackUrb JackUrb commented Jul 26, 2024

Tracking issue

closes flyteorg/flyte/issues/5593
closes flyteorg/flyte/issues/4581

Why are the changes needed?

As described in flyteorg/flyte/issues/5593, when handling the protobuf serialization/deserialization for a dataclass, attributes that are lists end up being deserialized to lists rather than dicts (despite the function being _MessageToDict).

What changes were proposed in this pull request?

The primary change is to the ProtobufTransformer's to_literal method. If the _MessageToDict method returns a list instead of a dict, rather than try to construct a protobuf Struct (which can't be done for a list), we use a separate _handle_list_literal method, which ultimately treats the processing more similarly to the ListTransformer.

An additional change is made to the assert_type method to avoid an issue where the isinstance clause was unnecessarily evaluated (due to the logic ordering) in cases where the type t could be a generic (for which the assertion error isinstance() argument 2 cannot be a parameterized generic would be thrown). Swapping the logic order means that if get_origin is not None (which occurs in parameterized generics), the isinstance will not be called.

How was this patch tested?

Adds a new test to the TypeEngine testing, test_DataclassTransformer_for_list_attributes.

We've also been running this change locally for some time, and it has been stable.

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Copy link

codecov bot commented Aug 6, 2024

Codecov Report

Attention: Patch coverage is 6.66667% with 14 lines in your changes missing coverage. Please review.

Project coverage is 45.00%. Comparing base (556dad2) to head (c411fc7).
Report is 169 commits behind head on master.

Files with missing lines Patch % Lines
flytekit/core/type_engine.py 6.66% 14 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #2614       +/-   ##
===========================================
- Coverage   94.08%   45.00%   -49.09%     
===========================================
  Files          32      223      +191     
  Lines        1842    20651    +18809     
  Branches        0     4012     +4012     
===========================================
+ Hits         1733     9293     +7560     
- Misses        109    11230    +11121     
- Partials        0      128      +128     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@pingsutw pingsutw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks. overall lgtm. Some tests are failing in python 3.9. mind taking a look

Comment on lines +722 to +725
struct.update(message_dict)
except Exception:
if isinstance(message_dict, list):
return self._handle_list_literal(ctx, message_dict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could we do this?

if isinstance(message_dict, list):
    return self._handle_list_literal(ctx, message_dict)
else:
    struct.update(message_dict)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like yes, will update

@JackUrb
Copy link
Contributor Author

JackUrb commented Aug 19, 2024

@pingsutw I'm having a hard time telling quite why the test is failing in the narrow case of py3.9 tensorflow file conversion. Is there any retry logic somewhere else in flyte that effectively checks for successful conversion, and then does something else on failure? It's hard to navigate why this conversion is this specific case breaks, and the FlyteFile _SpecificFormatClass portion is pretty hard to understand.

@JackUrb
Copy link
Contributor Author

JackUrb commented Aug 22, 2024

Note, it's possible this is entirely superseded by #2600 - @Future-Outlier happy to assist on that effort if extra hands would help.

@Future-Outlier
Copy link
Member

Note, it's possible this is entirely superseded by #2600 - @Future-Outlier happy to assist on that effort if extra hands would help.

@JackUrb
I just back from the military service!
I'm in charge of JSON IDL, do you want to collaborate with me? or help review my PR?
I can hop on a call with you.

@JackUrb
Copy link
Contributor Author

JackUrb commented Nov 7, 2024

Indeed, this entire issue is resolved by flyteorg/flyte#5742. Very excited for v1.14.0 to come out! Thanks @Future-Outlier

@JackUrb JackUrb closed this Nov 7, 2024
@JackUrb JackUrb deleted the ju-flyte-dataclass-lists branch November 7, 2024 00:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants