Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON IDL #5542

Closed
wants to merge 42 commits into from
Closed

Conversation

Future-Outlier
Copy link
Member

@Future-Outlier Future-Outlier commented Jul 5, 2024

Tracking issue

#5318

Why are the changes needed?

  1. solve more than 10+ issues due to the protobuf struct doesn't support int.
  2. using msgpack bytes, which are smaller than JSON bytes, we can reduce storage size and improve I/O performance when interacting with database.

What changes were proposed in this pull request?

flyteidl

  • add JSON IDL type

flytepropeller

  • attribute access for remote execution
  • backward compatibility (compiler)
    upstreamType (Struct) -> downstreamType (Json)

flytectl

  • flytectl create execution with inputs
  • flytectl get execution

How was this patch tested?

unit tests and remote execution.
Example are from here

Setup process

Screenshots

[flytepropeller] attribute access

  • dataclass example
image

[flytepropeller] backward compatible

  • reference task
    upstream (struct) -> downstream (json)
image
  • cache
    upstream (struct) -> downstream (json)
image

[flytepropeller] others

dict example
image

[flytectl] create execution with inputs

(dev) future@outlier ~ % flytectl create execution --execFile build/PR/JSON/demo/flytectl/create_json_dataclass.yaml -p flytesnacks -d development

execution identifier project:"flytesnacks" domain:"development" name:"f6dee626da95445a8894"
image
iamRoleARN: ""
inputs:
    input:
        a: 1
        b: 3.14
        c: example_string
        d:
            "1": 100
            "2": 200
        e:
            a: 1
            b: 3.14
envs: {}
kubeServiceAcct: ""
targetDomain: ""
targetProject: ""
task: dataclass_example.dataclass_task
version: x1QUSQo64Vd9FuIo7xBkvg

[flytectl] get task

(dev) future@outlier ~ % flytectl get task -d development -p flytesnacks dataclass_example.dataclass_task --execFile build/PR/JSON/demo/flytectl/get_json_dataclass.yaml --version x1QUSQo64Vd9FuIo7xBkvg
 ------------------------ ---------------------------------- ------------- -------- --------- -------------- ------------------- ----------------------------- 
| VERSION                | NAME                             | TYPE        | INPUTS | OUTPUTS | DISCOVERABLE | DISCOVERY VERSION | CREATED AT                  |
 ------------------------ ---------------------------------- ------------- -------- --------- -------------- ------------------- ----------------------------- 
| x1QUSQo64Vd9FuIo7xBkvg | dataclass_example.dataclass_task | python-task | input  | o0      |              |                   | 2024-08-26T12:27:05.188306Z |
 ------------------------ ---------------------------------- ------------- -------- --------- -------------- ------------------- ----------------------------- 
1 rows
iamRoleARN: ""
inputs:
    input: []
envs: {}
kubeServiceAcct: ""
targetDomain: ""
targetProject: ""
task: dataclass_example.dataclass_task
version: x1QUSQo64Vd9FuIo7xBkvg

Note: For struct and JSON, the input will be an empty {} or []. This is an existing bug, and I am not planning to fix it in this PR. I genuinely believe this is a relatively low-priority issue."

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Copy link

codecov bot commented Jul 5, 2024

Codecov Report

Attention: Patch coverage is 28.76712% with 104 lines in your changes missing coverage. Please review.

Project coverage is 34.61%. Comparing base (0da324b) to head (27dbc63).
Report is 177 commits behind head on master.

Files with missing lines Patch % Lines
flyteidl/gen/pb-go/flyteidl/core/literals.pb.go 12.72% 96 Missing ⚠️
flyteidl/clients/go/coreutils/literals.go 76.47% 7 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5542      +/-   ##
==========================================
- Coverage   36.18%   34.61%   -1.57%     
==========================================
  Files        1302      990     -312     
  Lines      109666    85001   -24665     
==========================================
- Hits        39678    29424   -10254     
+ Misses      65841    52667   -13174     
+ Partials     4147     2910    -1237     
Flag Coverage Δ
unittests-datacatalog 51.37% <ø> (ø)
unittests-flyteadmin 55.63% <ø> (+0.30%) ⬆️
unittests-flytecopilot 12.17% <ø> (ø)
unittests-flytectl 62.17% <ø> (-0.02%) ⬇️
unittests-flyteidl 7.21% <28.76%> (+0.08%) ⬆️
unittests-flyteplugins 53.35% <ø> (+<0.01%) ⬆️
unittests-flytepropeller ?
unittests-flytestdlib 55.35% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

wild-endeavor and others added 6 commits August 6, 2024 09:44
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Yee Hing Tong <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
…and support json idl attribute access

Signed-off-by: Future-Outlier <[email protected]>
@Future-Outlier Future-Outlier force-pushed the idl/json-type-bytes-and-attribute-access branch from 108350f to a503b8b Compare August 6, 2024 01:48
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
@Future-Outlier Future-Outlier changed the title [WIP][flyteIDL][flytepropeller] Support Json IDL and Json attribute access in workflow [WIP][flyteIDL][flytepropeller] Json IDL Aug 26, 2024
Comment on lines +48 to +50
if upstreamTypeCopy.GetSimple() == flyte.SimpleType_STRUCT && downstreamTypeCopy.GetSimple() == flyte.SimpleType_JSON {
return true
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for case like:

@reference_task(
    project="flytesnacks",
    domain="development",
    name="create_reference_task.creat_reference_task_dict",
    version="nNDnS6m4_fo-GXPxVUm2TA",
)
def t1(
) -> dict: # return STRUCT
    ...

@task(container_image=image)
def t2(input: dict) -> dict: # input STRUCT, return JSON
    print(input)
    for key, value in input.items():
        print(f'Key: {key}, Value: {value}')
        print(f'Key: {type(key)}, Value: {type(value)}')
    return input

@workflow
def wf() -> dict:
    d1 = t1()
    d2 = t2(input=d1)
    return d2

Comment on lines 589 to 597
if newT.Simple == core.SimpleType_JSON {
if _, isValueStringType := v.(string); !isValueStringType {
jsonBytes, err := json.Marshal(v)
if err != nil {
return nil, fmt.Errorf("unable to marshal to json string for json value %v", v)
}
jsonBytes, err = msgpack.Marshal(jsonBytes)
if err != nil {
return nil, fmt.Errorf("unable to marshal to msgpack bytes for json value %v", v)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for creating execution by flytectl.

flytectl create execution --execFile build/PR/JSON/demo/flytectl/create_json_dataclass.yaml -p flytesnacks -d development

@Future-Outlier Future-Outlier mentioned this pull request Aug 26, 2024
6 tasks
Signed-off-by: Future-Outlier <[email protected]>
@Future-Outlier Future-Outlier changed the title [WIP][flyteIDL][flytepropeller] Json IDL Json IDL Aug 27, 2024
@Future-Outlier Future-Outlier changed the title Json IDL JSON IDL Aug 27, 2024
Signed-off-by: Future-Outlier <[email protected]>
@Future-Outlier Future-Outlier marked this pull request as ready for review August 28, 2024 07:59
@Future-Outlier Future-Outlier self-assigned this Aug 28, 2024
Comment on lines +381 to +382
func TestResolveAttrPathInJson(t *testing.T) {
// Helper function to convert a map to JSON and then to msgpack
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Test the map type first, then test the list type.
  2. Within each category, order the tests by data type: integers (int), floats (float), strings (string), and nested maps or lists.
  3. Finally, arrange the exception case tests.

Signed-off-by: Future-Outlier <[email protected]>
flyteidl/protos/flyteidl/core/literals.proto Show resolved Hide resolved
}
currVal = convertNumbers(tmpVal)

// Turn the current value to a map so it can be resolved more easily
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, where did you turn the current value to a map?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my fault, nice catch!!

return literal, nil
}

// convertJSONToLiteral converts the protobuf struct (e.g. dataclass) to literal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// convertJSONToLiteral converts the protobuf struct (e.g. dataclass) to literal
// convertJSONToLiteral converts the JSON (e.g. dataclass) to literal

@Future-Outlier
Copy link
Member Author

@eapolinario Can you review?
Thank you

Future-Outlier and others added 3 commits August 30, 2024 23:28
Signed-off-by: Future-Outlier <[email protected]>
Co-authored-by: Kevin Su <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Copy link
Contributor

@eapolinario eapolinario left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you comment on the use of github.com/vmihailenco/msgpack to handle msgpack-encoded values? I see that this is not listed in the official msgpack website.

Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>

// The format used to serialize the byte array.
// This field identifies the specific format of the serialized JSON data,
// allowing future flexibility in supporting different JSON variants.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment here saying that serialization formats need to be supported in python, golang, rust, and javascript, in order for the full flyte experience to work. (just throwing rust in there for now, we don't have anything yet).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right, this is a pretty good and important advice, thank you.


// The format used to serialize the byte array.
// This field identifies the specific format of the serialized JSON data,
// allowing future flexibility in supporting different JSON variants.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the json part of this comment, it's no longer true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thank you.

@Future-Outlier
Copy link
Member Author

Move to #5742

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants