Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] [flyteconsole] Can't launch task with nested dataclasses as input #3110

Open
2 tasks done
alex-dr opened this issue Nov 29, 2022 · 10 comments
Open
2 tasks done
Assignees
Labels
bug Something isn't working stale ui Admin console user interface
Milestone

Comments

@alex-dr
Copy link

alex-dr commented Nov 29, 2022

Describe the bug

I'm having an issue launching tasks from the Flyte Console with nested dataclasses as an input.

My workflow itself runs fine from the CLI and the task inputs/metadata can be viewed as expected.

However, the RERUN button from a workflow task, or Launch Task from the Tasks interface seems to arbitrarily either:

  1. Show an error message in the UI saying Could not find a definition for #/definitions/LiteralsonlySchema (see reproducer below)
  2. Offer a 'Create New Execution' modal box that prompts for the foo and bar keys of the LiteralsOnly dataclass (which leads to a deserialization error at runtime if you try it, because those are not the appropriate task inputs).

Cases 1 and 2 both appear when launching a workflow or a task if the inputs include nested dataclasses.

Case 2 only appears to happen after a couple page refreshes and navigating around. Once Case 1 appears it seems to get stuck on that error mode.

Expected behavior

I expect to be able to create arbitrary levels of nested dataclasses and to be able to run tasks and workflows taking these types as inputs from the Flyte Console.

If the UI form cannot present arbitrary levels of nested objects, I am fine with pasting in a blob of JSON.

Additional context to reproduce

Simple reproducer script - the literals_only task works in the UI just fine, but the dataclasses_only task does not.

from dataclasses import dataclass
from dataclasses_json import dataclass_json
from flytekit import task, workflow

@dataclass_json
@dataclass
class LiteralsOnly:
    foo: str
    bar: int

@dataclass_json
@dataclass
class DataClassesOnly:
    first: LiteralsOnly
    second: LiteralsOnly

@task
def literals_only(x: LiteralsOnly):
    pass

@task
def dataclasses_only(y: DataClassesOnly):
    pass

@workflow
def runner() -> None:
    x = LiteralsOnly(foo="foo", bar=2)
    literals_only(x=x)
    dataclasses_only(y=DataClassesOnly(first=x, second=x))
  1. Submit workflow with pyflyte and observe successful execution
  2. Navigate to Tasks view and select the dataclasses_only task
  3. Click Launch Task - see invalid task configuration screen (Case 2)
  4. Cancel
  5. Click Launch Task and Cancel again a couple times
  6. See JSONSchema error (Case 1)

My environment includes:

flyteidl                   1.2.5
flytekit                   1.2.4
flytekitplugins-pod        1.2.4
dataclasses-json           0.5.7
fastjsonschema             2.15.3
jsonschema                 4.6.0
marshmallow-jsonschema     0.13.0
python-json-logger         2.0.4

cr.flyte.org/flyteorg/flyte-sandbox-lite   sha-bfa1dd4e6057b6fc16272579d61df7b1832b96a7

Screenshots

Case 1:
image

Case 2:
image

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@alex-dr alex-dr added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Nov 29, 2022
@welcome
Copy link

welcome bot commented Nov 29, 2022

Thank you for opening your first issue here! 🛠

@alex-dr
Copy link
Author

alex-dr commented Nov 29, 2022

If rampant speculation is appropriate - the issue could be one of:

  • the jsonschema document being 'literalized' by Flyte into definitions.structValue.fields.LiteralsonlySchema.structValue.fields... etc breaks the $ref
  • the document is not properly 'de-literalized' before passing it to JSONSchema
  • it's passing the wrong level of this document to the JSONSchema parser, so definitions is not the document root.
{
    "y": {
        "type": {
            "metadata": {
                "fields": {
                    "$ref": {
                        "stringValue": "#/definitions/DataclassesonlySchema"
                    },
                    "definitions": {
                        "structValue": {
                            "fields": {
                                "DataclassesonlySchema": {
                                    "structValue": {
                                        "fields": {
                                            "type": {
                                                "stringValue": "object"
                                            },
                                            "properties": {
                                                "structValue": {
                                                    "fields": {
                                                        "first": {
                                                            "structValue": {
                                                                "fields": {
                                                                    "type": {
                                                                        "stringValue": "object"
                                                                    },
                                                                    "field_many": {
                                                                        "boolValue": false
                                                                    },
                                                                    "$ref": {
                                                                        "stringValue": "#/definitions/LiteralsonlySchema"
                                                                    }
                                                                }
                                                            }
                                                        },
                                                        "second": {
                                                            "structValue": {
                                                                "fields": {
                                                                    "$ref": {
                                                                        "stringValue": "#/definitions/LiteralsonlySchema"
                                                                    },
                                                                    "type": {
                                                                        "stringValue": "object"
                                                                    },
                                                                    "field_many": {
                                                                        "boolValue": false
                                                                    }
                                                                }
                                                            }
                                                        }
                                                    }
                                                }
                                            },
                                            "additionalProperties": {
                                                "boolValue": false
                                            }
                                        }
                                    }
                                },
                                "LiteralsonlySchema": {
                                    "structValue": {
                                        "fields": {
                                            "properties": {
                                                "structValue": {
                                                    "fields": {
                                                        "foo": {
                                                            "structValue": {
                                                                "fields": {
                                                                    "title": {
                                                                        "stringValue": "foo"
                                                                    },
                                                                    "type": {
                                                                        "stringValue": "string"
                                                                    }
                                                                }
                                                            }
                                                        },
                                                        "bar": {
                                                            "structValue": {
                                                                "fields": {
                                                                    "type": {
                                                                        "stringValue": "integer"
                                                                    },
                                                                    "title": {
                                                                        "stringValue": "bar"
                                                                    }
                                                                }
                                                            }
                                                        }
                                                    }
                                                }
                                            },
                                            "additionalProperties": {
                                                "boolValue": false
                                            },
                                            "type": {
                                                "stringValue": "object"
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    },
                    "$schema": {
                        "stringValue": "http://json-schema.org/draft-07/schema#"
                    }
                }
            },
            "simple": 9
        },
        "description": "y"
    }
}

@alex-dr
Copy link
Author

alex-dr commented Nov 30, 2022

One interesting line is here:
https://github.com/flyteorg/flyteconsole/blob/ab19c5433c0b1c01ae7f414a7221aa32c5b46066/packages/zapp/console/src/components/Launch/LaunchForm/StructInput.tsx#L66-L78

  if (literalType?.metadata?.fields?.definitions?.structValue?.fields) {
    const keys = Object.keys(literalType?.metadata?.fields?.definitions?.structValue?.fields);

    if (keys[0]) {
      parsedJson = protobufValueToPrimitive(
        literalType.metadata.fields.definitions.structValue.fields[`${keys[0]}`],
      );

      if (parsedJson) {
        parsedJson = formatJson(parsedJson);
        jsonFormRenderable = true;
      }
    }

Specifically, the choice to load only the first definition. In my example, the LiteralsonlySchema is the second element in the definition fields, so is not loaded. It looks like this function should be re-written to load all definitions

@cosmicBboy cosmicBboy added this to the 1.4.0 milestone Feb 6, 2023
@cosmicBboy cosmicBboy added ui Admin console user interface and removed untriaged This issues has not yet been looked at by the Maintainers labels Feb 6, 2023
@cosmicBboy
Copy link
Contributor

@jsonporter to make two tickets from this:

  1. make sure the error in Case 1 is handled in some way
  2. longer term, support nested dataclasses

@alex-dr
Copy link
Author

alex-dr commented Feb 10, 2023

Thanks!

It might also be helpful to update the docs to mention that nested dataclass types aren't currently supported. That wasn't clear to me from the admonition The class should be a pure value class, which is terminology I can't find anywhere else on the Internet.

For what it's worth, for my use case I've just changed every nested dataclass I'm passing around in Flyte to to dict's, and handle conversion from dicts to dataclasses and back within the tasks:

@task 
def my_task(param: dict) -> dict:
    param = ParamClass.from_dict(param)
    # do stuff
    return result.to_dict()

@cosmicBboy
Copy link
Contributor

cool, yeah let's clean that up. Would you mind opening up a docs issue for that @alex-dr ? https://github.com/flyteorg/flyte/issues/new?assignees=&labels=documentation%2Cuntriaged&template=docs_issue.yaml&title=%5BDocs%5D+

@4nalog
Copy link
Member

4nalog commented Mar 7, 2023

Hey @cosmicBboy @alex-dr,

This is what I am seeing in terms of JSON schema representation for the task:

{
    "metadata": {
        "fields": {
            "$ref": {
                "stringValue": "#/definitions/DataclassesonlySchema"
            },
            "definitions": {
                "structValue": {
                    "fields": {
                        "LiteralsonlySchema": {
                            "structValue": {
                                "fields": {
                                    "type": {
                                        "stringValue": "object"
                                    },
                                    "additionalProperties": {
                                        "boolValue": false
                                    },
                                    "properties": {
                                        "structValue": {
                                            "fields": {
                                                "foo": {
                                                    "structValue": {
                                                        "fields": {
                                                            "type": {
                                                                "stringValue": "string"
                                                            },
                                                            "title": {
                                                                "stringValue": "foo"
                                                            }
                                                        }
                                                    }
                                                },
                                                "bar": {
                                                    "structValue": {
                                                        "fields": {
                                                            "title": {
                                                                "stringValue": "bar"
                                                            },
                                                            "type": {
                                                                "stringValue": "integer"
                                                            }
                                                        }
                                                    }
                                                }
                                            }
                                        }
                                    }
                                }
                            }
                        },
                        "DataclassesonlySchema": {
                            "structValue": {
                                "fields": {
                                    "additionalProperties": {
                                        "boolValue": false
                                    },
                                    "properties": {
                                        "structValue": {
                                            "fields": {
                                                "second": {
                                                    "structValue": {
                                                        "fields": {
                                                            "$ref": {
                                                                "stringValue": "#/definitions/LiteralsonlySchema"
                                                            },
                                                            "type": {
                                                                "stringValue": "object"
                                                            },
                                                            "field_many": {
                                                                "boolValue": false
                                                            }
                                                        }
                                                    }
                                                },
                                                "first": {
                                                    "structValue": {
                                                        "fields": {
                                                            "type": {
                                                                "stringValue": "object"
                                                            },
                                                            "field_many": {
                                                                "boolValue": false
                                                            },
                                                            "$ref": {
                                                                "stringValue": "#/definitions/LiteralsonlySchema"
                                                            }
                                                        }
                                                    }
                                                }
                                            }
                                        }
                                    },
                                    "type": {
                                        "stringValue": "object"
                                    }
                                }
                            }
                        }
                    }
                }
            },
            "$schema": {
                "stringValue": "http://json-schema.org/draft-07/schema#"
            }
        }
    },
    "simple": 9
}

I am not sure what is the best solution to this is though I thought I had an initial fix. Looking at the codebase I was able to figure out why the error was arbitrary. Taking the above JSON schema and implementation of the parsedJSON function:

const keys = Object.keys(
      literalType?.metadata?.fields?.definitions?.structValue?.fields,
    );

    if (keys[0]) {
      parsedJson = protobufValueToPrimitive(
        literalType.metadata.fields.definitions.structValue.fields[
          `${keys[0]}`
        ],
      );

I saw two oddities initially:

  1. The fact that we only handle keys[0]. In the above example, as evident, we have two keys one for DataclassesonlySchema and other for LiteralsonlySchema
  2. The arbitrary nature of error is to the fact that Object.keys is not ordered. Meaning the value at keys[0] could either be DataclassesonlySchema or LiteralsonlySchema randomly.

Had two follow up questions:

  1. Should we be generating the complete JSON structure based on the refs on the frontend. That way we would get some representation like:
DataclassesonlySchema {
    first: {
      foo:
      bar:
    }
    second: {
      foo:
      bar:
    }
}

or

  1. Is our intension to not support nested objects at the moment and suggest an alternative to show use a dict instead as mentioned above. In this case, we could detect on Object.keys.length which if greater than zero can show a helpful error message that points the user to use dict for achieving the nested behaviour

@cosmicBboy
Copy link
Contributor

Just to capture the discussion here, we're going to implement a 2-stage solution:

  1. Quick-fix: detect whether the jsonschema has more than one custom dataclass definition. If so, then give the user a textarea field. They're responsible for inputting the correct format
    • we can do some basic json format validation, but we can't check fields and types at this level. If there's an error, this will show up as a runtime error in the execution
  2. Long-term fix: (should create a new issue for this) a richer launchform UI that renders all nested fields, types, has type validation, etc @jsonporter @4nalog , will need to bring in some design work for this

@alex-dr
Copy link
Author

alex-dr commented Mar 7, 2023

The quick fix sounds great for me, and look forward to the long-term one. Thanks all!

@cosmicBboy cosmicBboy modified the milestones: 1.5.0, 1.6.0 Apr 20, 2023
Copy link

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable.
Thank you for your contribution and understanding! 🙏

@github-actions github-actions bot added the stale label Jan 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working stale ui Admin console user interface
Projects
None yet
Development

No branches or pull requests

4 participants