Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix(schema): don't specialize type variable in MappingSchema #2394

Merged
merged 3 commits into from
Oct 10, 2023

Conversation

georgesittas
Copy link
Collaborator

In SQLMesh we use the MappingSchema to store DataTypes and it works just fine. Not sure why we restricted the type hint to just str for the types to begin with.

@georgesittas
Copy link
Collaborator Author

Hmm I guess this is breaking because mypy may yell if a current user of MappingSchema calls a string method on a type, respecting the previous type hints, since the string method won't exist on the DataType.

class MappingSchema(AbstractMappingSchema[t.Dict[str, str]], Schema):
class MappingSchema(AbstractMappingSchema, Schema):
Copy link
Collaborator Author

@georgesittas georgesittas Oct 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @barakalon can you remind me why this was t.Dict[str, str] to begin with? It seems like it was first introduced in f50b2918b but I don't remember the context.

We clearly allow both str and DataType (e.g. we have code that considers these two cases for the columns' types, see get_column_type), so an alternative I considered was updating this type to be t.Dict[str, str | DataType], but this would then be breaking and the user would have to explicitly cast the result of AbstractMappingSchema.find, which is annoying.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably no good reason. Schema pre-exists a lot of maturation of DataType.

removing the type parameter implicitly makes this AbstractMappingSchema[Any]. Is that what we want?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fully sure; my reasoning was that being more strict by specifying the type as a union of str and DataType would be more annoying for users because they'd have to account for both cases. I'm also not sure if allowing str column types instead of converting them to DataTypes is helpful?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only place that type parameter is used: https://github.com/tobymao/sqlglot/blob/main/sqlglot/schema.py#L165

and it seems like that’s meant to be a protected method. So maybe just remove the generic altogether? Might not be worth it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to remove it altogether too, I don't see much value in it if we don't specialize the type anyway.

it seems like that’s meant to be a protected method

Not sure about this one though, e.g. we're using find in SQLMesh as a way to get the columns-to-types mapping of a given table.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And do you coerce from string|DataType to DataType?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we store the column types as DataType values initially so it just returns them later on

@georgesittas georgesittas changed the title Fix(schema): allow MappingSchema types to be DataTypes Fix(schema): don't specialize type variable in MappingSchema Oct 10, 2023
@georgesittas georgesittas merged commit 8afa7a1 into main Oct 10, 2023
@georgesittas georgesittas deleted the jo/mapping_schema_type_hint_fix branch October 10, 2023 19:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants