Fix columntable materialization on stored schema #360

Merged
merged 1 commit into from
Dec 3, 2024

@quinnj quinnj commented Dec 3, 2024

Fixes #357.

The issue is that for a stored schema, the type of the schema is `Schema{nothing, nothing}`, which usually indicates a table with many columns. Some table implementations, however, like ARFFFiles.jl, may choose to explicitly store all schemas, even for very narrow tables. We already have a generated branch that checks a specialization threshold in the known-schema case, so the fix is fairly straightforward: actually check whether the stored schema has too many columns.
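To make the distinction concrete, here is a minimal, self-contained sketch of the idea. It does not use Tables.jl: `MiniSchema`, `WIDTH_THRESHOLD`, `ncols`, `too_wide_by_type`, and `too_wide_by_count` are all hypothetical names invented for illustration, and the threshold value is an assumption rather than the library's constant. The sketch only contrasts deciding "too wide" from the schema's type with deciding it from the actual column count, which is what the description above suggests the fix does.

```julia
# Hypothetical stand-in for a schema type; this is NOT the Tables.jl definition.
# When the type parameter is `nothing`, the column names live in the `stored`
# field at runtime instead of in the type itself.
struct MiniSchema{names}
    stored::Union{Nothing, Vector{Symbol}}
end

# Names known at compile time: keep them in the type parameter.
MiniSchema(names::Tuple{Vararg{Symbol}}) = MiniSchema{names}(nothing)
# Names only known at runtime: store them in the field.
MiniSchema(names::Vector{Symbol}) = MiniSchema{nothing}(names)

# Illustrative threshold only; the value used by the library is an assumption here.
const WIDTH_THRESHOLD = 65_535

ncols(::MiniSchema{names}) where {names} = length(names)
ncols(sch::MiniSchema{nothing}) = length(sch.stored)

# Deciding from the type alone: every stored schema looks "too wide".
too_wide_by_type(sch::MiniSchema) = sch isa MiniSchema{nothing}

# Deciding from the actual number of columns.
too_wide_by_count(sch::MiniSchema) = ncols(sch) > WIDTH_THRESHOLD

# A narrow table whose implementation chose to store its schema.
narrow = MiniSchema([:a, :b])
```

With the type-based check, `narrow` is flagged as too wide even though it has two columns. The count-based check does not flag it.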

In the end, users should be aware that `Tables.columntable` isn't a perfect table implementation that is expected to work in every case. It was originally meant as just a test implementation that then turned out to be fairly convenient for REPL use. Note that generating a named tuple of columns from a stored schema can't be particularly efficient, since it necessarily has to generate the NamedTuple type at runtime.
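The runtime cost mentioned above can be illustrated with plain Base Julia. This is a sketch under stated assumptions, not the code path `Tables.columntable` takes: `build_columntable`, `colnames`, and `cols` are hypothetical names. The point is that when column names arrive as runtime values, the resulting `NamedTuple` type cannot be known from the argument types alone, so it has to be constructed when the function runs.

```julia
# Column names and columns known only at runtime (for example, read from a file header).
colnames = [:id, :score]
cols = Any[[1, 2, 3], [0.5, 0.7, 0.9]]

# Hypothetical helper: the NamedTuple type depends on the *values* in `colnames`,
# so the compiler cannot infer the return type from the argument types.
function build_columntable(colnames::Vector{Symbol}, cols::Vector)
    return NamedTuple{Tuple(colnames)}(Tuple(cols))
end

nt = build_columntable(colnames, cols)
```

Here `nt.id` and `nt.score` return the original column vectors. Callers that receive `nt` from a function like this generally pay for dynamic dispatch until the value passes through a function barrier.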

@quinnj quinnj merged commit 14edc59 into main Dec 3, 2024
6 checks passed
@quinnj quinnj deleted the jq-columntable-stored-schema branch December 3, 2024 06:00
Development

Successfully merging this pull request may close these issues.

Unexpected complaint about too-wide a table