Fix columntable materialization on stored schema #360

quinnj · 2024-12-03T05:54:12Z

Fixes #357.

The issue is that for a stored schema, the schema's type is `Schema{nothing, nothing}`, which usually indicates a table with many columns. Some table implementations, however, such as ARFFFiles.jl, may choose to explicitly store _all_ schemas, even for very narrow tables. We already have a generated branch that checks a specialization threshold in the known-schema case, so the fix is fairly straightforward: check at runtime whether the stored schema's number of columns actually exceeds the limit.
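A minimal sketch of the idea, in plain Julia and without Tables.jl itself. `StoredSchema`, `WIDTH_LIMIT`, and `materialize` are hypothetical names for illustration only, and the limit value is an assumption, not the threshold Tables.jl uses:

```julia
# Hypothetical stand-in for a stored schema: names and types live in fields,
# not in type parameters, so the column count is only known at runtime.
struct StoredSchema
    names::Vector{Symbol}
    types::Vector{Type}
end

# Illustrative limit only; the actual threshold in Tables.jl may differ.
const WIDTH_LIMIT = 100

function materialize(sch::StoredSchema, cols::Vector)
    n = length(sch.names)
    n == length(cols) || throw(ArgumentError("expected $n columns, got $(length(cols))"))
    # Check the actual width at runtime instead of rejecting every stored schema.
    n > WIDTH_LIMIT && throw(ArgumentError("input table too wide ($n columns)"))
    # Narrow enough: build the named tuple of columns from the runtime names.
    return NamedTuple{Tuple(sch.names)}(Tuple(cols))
end

narrow = StoredSchema([:a, :b], [Int, Float64])
nt = materialize(narrow, Any[[1, 2], [3.0, 4.0]])
```

With this shape, a narrow table that happens to carry a stored schema materializes normally, and only a table that is actually too wide raises the error.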

In the end, users should be aware that `Tables.columntable` isn't a perfect table implementation that is always expected to work. It was originally meant as just a test implementation that then turned out to be fairly convenient for REPL use. Note also that generating a named tuple of columns from a stored schema cannot be particularly efficient, since the `NamedTuple` type necessarily has to be constructed at runtime.
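To make the efficiency point concrete, here is a small sketch in plain Julia. `static_table` and `dynamic_table` are hypothetical helpers, not part of Tables.jl. When the column names are only available as runtime values, the compiler cannot infer a concrete `NamedTuple` type for the result:

```julia
# Names fixed in the source: the NamedTuple type is known to the compiler.
static_table(a, b) = (a = a, b = b)

# Names only available as runtime values (as with a stored schema):
# the NamedTuple type has to be constructed on each call.
dynamic_table(names::Vector{Symbol}, cols::Vector) =
    NamedTuple{Tuple(names)}(Tuple(cols))

s = static_table([1, 2], [3.0, 4.0])
d = dynamic_table([:a, :b], Any[[1, 2], [3.0, 4.0]])

# Both calls produce the same value, but only the first is inferred concretely.
static_inferred = only(Base.return_types(static_table, (Vector{Int}, Vector{Float64})))
dynamic_inferred = only(Base.return_types(dynamic_table, (Vector{Symbol}, Vector{Any})))

println(isconcretetype(static_inferred))
println(isconcretetype(dynamic_inferred))
```

Code that consumes the dynamically built result generally needs a function barrier to recover type-stable performance.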


quinnj mentioned this pull request Dec 3, 2024

Unexpected complaint about too-wide a table #357

Closed

quinnj merged commit 14edc59 into main Dec 3, 2024
6 checks passed

quinnj deleted the jq-columntable-stored-schema branch December 3, 2024 06:00
