tuple/table ambiguity #182

Open
ablaom opened this issue Jan 26, 2022 · 3 comments
Comments


ablaom commented Jan 26, 2022

The scitype of a tuple is intended to be the Tuple of the element scitypes. For example:

julia> scitype((1.0, 4))
Tuple{Continuous, Count}

By this logic, if I create a 1-tuple with a table t as its single element, then this tuple should have scitype Tuple{scitype(t)}. But this isn't always the case:

t = (x=[1, 2], y=["a", "b"])

julia> scitype(t)
Table{Union{AbstractVector{Count}, AbstractVector{Textual}}}

julia> scitype((t,))
Table{Union{AbstractVector{AbstractVector{Count}}, AbstractVector{AbstractVector{Textual}}}}

The problem is that (t,) is itself also a table (with one row), so it gets the scitype of a table rather than of a tuple:

julia> schema((t,))
┌───────┬─────────────────────────┬────────────────┐
│ names │ scitypes                │ types          │
├───────┼─────────────────────────┼────────────────┤
│ x     │ AbstractVector{Count}   │ Vector{Int64}  │
│ y     │ AbstractVector{Textual} │ Vector{String} │
└───────┴─────────────────────────┴────────────────┘

This is pretty awful 😢. For example, it makes it tricky in MLJBase to use the fit_data_scitype of a model to check that the model is compatible with the data, as in JuliaAI/MLJBase.jl#731. That is, the test scitype(data) <: fit_data_scitype(model), where data is the tuple of data arguments, is not reliable.
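One possible way to sidestep the ambiguity in a check like the one above is to build the tuple scitype element by element, so that the tuple of data arguments is never itself inspected as a table. This is only a sketch: `tuple_scitype` is a hypothetical helper name, not part of ScientificTypes or MLJBase, and it is not necessarily how the linked PR handles the problem.

```julia
using ScientificTypes

# Hypothetical helper (an assumption, not a ScientificTypes API): compute the
# scitype of each element separately and wrap the results in a Tuple type,
# so `data` itself is never passed to `scitype` and cannot be mistaken
# for a table.
tuple_scitype(data::Tuple) = Tuple{map(scitype, data)...}

t = (x=[1, 2], y=["a", "b"])

# The 1-tuple now gets the intended tuple scitype.
tuple_scitype((t,)) == Tuple{scitype(t)}

# Ordinary tuples behave as in the first example of this issue.
tuple_scitype((1.0, 4)) == Tuple{Continuous, Count}
```

With such a helper, the compatibility test would presumably read tuple_scitype(data) <: fit_data_scitype(model) instead of calling scitype on the tuple directly.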


ablaom commented Jan 26, 2022

cc @pazzo83


pazzo83 commented Jan 26, 2022

Ah, so this is why my tests were failing?


ablaom commented Jan 26, 2022

No, I now think that the MLJBase PR is (by accident?) actually avoiding this issue. See JuliaAI/MLJBase.jl#731 (comment).

Still, this issue could turn up unexpectedly elsewhere.
