The current OneHotEncoding transform seems to always return an NxP binary matrix, even though it knows what the categories are and may be passed a type that would allow it to retain that info.
For example, if I have OneHotEncoding(Hour(0):Hour(1):Hour(23)) and then I pass a dataframe with an HoD column, I could easily see that transform returning a new dataframe with a column name for each hour. Similarly, if we took a function argument, we could do something like:
FeatureTransforms.ohe(Hour, data; dims=:time)
This would return a new KeyedArray or AxisArray that retains the category information for the p dimension. It would also treat the dims argument as a dimension key (as in dims=:time) for KeyedArrays or AxisArrays.
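To make the idea of retaining category information concrete, here is a minimal sketch using only the Dates standard library. The helper `labelled_onehot` and its return shape are hypothetical, not part of FeatureTransforms.jl; a real implementation would presumably wrap the result in a DataFrame, KeyedArray or AxisArray instead of a NamedTuple.

```julia
using Dates

# Hypothetical helper (an assumption, not the FeatureTransforms.jl API):
# one-hot encode `xs` against known `categories` and return the binary
# matrix together with the category label for each column.
function labelled_onehot(categories, xs)
    cats = collect(categories)
    # N×P BitMatrix: entry (i, j) is true when xs[i] equals cats[j]
    matrix = xs .== permutedims(cats)
    return (matrix=matrix, labels=cats)
end

hours = Hour(0):Hour(1):Hour(23)
hod = [Hour(0), Hour(13), Hour(23)]
result = labelled_onehot(hours, hod)
```

Here `result.labels[j]` names column `j` of `result.matrix`, which is the information a plain NxP matrix loses.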
Can you explain what you had in mind for the function argument? I'm not sure I see how it translates to keeping the category information? Is it that the function determines what the category labels are?
Is it that the function determines what the category labels are?
Yes, so if I have an axis with datetimes I might use hour to lazily generate the categories. For AxisArrays and KeyedArrays we could then return n (:time/DateTime) x p (:category/Hour).
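A rough sketch of the function-argument variant, again with only the Dates standard library. The name `onehot_by` is hypothetical, and deriving the categories from the values actually observed is an assumption about what "lazily generate" could mean here. The row and column keys are returned next to the matrix to stand in for the :time and :category axes of a KeyedArray or AxisArray.

```julia
using Dates

# Hypothetical helper (an assumption, not the FeatureTransforms.jl API):
# apply `f` to every axis key, use the distinct results as categories,
# and return an n×p matrix plus the keys for both dimensions.
function onehot_by(f, axis_keys)
    derived = map(f, axis_keys)
    # Categories come from the observed values, sorted for a stable order
    cats = sort(unique(derived))
    matrix = derived .== permutedims(cats)
    return (matrix=matrix, rows=collect(axis_keys), cols=cats)
end

# Five timestamps, six hours apart, spanning one day
times = DateTime(2021, 1, 1, 0):Hour(6):DateTime(2021, 1, 2, 0)
encoded = onehot_by(Hour, times)
```

With this layout, `encoded.rows` plays the role of the n (:time/DateTime) keys and `encoded.cols` the p (:category/Hour) keys.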