One hot encoding transform #7

Closed
glennmoy opened this issue Feb 1, 2021 · 1 comment · Fixed by #19 or #31

glennmoy commented Feb 1, 2021

A one-hot-encoding transform for categorical variables

MWE

x = ["foo", "bar", "baz"]

ohe = OneHotEncoding()

Transform.apply(x, ohe)

# output
3×3 Array{Int64,2}:
 1  0  0
 0  1  0
 0  0  1
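
For reference, the encoding the MWE asks for can be sketched in plain Julia. This is only an illustration, not the package's implementation: `one_hot_sketch` is a hypothetical helper, and it assumes categories are taken in order of first appearance, which is what the identity-matrix output above implies.

```julia
# Hypothetical helper, not part of any package API.
# Builds one row per element of `x` and one column per distinct category,
# with categories ordered by first appearance (an assumption).
function one_hot_sketch(x::AbstractVector)
    categories = unique(x)
    # Two-iterator comprehension yields a length(x) × length(categories) matrix.
    return [Int(xi == c) for xi in x, c in categories]
end

x = ["foo", "bar", "baz"]
one_hot_sketch(x)  # 3×3 Matrix{Int64}, the identity matrix for this input
```

With a repeated value, e.g. `["a", "b", "a"]`, the first and third rows would be identical and there would be only two columns.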

glennmoy added the new transform label Feb 1, 2021
nicoleepp self-assigned this Feb 12, 2021

rofinn commented Feb 22, 2021

FWIW, if we're gonna change the type maybe we should use Bool for space efficiency?

julia> sizeof(true)
1

julia> sizeof(1)
8

That being said, maybe we could parameterize the type such that you state what you want returned? That way, if I want to construct a pipeline where I know I'm gonna be merging the output from this type into a flattened array of floats, then I'll construct it accordingly.
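
A rough sketch of what that parameterization could look like, purely as an illustration. The names `OneHotSketch` and `encode`, the `categories` field, and the `Bool` default are all assumptions made for this example, not the API that was eventually merged.

```julia
# Hypothetical type: the parameter R is the element type of the output.
struct OneHotSketch{R<:Real}
    categories::Vector{String}
end

# Assumed default: Bool, for the space savings mentioned above.
OneHotSketch(categories) = OneHotSketch{Bool}(categories)

# Hypothetical apply step: rows follow `x`, columns follow `ohe.categories`.
# The typed comprehension converts each Bool comparison result to R.
function encode(ohe::OneHotSketch{R}, x::AbstractVector) where {R}
    return R[xi == c for xi in x, c in ohe.categories]
end

x = ["foo", "bar", "baz"]
as_bool  = encode(OneHotSketch(unique(x)), x)           # Matrix{Bool}
as_float = encode(OneHotSketch{Float64}(unique(x)), x)  # Matrix{Float64}

sizeof(as_bool)   # 9 bytes: 9 elements × 1 byte
sizeof(as_float)  # 72 bytes: 9 elements × 8 bytes
```

The caller picks the element type once at construction, so a pipeline that ends in a float array can ask for `Float64` up front and skip a later conversion.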
