* Add is_transformable
* Add to documentation API
* Update docs
* Fix docs job
* Rename transforms.jl to transform.jl
Showing 10 changed files with 80 additions and 34 deletions.
@@ -0,0 +1,11 @@
# [Transform Interface](@id transform-interface)

The idea behind a "transform interface" is to make feature transformations composable, i.e. the output of one `Transform` should be valid input to another.

Feature engineering pipelines, which comprise a sequence of multiple `Transform`s and other steps, should obey the same principle: one should be able to add or remove subsequent `Transform`s without breaking the pipeline.
So the output of an end-to-end transform pipeline should itself be "transformable".

We enforce this in Transforms.jl by only supporting certain input types, i.e. AbstractArrays and Tables, which produce other AbstractArrays and Tables.
We also specify this in the `transform` function API, which is expected to be overloaded for implementing pipelines (the exact method is an implementation detail for the user).
The only requirement is that the return value of the implemented `transform` is itself "transformable", i.e. an AbstractArray or Table.
This can be checked by calling `is_transformable` on the output.
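As a rough illustration of the interface described above, here is a minimal sketch of overloading `transform` for a pipeline type and checking its output. The `LogThenScale` type and its `factor` field are hypothetical and exist only for this example; `is_transformable` and `transform` are redefined locally, mirroring the definitions added in this commit, so that the snippet runs with only Tables.jl installed.

```julia
using Tables  # provides Tables.istable

# Local copies mirroring the definitions added in this commit
is_transformable(::AbstractArray) = true
is_transformable(x) = Tables.istable(x)

function transform end

# Hypothetical pipeline type, for illustration only (not part of the package)
struct LogThenScale
    factor::Float64
end

# The pipeline takes an array and returns an array, so its output can be
# passed on to further transform steps
function transform(p::LogThenScale, data::AbstractArray)
    logged = log.(data)        # first step: elementwise log
    return p.factor .* logged  # second step: elementwise scaling
end

out = transform(LogThenScale(2.0), [1.0, 10.0, 100.0])

# The only requirement of the interface: the result is itself transformable
@assert is_transformable(out)
```

Because the check depends only on the type of the output, steps can be added to or removed from the body of `transform` without affecting callers, as long as the final value is still an array or a table.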
@@ -0,0 +1,45 @@

"""
    Transform

Abstract supertype for all feature Transforms.
"""
abstract type Transform end

# Make Transforms callable types
(t::Transform)(x; kwargs...) = apply(x, t; kwargs...)

"""
    is_transformable(x)

Determine if `x` is both a valid input and output of any [`Transform`](@ref), i.e. that it
follows the [`transform`](@ref) interface.
Currently, all `AbstractArray`s and all tables (anything for which `Tables.istable` returns
`true`) are transformable.
"""
is_transformable(::AbstractArray) = true
is_transformable(x) = Tables.istable(x)

"""
    transform(::T, data)

Defines the feature engineering pipeline for some type `T`, which comprises a collection of
[`Transform`](@ref)s and other steps to be performed on the `data`.
The idea behind a "transform interface" is to make feature transformations composable, i.e.
the output of any one `Transform` should be valid input to another.
Feature engineering pipelines should obey the same principle and it should be trivial to
add/remove `Transform` steps that compose the pipeline without it breaking.
`transform` should be overloaded for custom types `T` that require feature engineering.
The only requirement is that the return value of `transform` is itself "transformable", i.e.
calling [`is_transformable`](@ref) on the output returns true.
"""
function transform end

"""
    transform!(::T, data)

Mutating version of [`transform`](@ref).
"""
function transform! end
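The callable-type method in the hunk above forwards `t(x)` to `apply(x, t)`. The following sketch shows how that dispatch could play out for a concrete subtype. The `Shift` type and the `apply` method defined here are assumptions made for illustration; the diff shown does not include the package's own `apply` definitions.

```julia
# Local copy of the definitions from the hunk above
abstract type Transform end
(t::Transform)(x; kwargs...) = apply(x, t; kwargs...)

# Hypothetical Transform subtype, for illustration only
struct Shift <: Transform
    offset::Float64
end

# Hypothetical `apply` method: adds the offset to every element
apply(x::AbstractArray, t::Shift; kwargs...) = x .+ t.offset

shift_one = Shift(1.0)
shift_two = Shift(2.0)

# Calling the object dispatches to `apply(x, t)`
once = shift_one([1.0, 2.0, 3.0])

# Array in, array out: the result of one Transform feeds directly into another
twice = shift_two(once)
@assert twice == [4.0, 5.0, 6.0]
```

Making transforms callable keeps chaining lightweight: any array-returning transform can be applied directly to the result of the previous one.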
@@ -0,0 +1,16 @@
@testset "is_transformable" begin

    # Test that AbstractArrays and Tables are transformable
    @test is_transformable([1, 2, 3, 4, 5])
    @test is_transformable([1 2 3; 4 5 6])
    @test is_transformable(AxisArray([1 2 3; 4 5 6], foo=["a", "b"], bar=["x", "y", "z"]))
    @test is_transformable(KeyedArray([1 2 3; 4 5 6], foo=["a", "b"], bar=["x", "y", "z"]))
    @test is_transformable((a = [1, 2, 3], b = [4, 5, 6]))
    @test is_transformable(DataFrame(:a => [1, 2, 3], :b => [4, 5, 6]))

    # Test types that are not transformable
    @test is_transformable(1) == false
    @test is_transformable("string") == false
    @test is_transformable(true) == false
    @test is_transformable(Dict(2 => 3)) == false
end
ab6ab74
@JuliaRegistrator register()
Registration pull request created: JuliaRegistries/General/32196
After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.
This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the GitHub interface, or via: