
Clean up apply! code #25

Merged: 7 commits merged into main on Feb 24, 2021
Conversation

glennmoy (Member)

Closes #18

There is an unfortunate conflict in how the dims kwarg is implemented in eachslice vs mapslices (#18).

mapslices is more intuitive to our researchers, but it cannot be used to mutate data directly because it doesn't take a view of the underlying data (there is no such thing as a mapview; see JuliaLang/julia#29146).

This is important for having a mutating apply! method, which until now was only possible with eachslice plus some workarounds.

This PR attempts to simplify the interface by delegating to apply inside apply!, avoiding the complications of using eachslice in the first place. (The only place eachslice is still used is LinearCombination.)
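To illustrate the conflict with a minimal, hypothetical sketch (not the package's code): mapslices allocates a new output array, while eachslice yields views that can be mutated in place. Note the mismatched conventions: mapslices with dims=1 operates on the same column slices that eachslice yields with dims=2.

```julia
A = [1.0 2.0; 3.0 4.0]

# mapslices applies the function along dims=1 (each column) and
# allocates a new array; A itself is untouched.
B = mapslices(col -> col .* 2, A; dims=1)

# eachslice with dims=2 iterates the same columns, but as views,
# so broadcasting into a slice mutates A in place.
for col in eachslice(A; dims=2)
    col .*= 2
end

# After the loop, A has been mutated to match B: [2.0 4.0; 6.0 8.0]
```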

And since apply! delegates to apply, which delegates to _apply, there is also no longer any need for the _apply! method.
So now users only need to implement one method (_apply) and Transforms takes care of the rest.

With this change I think the code is much simpler to follow and more "honest" about what it uses under the hood (mapslices).
However, I haven't taken steps to reduce allocations; it's probably about 2x more expensive on that front, but we can make those improvements as well.
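The delegation described above can be sketched as follows. This is a hypothetical illustration of the pattern: only the apply! body mirrors this PR's diff; the other names and signatures are assumptions for the sake of a runnable example.

```julia
abstract type Transform end

# Transform authors implement only _apply; Transforms takes care of the rest.
function _apply end

# apply delegates to _apply, using mapslices when a dims kwarg is given.
function apply(A::AbstractArray, t::Transform; dims=:, kwargs...)
    dims === Colon() && return _apply(A, t; kwargs...)
    return mapslices(x -> _apply(x, t; kwargs...), A; dims=dims)
end

# apply! simply writes apply's result back into A, so no _apply! is needed.
function apply!(A::AbstractArray, t::Transform; dims=:, kwargs...)
    A[:] = apply(A, t; dims=dims, kwargs...)
    return A
end

# Example transform: raise each element to a power.
struct Power <: Transform
    exponent::Int
end
_apply(x, P::Power; kwargs...) = x .^ P.exponent
```

With this sketch, apply!([1.0, 2.0, 3.0], Power(3)) mutates the input to [1.0, 8.0, 27.0], and authors of a new transform write only the _apply method.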

codecov bot commented Feb 22, 2021

Codecov Report

Merging #25 (6d438f1) into main (a814357) will decrease coverage by 4.22%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##             main      #25      +/-   ##
==========================================
- Coverage   99.22%   95.00%   -4.23%     
==========================================
  Files           9        9              
  Lines         129      100      -29     
==========================================
- Hits          128       95      -33     
- Misses          1        5       +4     
Impacted Files Coverage Δ
src/periodic.jl 100.00% <ø> (ø)
src/power.jl 100.00% <ø> (ø)
src/temporal.jl 100.00% <ø> (ø)
src/utils.jl 0.00% <ø> (-88.89%) ⬇️
src/one_hot_encoding.jl 100.00% <100.00%> (ø)
src/scaling.jl 96.15% <100.00%> (-3.85%) ⬇️
src/transformers.jl 100.00% <100.00%> (ø)
... and 3 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

nicoleepp (Contributor)

Probably worth going through the transforms and deleting the unnecessary _apply! methods

glennmoy (Member Author)

> Probably worth going through the transforms and deleting the unnecessary _apply! methods

Yeah, I think I got them all 👍

bencottier (Contributor) left a comment:
I think this is a better way forward, because of the design issues it avoids.

to each row.
"""
function apply!(A::AbstractArray, t::Transform; dims=:, kwargs...)
    A[:] = apply(A, t; dims=dims, kwargs...)
bencottier (Contributor):
This was true before, but I noticed that A[:] = doesn't work for a UnitRange. I think that's to be expected (even on the user end) because UnitRange is immutable.

glennmoy (Member Author):
As in, if A = 1:3, this kind of assignment won't work?

bencottier (Contributor) commented Feb 23, 2021:

Yep, simply because:

julia> x = 1:3
1:3

julia> x isa AbstractArray
true

julia> x[:] = 4:6
ERROR: setindex! not defined for UnitRange{Int64}
Stacktrace:
 [1] error(::String, ::Type{T} where T) at ./error.jl:42
 [2] error_if_canonical_setindex(::IndexLinear, ::UnitRange{Int64}, ::Int64) at ./abstractarray.jl:1161
 [3] setindex! at ./abstractarray.jl:1152 [inlined]
 [4] macro expansion at ./multidimensional.jl:802 [inlined]
 [5] macro expansion at ./cartesian.jl:64 [inlined]
 [6] macro expansion at ./multidimensional.jl:797 [inlined]
 [7] _unsafe_setindex! at ./multidimensional.jl:789 [inlined]
 [8] _setindex! at ./multidimensional.jl:785 [inlined]
 [9] setindex!(::UnitRange{Int64}, ::UnitRange{Int64}, ::Function) at ./abstractarray.jl:1153
 [10] top-level scope at REPL[39]:1

Also for Floats, in case you're wondering:

julia> x = 1.:3.
1.0:1.0:3.0

julia> x isa AbstractArray
true

julia> x[:] = 4.:6.
ERROR: setindex! not defined for StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}
Stacktrace:
 [1] error(::String, ::Type{T} where T) at ./error.jl:42
 [2] error_if_canonical_setindex(::IndexLinear, ::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}, ::Int64) at ./abstractarray.jl:1161
 [3] setindex! at ./abstractarray.jl:1152 [inlined]
 [4] macro expansion at ./multidimensional.jl:802 [inlined]
 [5] macro expansion at ./cartesian.jl:64 [inlined]
 [6] macro expansion at ./multidimensional.jl:797 [inlined]
 [7] _unsafe_setindex! at ./multidimensional.jl:789 [inlined]
 [8] _setindex! at ./multidimensional.jl:785 [inlined]
 [9] setindex!(::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}, ::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}, ::Function) at ./abstractarray.jl:1153
 [10] top-level scope at REPL[43]:1

@@ -43,11 +43,6 @@ function _apply(x, P::Periodic{T}; kwargs...) where T <: Period
    map(xi -> _periodic(P.f, xi, P.period, P.phase_shift), x)
end

function _apply!(x::AbstractArray{T}, P::Periodic; kwargs...) where T <: Real
bencottier (Contributor):

Not sure if this is informative benchmarking, but I was curious about the effect.

This branch:

julia> using BenchmarkTools, Transforms

julia> x = collect(1.:1_000_000.);

julia> p = Periodic(sin, 7);

julia> @benchmark Transforms.apply!(x, p)
BenchmarkTools.Trial: 
  memory estimate:  22.89 MiB
  allocs estimate:  9
  --------------
  minimum time:     6.602 ms (0.00% GC)
  median time:      10.553 ms (19.06% GC)
  mean time:        10.830 ms (33.86% GC)
  maximum time:     20.315 ms (63.28% GC)
  --------------
  samples:          462
  evals/sample:     1

main branch:

julia> @benchmark Transforms.apply!(x, p)
BenchmarkTools.Trial: 
  memory estimate:  22.89 MiB
  allocs estimate:  7
  --------------
  minimum time:     6.271 ms (0.00% GC)
  median time:      10.960 ms (17.62% GC)
  mean time:        10.793 ms (35.26% GC)
  maximum time:     21.596 ms (59.10% GC)
  --------------
  samples:          464
  evals/sample:     1

Why are there more allocations? What are approaches to reducing allocations? I'm not familiar with it.

glennmoy (Member Author):

> Why are there more allocations? What are approaches to reducing allocations? I'm not familiar with it.

I'm not sure either, I gotta look into it.

One thing to note, however: when using BenchmarkTools I think you have to interpolate the function args, as this can affect the result (see https://github.com/JuliaCI/BenchmarkTools.jl#quick-start; note that this doesn't apply to @time).
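For reference, the interpolation the Quick Start describes looks like this (a generic sketch; sum and x are just placeholders, not anything from this package):

```julia
using BenchmarkTools

x = rand(1000)

# Here x is a non-constant global, so the benchmarked expression pays for
# dynamic dispatch, which can inflate time and allocation estimates.
@benchmark sum(x)

# Interpolating with $ splices the value into the benchmark expression,
# so the benchmark measures sum on a concretely-typed argument.
@benchmark sum($x)
```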

glennmoy (Member Author):

Just to note that with @views the allocation count goes back to 7:

julia> @benchmark Transforms.apply!($x, $p)
BenchmarkTools.Trial: 
  memory estimate:  22.89 MiB
  allocs estimate:  7
  --------------
  minimum time:     5.029 ms (0.00% GC)
  median time:      6.940 ms (16.92% GC)
  mean time:        6.427 ms (19.47% GC)
  maximum time:     9.906 ms (32.37% GC)
  --------------
  samples:          778
  evals/sample:     1

glennmoy (Member Author) commented Feb 23, 2021

Some basic performance checks after implementing @views:

Note that apply and apply! are now equally performant, and roughly the average of what they each were before (apply got better and apply! got worse).

using Transforms
using Random
Random.seed!(1)

X = rand(1.0:10.0, 100, 100)

transforms = (
    Periodic(sin, 7),
    Power(3),
    MeanStdScaling(X; dims=1),
    LinearCombination(ones(100)),
    OneHotEncoding(1:10),
)

for t in transforms
    @time Transforms.apply(X, t; dims=1);
    @time Transforms.apply(X, t; dims=1);

    # Not every transform implements a mutating apply!
    # (e.g. LinearCombination), so skip any that throw.
    try
        @time Transforms.apply!(X, t; dims=1);
        X = rand(1.0:10.0, 100, 100)
        @time Transforms.apply!(X, t; dims=1);
    catch
    end
end

"""
This branch:

Periodic(sin, 7)
apply
0.780758 seconds (2.84 M allocations: 146.639 MiB, 5.11% gc time)
0.000626 seconds (1.64 k allocations: 380.922 KiB)
apply!
0.027169 seconds (71.10 k allocations: 4.024 MiB)
0.000667 seconds (1.64 k allocations: 380.922 KiB)

Power(3)
apply
0.160279 seconds (347.52 k allocations: 18.862 MiB)
0.000445 seconds (1.74 k allocations: 221.500 KiB)
apply!
0.007062 seconds (5.40 k allocations: 417.305 KiB)
0.000387 seconds (1.74 k allocations: 221.500 KiB)

MeanStdScaling(X; dims=1)
apply
0.184251 seconds (419.53 k allocations: 22.622 MiB, 5.67% gc time)
0.000619 seconds (2.04 k allocations: 234.000 KiB)
apply!
0.007365 seconds (5.70 k allocations: 430.008 KiB)
0.000334 seconds (2.04 k allocations: 234.000 KiB)

LinearCombination(ones(100))
apply
0.329633 seconds (1.26 M allocations: 64.869 MiB, 3.28% gc time)
0.000814 seconds (30.61 k allocations: 812.000 KiB)

OneHotEncoding
apply
0.021401 seconds (31.42 k allocations: 1.939 MiB)
0.000913 seconds (11.98 k allocations: 968.531 KiB)
"""

"""
Main:

Periodic(sin, 7)
apply
0.675031 seconds (2.59 M allocations: 133.186 MiB, 4.47% gc time)
0.000668 seconds (1.64 k allocations: 463.734 KiB)
apply!
0.342571 seconds (1.51 M allocations: 78.363 MiB, 6.02% gc time)
0.000292 seconds (1.21 k allocations: 300.781 KiB)

Power(3)
apply
0.153669 seconds (338.18 k allocations: 18.510 MiB)
0.000476 seconds (1.54 k allocations: 291.812 KiB)
apply!
0.075057 seconds (210.79 k allocations: 10.918 MiB)
0.000199 seconds (1.61 k allocations: 155.469 KiB)

MeanStdScaling(X; dims=1)
apply
0.214341 seconds (575.51 k allocations: 31.068 MiB, 4.94% gc time)
0.000420 seconds (2.04 k allocations: 390.250 KiB)
apply!
0.093857 seconds (278.64 k allocations: 14.865 MiB)
0.000199 seconds (1.91 k allocations: 167.969 KiB)

LinearCombination(ones(100))
apply
0.308854 seconds (1.04 M allocations: 53.751 MiB, 6.56% gc time)
0.001061 seconds (30.61 k allocations: 812.000 KiB)

OneHotEncoding
apply
0.020939 seconds (31.30 k allocations: 1.934 MiB)
0.000875 seconds (11.98 k allocations: 968.531 KiB)
"""

@@ -32,7 +32,7 @@ function _apply(x, encoding::OneHotEncoding; kwargs...)

results = zeros(Int, length(x), n_categories)

for (i, value) in enumerate(x)
@views for (i, value) in enumerate(x)
bencottier (Contributor):

Why @views at this level? Is it because it should be done whenever iterating the data, and this is the only internal _apply that does so?

glennmoy (Member Author):

Not necessarily every time, but since we are indexing into results and assigning new values, I figured using @views might help here. I could be wrong though.
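For context, a generic sketch of what @views does (independent of this package's internals): it rewrites slicing expressions in the annotated code to use view, so right-hand-side slices stop allocating copies. Plain scalar assignments like results[i, j] = 1 are unaffected; the gain comes from slice expressions that would otherwise copy.

```julia
x = collect(1.0:6.0)

# Without @views, x[1:3] allocates a copy; writing to it leaves x alone.
y = x[1:3]
y .= 0.0
# x is still [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# With @views, the slice is a SubArray backed by x, so writes go through.
z = @views x[1:3]
z .= 0.0
# x is now [0.0, 0.0, 0.0, 4.0, 5.0, 6.0]
```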

bencottier (Contributor)

> Note that now apply and apply! are equally performant and are roughly the average of what they each were before (where apply got better and apply! got worse).

Hmm, true for the first timing, but for the second timing apply looks slightly worse on average. It's well within the same order of magnitude though, so it looks pretty good.

glennmoy changed the title from "POC: Clean up apply! code" to "Clean up apply! code" on Feb 23, 2021
glennmoy (Member Author)

>> Note that now apply and apply! are equally performant and are roughly the average of what they each were before (where apply got better and apply! got worse).
>
> Hmm, true for the first timing, but for the second timing apply looks slightly worse on average. It's well within the same order of magnitude though, so it looks pretty good.

Maybe I'm blind, but can you point out where apply is doing worse in this branch?

glennmoy merged commit 9249440 into main on Feb 24, 2021
glennmoy deleted the gm/apply!_cleanup branch on March 9, 2021
Linked issue: Use a consistent convention for dims
3 participants