Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor LinearCombination #47

Merged
merged 8 commits into from
Mar 11, 2021
Merged

Refactor LinearCombination #47

merged 8 commits into from
Mar 11, 2021

Conversation

glennmoy
Copy link
Member

@glennmoy glennmoy commented Mar 9, 2021

Follow up to #45

This PR aligns the convention for dims used by LinearCombination with that of the rest of the package.

It also decouples the tests to make them independent with some minor refactoring to the code (renaming x -> A) to make it more consistent with other files.

There is one other change implemented in 5220430, which might be controversial and/or out of scope but it came up in the refactoring so I'll put it here for comment.

The basic idea is that a reduction operation should reduce the dimension of the input by 1.
That is, an N-dimensional array input should be reduced to an (N-1)-dimensional output.

This really only gets interesting when the input is a vector and N=1. Intuitively, one expects the result of this to be a scalar. But Julia actually returns a 0-dimensional array, which as far as I understand are like magic scalars? As in, you can pass them to another function by reference but they also support array-arithmetic. See:

(I think this also makes this function type-stable since it always returns an Array?)

In any case, main currently returns a 1-d vector, which is definitely inconsistent given the reasoning above if you ask me.
The question is: do we trust Julia to return a 0-dim array, or do we want to override it and return a scalar explicitly?

background

Note that we want to emulate sum(A; dims=d) behaviour, that is, compute the linear combination and reduce over dimension d.

julia> M = reshape(1:9, 3, 3)
3×3 reshape(::UnitRange{Int64}, 3, 3) with eltype Int64:
 1  4  7
 2  5  8
 3  6  9

julia> sum(M; dims=1)
1×3 Array{Int64,2}:
 6  15  24

julia> sum(M; dims=2)
3×1 Array{Int64,2}:
 12
 15
 18

main currently gives the opposite result:

julia> lc = LinearCombination([1, 1, 1])
LinearCombination(Real[1, 1, 1])

julia> FeatureTransforms.apply(M, lc; dims=1)
3-element Array{Int64,1}:
 12
 15
 18

julia> FeatureTransforms.apply(M, lc; dims=2)
3-element Array{Int64,1}:
  6
 15
 24

this branch gives correct result:

julia> FeatureTransforms.apply(M, lc; dims=1)
3-element Array{Int64,1}:
  6
 15
 24

julia> FeatureTransforms.apply(M, lc; dims=2)
3-element Array{Int64,1}:
 12
 15
 18

The changes also makes this much more performant.
See previous benchmarks for setup:

# apply: this branch
0.446041 seconds (1.70 M allocations: 89.214 MiB, 6.05% gc time)
0.000210 seconds (912 allocations: 206.688 KiB)

# apply: main
0.339547 seconds (1.27 M allocations: 65.669 MiB, 8.31% gc time)
0.000913 seconds (30.61 k allocations: 812.016 KiB

@codecov
Copy link

codecov bot commented Mar 9, 2021

Codecov Report

Merging #47 (9de159f) into main (15de595) will decrease coverage by 0.28%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #47      +/-   ##
==========================================
- Coverage   94.79%   94.50%   -0.29%     
==========================================
  Files           9        9              
  Lines          96       91       -5     
==========================================
- Hits           91       86       -5     
  Misses          5        5              
Impacted Files Coverage Δ
src/linear_combination.jl 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 99efde3...9de159f. Read the comment docs.

Copy link
Contributor

@bencottier bencottier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Balance is restored - looks good. I don't mind the 0-dimensional array. I just think the default dimension should be flipped.

Should this also have a version bump?

src/linear_combination.jl Show resolved Hide resolved
src/linear_combination.jl Outdated Show resolved Hide resolved
src/linear_combination.jl Outdated Show resolved Hide resolved
@glennmoy
Copy link
Member Author

Balance is restored - looks good. I don't mind the 0-dimensional array. I just think the default dimension should be flipped.

Should this also have a version bump?

yes, I'm tempted to say patch as this is fixing an inherent inconsistency we should have solved, that includes the 0-dim change.

@bencottier
Copy link
Contributor

bencottier commented Mar 10, 2021

Balance is restored - looks good. I don't mind the 0-dimensional array. I just think the default dimension should be flipped.

Should this also have a version bump?

yes, I'm tempted to say patch as this is fixing an inherent inconsistency we should have solved, that includes the 0-dim change.

Yeah overall I think patch is justified, given the previous related PR and the main docs describing what dims means without exception. Though the old docstring for LinearCombination was consistent with the old implementation.

@glennmoy glennmoy changed the title WIP: Refactor LinearCombination to align with dims convention Refactor LinearCombination Mar 10, 2021
Project.toml Show resolved Hide resolved
@glennmoy glennmoy merged commit 3817e0f into main Mar 11, 2021
@glennmoy glennmoy deleted the gm/refactor_linear_comb branch June 21, 2021 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants