Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a consistent convention for dims #18

Closed
bencottier opened this issue Feb 12, 2021 · 20 comments · Fixed by #25
Closed

Use a consistent convention for dims #18

bencottier opened this issue Feb 12, 2021 · 20 comments · Fixed by #25
Labels
bug Something isn't working design

Comments

@bencottier
Copy link
Contributor

bencottier commented Feb 12, 2021

The apply method for AbstractArray in transformers.jl uses mapslices to apply a transform on each slice of the array, along a given dimension dims.

Meanwhile, the equivalent apply! method uses eachslice.

The problem is that mapslices and eachslice have opposite notions of dims. For example:

julia> M = [1 2 3; 4 5 6]
2×3 Array{Int64,2}:
 1  2  3
 4  5  6

julia> map(x -> println(x), eachslice(M; dims=1));
[1, 2, 3]
[4, 5, 6]

julia> mapslices(x -> println(x), M; dims=1);
[1, 4]
[2, 5]
[3, 6]

For higher dimensions, dims=3 in eachslice is equivalent to dims=[1, 2] in mapslices, for example (and note that eachslice only supports a single dimension in dims).

Note also that Statistics.mean and Statistics.std uses the same notion of dims as mapslices.

We should adopt a consistent convention for the meaning of dims, and explain this clearly in documentation.

@bencottier bencottier added the bug Something isn't working label Feb 16, 2021
@bencottier
Copy link
Contributor Author

bencottier commented Feb 17, 2021

mapslices does not seem to allow mutation.

julia> M = [1 2; 3 4];

julia> function add_one!(x)
           x[:] = x .+ 1
       end
add_one! (generic function with 1 method)

julia> mapslices(x -> add_one!(x), M; dims=1)
2×2 Array{Int64,2}:
 2  3
 4  5

julia> M
2×2 Array{Int64,2}:
 1  2
 3  4

Compare eachslice:

julia> map(x -> add_one!(x), eachslice(M; dims=2))
2-element Array{Array{Int64,1},1}:
 [2, 4]
 [3, 5]

julia> M
2×2 Array{Int64,2}:
 2  3
 4  5

@bencottier
Copy link
Contributor Author

bencottier commented Feb 17, 2021

SplitApplyCombine.jl splitdimsview could be good. It uses the same dims convention as eachslice but allows more than a single dimension. It seems to be faster than mapslices.

Discovered from this post

julia> using SplitApplyCombine

julia> f_ms(A) = mapslices(x -> sum(x), A; dims=(2, 3))
f_ms (generic function with 1 method)

julia> f_es(A) = map(x -> sum(x), eachslice(A; dims=1))
f_es (generic function with 1 method)

julia> f_sac(A) = map(x -> sum(x), splitdimsview(A, 1))
f_sac (generic function with 1 method)

julia> A = rand(150, 50, 100);

julia> isapprox(f_es(A), f_sac(A); atol=1e-15)
true

julia> isapprox(f_es(A), f_ms(A); atol=1e-12)
false

julia> isapprox(f_es(A), f_ms(A); atol=1e-11)  # quite a difference in precision...
true

julia> @btime f_es(A)
  765.021 μs (6 allocations: 1.45 KiB)

julia> @btime f_ms(A);
  1.640 ms (1988 allocations: 89.39 KiB)

julia> @btime f_sac(A);
  811.317 μs (1 allocation: 1.33 KiB)

also supports mutation:

julia> map(splitdimsview(M, 2)) do x
           add_one!(x)
       end
2-element Array{Array{Int64,1},1}:
 [2, 4]
 [3, 5]

julia> M
2×2 Array{Int64,2}:
 2  3
 4  5

EDIT: splitdimsview does not support named dims, while eachslice and mapslices do.

@bencottier
Copy link
Contributor Author

Note converting from one dims convention to another is pretty simple: setdiff(1:ndims(A), dims)

@bencottier
Copy link
Contributor Author

Overall, I'm favouring eachslice.

Upsides:

  • Faster
  • Supports named dims
  • Supports mutation

Downsides:

  • Opposite convention to Statistics.mean which we use elsewhere (but conversion should be straightforward)
  • Only supports slicing along a single dimension (but multiple dimensions seem like a rare use case)

@rofinn
Copy link
Member

rofinn commented Feb 17, 2021

Yeah, eachslice seems like the easiest path forward. I think if we want to add named dims support splitdimsview, we could probably do that in AxisKeys.jl and NamedDims.jl. That package seems pretty lightweight. That also seems like a relatively non-breaking change to introduce later since it'd just be adding support for multiple dims.

@nicoleepp
Copy link
Contributor

I did run into issues trying to use eachslice in the non mutating apply method, can't remember what they were so would be worth trying to change it again, or otherwise just converting the dims passed into it.

@bencottier
Copy link
Contributor Author

I did run into issues trying to use eachslice in the non mutating apply method, can't remember what they were so would be worth trying to change it again, or otherwise just converting the dims passed into it.

Yeah I'm getting different results with eachslice, stuff like this:

dims = 1: Test Failed at /Users/bencottier/JuliaEnvs/Transform.jl/test/temporal.jl:64
  Expression: Transforms.apply(M, hod; dims = d) == expected
   Evaluated: [[1, 9], [2, 10], [3, 11]] == [1 9; 2 10; 3 11]

@nicoleepp
Copy link
Contributor

I did run into issues trying to use eachslice in the non mutating apply method, can't remember what they were so would be worth trying to change it again, or otherwise just converting the dims passed into it.

Yeah I'm getting different results with eachslice, stuff like this:

dims = 1: Test Failed at /Users/bencottier/JuliaEnvs/Transform.jl/test/temporal.jl:64
  Expression: Transforms.apply(M, hod; dims = d) == expected
   Evaluated: [[1, 9], [2, 10], [3, 11]] == [1 9; 2 10; 3 11]

Different results are to be expected given this changes what dims things operate on. If that's the only issue shouldn't be a problem. But I do think mapslices conserves the type and shape (where possible) of what it operate on where eachslice doesn't, which is the reason we switched to mapslices in the first place

@rofinn
Copy link
Member

rofinn commented Feb 17, 2021

But I do think mapslices conserves the type and shape (where possible) of what it operate on where eachslice doesn't, which is the reason we switched to mapslices in the first place

If you're mutating the values of the slice shouldn't that preserve the dimensionality of the input array though? I haven't run into that issue with Impute.jl

@nicoleepp
Copy link
Contributor

But I do think mapslices conserves the type and shape (where possible) of what it operate on where eachslice doesn't, which is the reason we switched to mapslices in the first place

If you're mutating the values of the slice shouldn't that preserve the dimensionality of the input array though? I haven't run into that issue with Impute.jl

If you are mutating it does, if you are constructing a new type to return it doesn't but mapslices does preserve the shape when not mutating

@rofinn
Copy link
Member

rofinn commented Feb 17, 2021

preserve the shape when not mutating

Are there links to parts of the code where we can't do the mutating form? I was thinking that even for cases where the type needs to change, so the base API can't be mutating, that we'd still have enough info to pre-allocate the output? Then we'd just iterate over the slices of the inputs and outputs and mutate the pre-allocated outputs?

@nicoleepp
Copy link
Contributor

preserve the shape when not mutating

Are there links to parts of the code where we can't do the mutating form? I was thinking that even for cases where the type needs to change, so the base API can't be mutating, that we'd still have enough info to pre-allocate the output? Then we'd just iterate over the slices of the inputs and outputs and mutate the pre-allocated outputs?

https://github.com/invenia/Transforms.jl/blob/main/src/transformers.jl#L75

The output of _apply is not always 1-1, leading to issues if you expect to allocate it a previously created object

@bencottier
Copy link
Contributor Author

bencottier commented Feb 19, 2021

Polling result from asking 7 people in Research (3 ML, 1 PS, 1 DS, 2 RSE): all say the mapslices and mean convention is more intuitive/expected (note, in fairness, that the example I gave was about taking the mean, but I asked for their intuition without looking up code). A couple expressed pretty strong opinions that this was the right way: the dims that you specify are the dims that get reduced and removed (or turned into singletons) from the resulting array.

However, that raises an interesting point: consider

julia> M = NamedDimsArray{(:features, :observations)}([1 3; 5 7])
2×2 NamedDimsArray(::Array{Int64,2}, (:features, :observations)):
             observations
 features  1  3
            5  7

julia> mean(M; dims=2)
2×1 NamedDimsArray(::Array{Float64,2}, (:features, :observations)):
             observations
 features  2.0
            6.0

julia> mean(eachslice(M; dims=2))
2-element NamedDimsArray(::Array{Float64,1}, (:features,)):
 features  2.0
            6.0

julia> map(mean, eachslice(M; dims=2))
2-element Array{Float64,1}:
 3.0
 5.0

Notice that the convention using eachslice is consistent with mean if you pass it in directly. It's because mean(eachslice... takes the mean over the slices (combining them together), while the map case takes the mean of each slice, one at a time. It's like mapreduce vs. map.

This raises the question: can the dims convention legitimately differ, depending on whether or not the transform involves a reduce-style operation?

It could, but after the polling and discussion with people, I still think we should be consistent and use the mapslices/mean convention. Because you can still explain that in the way mapslices does:

Transform the given dimensions of array A using function f. f is called on each slice of A of the form A[...,:,...,:,...].
dims is an  integer vector specifying where the colons go in this expression. The results are concatenated along the
 remaining dimensions. For example, if dims is [1,2] and A is 4-dimensional, f is called on A[:,:,i,j] for all i and j.

If you accept all of that, a lingering problem is that the general apply! (from bc/scaling branch) won't be able to reduce one dimension at a time for >2D arrays, because:

  • (suppose) we follow the mapslices/mean convention for dims, so
  • If you want to reduce on one dimension, you would specify that dimension for dims, then
  • apply! would invert those dims to work for eachslice, turning the single dim into multiple, and then
  • eachslice would complain because it doesn't support multiple dims.

@rofinn
Copy link
Member

rofinn commented Feb 19, 2021

In retrospect, I'm not a fan of using eachslice, eachcol, eachrow, but then trying to match the conventions of mapslices. I feel like we should be transparent about which default operation we're performing behind the scenes rather than introducing our own hybrid behaviour. The reason I was a fan of the eachslice default in Impute.jl was that it's easier to map the behaviour of eachcol and eachrow to the corresponding default logic with tables. That being said, only filter was doing a reduction, so that use-case didn't really come up.

@nicoleepp
Copy link
Contributor

In retrospect, I'm not a fan of using eachslice, eachcol, eachrow, but then trying to match the conventions of mapslices. I feel like we should be transparent about which default operation we're performing behind the scenes rather than introducing our own hybrid behaviour.

Can you elaborate more on why you aren't a fan anymore?

@rofinn
Copy link
Member

rofinn commented Feb 19, 2021

Just that if our default behaviour is basically just running eachslice or mapslices behind the scenes then we should say so and have the dims behaviour be consistent with that. If our dims behaviour matches mapslices, but we're still passing SubArray to the specific implemented _apply! methods then we're just creating more confusion than is necessary. If folks want to overload our default behaviour it should be pretty clear what the default is based on knowledge of Julia more broadly.

NOTE: This might be a relevant discussion on the differences. JuliaLang/julia#29146

@glennmoy
Copy link
Member

glennmoy commented Feb 22, 2021

Polling result from asking 7 people in Research (3 ML, 1 PS, 1 DS, 2 RSE): all say the mapslices and mean convention is more intuitive/expected...A couple expressed pretty strong opinions that this was the right way: the dims that you specify are the dims that get reduced and removed (or turned into singletons) from the resulting array.

I think @bencottier's survey provides the strongest reason(s) for favouring mapslices.

It would also be interesting to check our currently implemented Transforms (which I will do later) for examples where dims is used in a context where reduction is not taking place and/or eachslice (when used with dims) is more intuitive to use.

I suspect there won't be many examples...

@glennmoy
Copy link
Member

EDIT: splitdimsview does not support named dims, while eachslice and mapslices do.

What happens? Can you open an issue in NamedDims.j and reference it here for later?

@bencottier
Copy link
Contributor Author

EDIT: splitdimsview does not support named dims, while eachslice and mapslices do.

What happens? Can you open an issue in NamedDims.j and reference it here for later?

invenia/NamedDims.jl#162

SplitApplyCombine.jl restricts dims to Int here.

julia> using SplitApplyCombine, NamedDims

julia> M = NamedDimsArray{(:features, :observations)}([1 3; 5 7])

julia> map(x -> sum(x), splitdimsview(M, :features))
ERROR: MethodError: no method matching splitdimsview(::NamedDimsArray{(:features, :observations),Int64,2,Array{Int64,2}}, ::Symbol)
Closest candidates are:
  splitdimsview(::AbstractArray) at /Users/bencottier/.julia/packages/SplitApplyCombine/RS5bI/src/splitdims.jl:127
  splitdimsview(::AbstractArray, ::Int64) at /Users/bencottier/.julia/packages/SplitApplyCombine/RS5bI/src/splitdims.jl:129
  splitdimsview(::AbstractArray{var"#s46",N} where var"#s46", ::Tuple{Vararg{Int64,M}}) where {N, M} at /Users/bencottier/.julia/packages/SplitApplyCombine/RS5bI/src/splitdims.jl:130
Stacktrace:
 [1] top-level scope at REPL[4]:1

julia> map(x -> sum(x), splitdimsview(M, 1))
2-element Array{Int64,1}:
  4
 12

julia> map(x -> sum(x), eachslice(M; dims=:features))
2-element Array{Int64,1}:
  4
 12

@nickrobinson251
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working design
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants