Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Equivalent of mapslices with views? #29146

Closed
piever opened this issue Sep 12, 2018 · 16 comments
Closed

Equivalent of mapslices with views? #29146

piever opened this issue Sep 12, 2018 · 16 comments
Labels
arrays [a, r, r, a, y, s]

Comments

@piever
Copy link
Contributor

piever commented Sep 12, 2018

After some discussion on discourse it was proposed to "feature request" a version of mapslices in Julia Base that does not allocate slices. Naively I would think the behavior and syntax should be identical to mapslices except that it iterates views rather than slices of the original array (mapviews is a tentative name). I think it'd be useful for all cases where one wants to apply mapslices with a function with no side effects (or that doesn't mutate the original array) and avoid the extra allocations. A typical use case is for example summing with some filtering condition along some dimension:

mapviews(v -> sum(Iterators.filter(!isnan, v)), M, dims=2)

for which I don't think there's an easy efficient solution in Base Julia right now.

It was pointed out that similar ideas have been implemented in the package JuliennedArrays and there was some discussion of moving some of that to Base, see this comment.

@mbauman
Copy link
Member

mbauman commented Sep 12, 2018

I'd still be a major fan of bringing JuliennedArrays into Base. The major blocker for me has been that it requires some oblique syntax to get type stability for julienne(array, (*, :)). If we could get it type stable with constant propagation — ideally constant propagation through keywords — then I'd say we spell this as map(f, slices(M, dims=2)). Or maybe eachslice.

@mbauman mbauman added the arrays [a, r, r, a, y, s] label Sep 12, 2018
@piever
Copy link
Contributor Author

piever commented Sep 12, 2018

I'm also not entirely satisfied about the JuliennedArrays syntax. I'm not expert on the technical side, but doesn't sum have the same issue? Meaning that sum(v, dims=d) output type depends on the keyword argument dims, so maybe the solution that was found for sum would also work for JuliennedArrays?

@mbauman
Copy link
Member

mbauman commented Sep 12, 2018

No, sum doesn't drop dimensions so while the algorithm performs very differently depending on dims it doesn't actually change the resulting dimensionality.

@piever
Copy link
Contributor Author

piever commented Sep 12, 2018

I see, sum only changes the dimensionality between sum(v) (returning a scalar) and sum(v, dims=d) (returning an AbstractArray with the dimensionality of v) so it's a less complex situation.

Do you think this needs to drop dimensions though? I don't have a strong preference either way but mapslices keeps a dimension of length 1, so it's a bit strange to have this inconsistency, Maybe there could be a keyword squeeze to decide what to do with those dimensions (if, of course, the associated technical difficulties don't make this unfeasible)?

@mbauman
Copy link
Member

mbauman commented Sep 12, 2018

The key is that exposing a fast first class slices (or julienne) iterator requires type-stable access to the elements (each slice), and those elements will have a different dimensionality (or SubArray type even if they don't drop dimensions).

@bramtayl
Copy link
Contributor

I don’t think it would be impossible to get ‘dims = 2’ to constant propagate. But I’ve somewhat given up on constant propagation; getting things into the type domain as soon as possible makes things SO much easier. If someone wants to give it a stab I’d welcome a PR.

@rapus95
Copy link
Contributor

rapus95 commented Jan 4, 2019

Whats the progress for this issue?

I wrote a very basic mutating mapviews!(...) for myself using views and would be curious how to improve its performance and what you guys think about it. Maybe it helps someone else aswell.

"""
    mapviews!(f, from, dest, dims)

Transform the given dimensions of array `from` and store to array `dest` using function `f`. `f` is called on each slice
of `from` resp. `dest` of the form `from[...,:,...,:,...]` resp. `dest[...,:,...,:,...]`. `dims` is an integer vector specifying where the
colons go in these expressions.
For example, if `dims` is `[1,2]`, `from` is 4-dimensional and `dest` is 4-dimensional aswell, `f` is called on `view(from,:,:,i,j)` and `view(dest,:,:,i,j)`
for all `i` and `j`.
"""
function mapviews!(f, from, dest, dims)
    allDims = 1:ndims(from)
    iterDims = setdiff(allDims, dims)
    for x in CartesianIndices(axes(from)[iterDims])
        x = [Tuple(x)...]
        sliceIdx = Tuple(if n in dims; (:) else popfirst!(x) end for n in allDims)
        f(view(from, sliceIdx...), view(dest, sliceIdx...))
    end
end

@mbauman
Copy link
Member

mbauman commented Jan 4, 2019

We now have #29749, so this can be written as map(f, eachslice(A; dims=...)). I think that's sufficient to close this, no?

@piever
Copy link
Contributor Author

piever commented Jan 4, 2019

Nice job: eachslice is extremely useful!

I think it almost closes this issue in that mapslices can take multiple dimensions (say mapslices(f, a, dims = (1,2))) whereas eachslice(a, dims = (1, 2)) errors. Still, feel free to close this and track the multi dimensional implementation separately.

@rapus95
Copy link
Contributor

rapus95 commented Jan 4, 2019

@mbauman does that also work for map!? Similar to the ability my function had. In order to prevent fragmentation/the need to copy around the arrays. Btw will it also cat the results according to the dimensions? (in case the function returns an array itself)

@mbauman
Copy link
Member

mbauman commented Jan 4, 2019

I mean, it does generate views, so yes, you with a mutating function f! you can do something like:

foreach(f!, eachslice(dest, dims=d), eachslice(from, dims=d))

You can also use map!, but that does something different — it stores the output as single elements in an array (and not slices thereof). And map will also allocate an array to store the results, so that's why I used foreach instead.

And of course neither map nor map! will do the same sort of fancy concatenation that mapslices does.

@dylanfesta
Copy link

dylanfesta commented Oct 22, 2019

Hi, I am not sure if this is the right place to post it, but I noticed an inconsistency in the meaning of the dims argument in the functions mapslices and eachslice.

Let's say I want to sum rows of an array...

julia> mat =[ 1 2;3 4;5 6]
3×2 Array{Int64,2}:
 1  2
 3  4
 5  6

julia> map(sum , eachslice(mat ; dims=1))
3-element Array{Int64,1}:
  3
  7
 11

However

julia> dropdims(mapslices(sum, mat ; dims=1); dims = 1 )
2-element Array{Int64,1}:
  9
 12

One should use the other dimension for mapslices

julia> dropdims(mapslices(sum, mat ; dims=2); dims = 2 )
3-element Array{Int64,1}:
  3
  7
 11

The documentation explains it: in mapslices dims refers to the position of the : in the array indexes, in eachslice it refers to the position that does not have the :. So they cannot be simply swapped in the code without accounting for this.

@hhaensel
Copy link

hhaensel commented Feb 5, 2020

I think @dylanfesta points to a very important difference between eachslice and mapslices. The syntax of the two is complementary; eachslice iterates along the given dimension and constructs views on all elements in the remaining dimensions; mapslices iterates over the remaining dimensions and has f act on the array of all elements of the given dimensions.
When handling 2-D arrays, one can easily use eachslices to produce the result, that @piever initially asked for. For higher-dimensional arrays this is not true.
So I think, the request for a non-allocating version of mapslices is still reasonable... ?
There is a factor of two for JuliennedArrays and mapslices:

A = rand(Float32, 1000, 1000, 10)
@btime map(mean, Slices(A, 2));     #  18.793 ms (10019 allocations: 664.92 KiB)
@btime mapslices(mean, A, dims=2);  #  37.548 ms (129599 allocations: 3.09 MiB)

see also this discussion.

EDIT:
It might be sufficient to improve mapslices to act on views?

@LilithHafner
Copy link
Member

eachslice now does this.

julia> M = rand(4,4,4);

julia> map(v -> sum(Iterators.filter(!isnan, v)), eachslice(M, dims=2))
4-element Vector{Float64}:
 7.087633515941685
 7.122552992022348
 8.84984800208923
 5.634331667457324

julia> map(v -> sum(Iterators.filter(!isnan, v)), eachslice(M, dims=(2,3)))
4×4 Matrix{Float64}:
 1.80646  1.36365  1.73135  2.18617
 1.86439  2.12595  2.42422  0.707997
 2.54281  2.76217  2.01137  1.53349
 1.42096  1.09299  0.91254  2.20785

julia> eachslice(M, dims=(2,3))[4,1]
4-element view(::Array{Float64, 3}, :, 4, 1) with eltype Float64:
 0.14685562032478716
 0.3136273513020742
 0.39280441511008457
 0.5676719922144797

@jacobusmmsmit
Copy link
Contributor

Apologies for commenting on a completed issue but, @LilithHafner, your example second example doesn't work for me:

julia> M = rand(4,4,4);

julia> map(v -> sum(Iterators.filter(!isnan, v)), eachslice(M, dims=(2,3)))
ERROR: ArgumentError: only single dimensions are supported
Stacktrace:
 [1] #eachslice#244
   @ ./abstractarraymath.jl:622 [inlined]
 [2] top-level scope

Any reason why there appears to be a regression?

@mcabbott
Copy link
Contributor

mcabbott commented Nov 2, 2022

This should work on nightly; the new eachslice which accepts dims::Tuple will be in 1.9.

Mapslices is also faster on 1.9, after #40996... although some cases that inferred when that was merged seem not to now?

julia> A = rand(Float32, 1000, 1000, 10);

julia> using BenchmarkTools, Statistics

julia> @btime map(mean, eachslice($A, dims=(1,3), drop=false)) |> size
  15.409 ms (9 allocations: 39.47 KiB)
(1000, 1, 10)

julia> @btime mapslices(mean, $A, dims=2) |> size
  13.128 ms (14915 allocations: 276.56 KiB)
(1000, 1, 10)

  19.864 ms (129595 allocations: 3.09 MiB)  # on Julia 1.8

julia> @btime mean($A; dims=2) |> size
  1.292 ms (14 allocations: 39.77 KiB)
(1000, 1, 10)

julia> VERSION
v"1.9.0-DEV.1694"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s]
Projects
None yet
Development

No branches or pull requests

9 participants