performance of creating an array from a range #4784
Thanks for filing the issue and trying out all these different variations!

---
Can you post the actual output of your test calls? As I mentioned on the mailing list, for me, (At one point, I thought that Python was a lot faster, but that was because I had put the arguments in the wrong order.)

---
I get this on my laptop:

```julia
julia> [ f(0,10000,0.1,1000) for f=[func1,func2,func3] ]
3-element Array{Any,1}:
 0.689111
 0.671948
 0.309497

julia> versioninfo()
Julia Version 0.2.0-rc4
Commit a37b4d6 (2013-11-11 18:47 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.0.0)
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY)
  LAPACK: libopenblas
  LIBM: libopenlibm
```

On the same machine, Python takes 0.091638 seconds, so there's definitely a discrepancy on some systems.

---
On the julia.mit.edu Linux machine (the official benchmark system), on the other hand, I get this:

```python
>>> import time
>>> import numpy as np
>>> def func(a, b, inc, iter):
...     t0 = time.time()
...     for i in range(1000):
...         A = np.arange(a, b, inc)
...     print(time.time()-t0, " seconds elapsed")
...
>>> func(0,10000,0.1,1000)
(0.5556738376617432, ' seconds elapsed')
```

and Julia:

```julia
julia> [ f(0,10000,0.1,1000) for f=[func1,func2,func3] ]
3-element Array{Any,1}:
 1.06376
 1.06842
 0.564819

julia> versioninfo()
Julia Version 0.2.0-rc4+2
Commit 249fdca (2013-11-11 20:53 UTC)
Platform Info:
  System: Linux (x86_64-linux-gnu)
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY)
  LAPACK: libopenblas
  LIBM: libopenlibm
```

So this may be an OS X vs. Linux thing to some extent. @sungpily, @skariel – what operating systems are you on?

---
My results seem very consistent with Stefan's. I am using a Windows 7 machine.

---
By the way, thanks for reformatting the original post. I could not figure out how.

---
IMHO, func3 looks "hackish" even though it performs better. I will be happy to see func1 and func2 work as fast as Python or Matlab.

---
Yeah, the point is not that people should use that version. We're trying to figure out which versions are faster or slower and why, so that they can all be faster.

---
This version is orders of magnitude faster than func3 above:

```julia
function func6(a, inc, b)
    A = Array(Float64, length(a:inc:b))
    for i = 1:1000
        ix = 1
        while a < b
            A[ix] = a
            ix += 1
            a += inc
        end
    end
end
```

If you allocate on each iteration then it is still significantly faster than func3 above. I also compared to C++ (code is here). I'm on Win7, using Julia-RC3-amd64, Python 3.3 and the latest Numpy (all amd64). A couple of interesting observations:
So this seems like a GC issue to me: Julia is using colder memory, since the GC does not always have time to clear recently used memory, so allocation gives fresh memory. The only fix I think is appropriate is to optimize the range notation `[a:inc:b]` to use an algorithm like func6 above. The hot/cold memory is a non-issue really, since if you need hot memory then you just don't deallocate it.

About using integer types and changing them to Float64: I deleted that post from the forums about a minute after writing it since I timed the wrong function. It still got mailed though... sorry for that.

---
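As a side note on the func6 posted above: as written, the inner `while a < b` loop consumes `a`, so after the first pass `a >= b` and the remaining 999 passes do no work, which exaggerates the speedup. A corrected sketch (hypothetical, and using the later `Array{Float64}(undef, n)` syntax) resets the running value on each pass:

```julia
# Sketch of the same fill loop, but resetting the start value on every
# pass so each of the 1000 timing iterations actually fills the array.
function func6_fixed(a, inc, b)
    A = Array{Float64}(undef, length(a:inc:b))
    for i = 1:1000
        x = a          # reset the running value each pass
        ix = 1
        while x < b
            A[ix] = x
            ix += 1
            x += inc
        end
    end
    return A
end
```

Note that the strict `x < b` test never writes the endpoint itself: `func6_fixed(0.0, 0.5, 2.0)` fills the first four slots with `0.0, 0.5, 1.0, 1.5` and leaves the fifth slot unset.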
No worries, and thanks for your tests!

---
Yes, thanks for the excellent analysis. Now we just need to make it do the fast thing :-)

---
@skariel, in

---
Oh man... This is embarrassing... I'll just maintain a low profile for the next few months... :(

---
@skariel, please don't! We need people who care about performance and have the ability to do the deep digging. I remember that one of my earliest questions to the Julia community was "I see there's an

Hope to see you around a lot more!

---
I'll second that: I really appreciate that you're taking the time to explore this so deeply.

Kevin

---
Here's an update on this issue with a gist for all my code.

A few other meta-thoughts:

---
I think that we'll have to settle for matching single-threaded performance of other systems for now – so single-threaded C is the benchmark to compare against. Once we have threading in Julia, we can revisit this and make sure that we're as fast as other systems that use threads to speed this up.

---
Small update on this. Seems we are doing a bit better now. Here are the times I get for func2 and func3 in 3 versions (func1 and func2 are identical):

---
func1 and func2 need a trailing

---
Yeah, we're still a factor of 4 slower than numpy with func3 (in-place) and a factor of 7 with func2 on my machine. It's worth noting, though, that our float ranges are doing a whole lot more work (by default) than Numpy's… and this is a deliberate decision and trade-off. If we use a lower-precision range type, we're doing much better.

The Python definition on my machine takes 0.144 seconds.

---
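To make the "more work" above concrete: Julia's default float ranges carry extra extended-precision state so that elements land on the values you'd expect, whereas naive repeated addition of `0.1` drifts. A small sketch of the trade-off being paid for:

```julia
# Naive left-to-right accumulation of Float64 0.1 drifts off 1.0,
# because 0.1 is not exactly representable in binary:
naive = foldl(+, fill(0.1, 10))
naive == 1.0            # false: naive is 0.9999999999999999

# The default float range does extra work to hit the endpoint exactly:
r = 0.0:0.1:1.0
last(r) == 1.0          # true
collect(r)[11] == 1.0   # true
```

That extra correction work is exactly what the lower-precision `StepRangeLen` variants in the timings below skip.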
I think things have improved since this issue was opened. Here are the results on my laptop:

```julia
function func1(a, b, inc, iter)
    for i = 1:iter
        A = [a:inc:b]
    end
end

function func2(a, b, inc, iter)
    A = Array{Float64}(undef, length(a:inc:b))
    for i = 1:iter
        A = [a:inc:b]
    end
end

function func3(a, b, inc, iter)
    A = Array{Float64}(undef, length(a:inc:b))
    for i = 1:iter
        A[:] = a:inc:b
    end
end

function func4(a, b, inc, iter)
    A = Array{Float64}(undef, length(a:inc:b))
    for i = 1:iter
        A = [StepRangeLen(a,inc,b*10+1);]
    end
end

function func5(a, b, inc, iter)
    A = Array{Float64}(undef, length(a:inc:b))
    for i = 1:iter
        A[:] = StepRangeLen(a,inc,b*10+1)
    end
end

@time func1(0,10000,.1,1000)
@time func2(0,10000,.1,1000)
@time func3(0,10000,.1,1000)
@time func4(0,10000,.1,1000)
@time func5(0,10000,.1,1000)
println("")
@time func1(0,10000,.1,1000)
@time func2(0,10000,.1,1000)
@time func3(0,10000,.1,1000)
@time func4(0,10000,.1,1000)
@time func5(0,10000,.1,1000)
```

```
$ julia a.jl
  0.119092 seconds (397.67 k allocations: 21.202 MiB)
  0.009710 seconds (18.41 k allocations: 1.809 MiB)
  0.360205 seconds (90.05 k allocations: 5.279 MiB)
  0.236766 seconds (96.16 k allocations: 768.615 MiB, 8.72% gc time)
  0.136129 seconds (83.31 k allocations: 4.851 MiB)

  0.000187 seconds (1.00 k allocations: 125.156 KiB)
  0.000293 seconds (1.01 k allocations: 906.547 KiB)
  0.332319 seconds (6 allocations: 781.547 KiB)
  0.272638 seconds (2.01 k allocations: 763.840 MiB, 10.49% gc time)
  0.104920 seconds (6 allocations: 781.547 KiB)
```

---
What Julia means by

---

In my tests, Julia 1.3-dev is still at roughly the same speed that version 0.6 had. Here's a slight update to the tests to use BenchmarkTools (I think there's also some adversarial compiler shenanigans going on in both languages):

```julia
julia> using BenchmarkTools

julia> a, b, inc = 0,10000,.1;

julia> A = Array{Float64}(undef, length(a:inc:b));

julia> @btime [$a:$inc:$b;];
  260.461 μs (2 allocations: 781.39 KiB)

julia> @btime $A[:] = $a:$inc:$b;
  287.677 μs (0 allocations: 0 bytes)

julia> @btime $A .= $a:$inc:$b;
  253.028 μs (0 allocations: 0 bytes)

julia> @btime [StepRangeLen($a,$inc,$b*10+1);];
  84.170 μs (2 allocations: 781.39 KiB)

julia> @btime $A[:] = StepRangeLen($a,$inc,$b*10+1);
  112.257 μs (0 allocations: 0 bytes)

julia> @btime $A .= StepRangeLen($a,$inc,$b*10+1);
  26.295 μs (0 allocations: 0 bytes)
```

Compare this to numpy:

```julia
julia> using PyCall

julia> @pyimport numpy

julia> @btime pycall(numpy.arange, PyObject, $a, $b, $inc);
  86.354 μs (9 allocations: 192 bytes)
```

So we're around a factor of 3-4 away for the colon construction, but the

---
Since we intentionally use higher precision for float ranges, maybe this should be closed?

---
I have tested the following three variations of array creation statement:

and their execution times with a=0, b=10000, inc=0.1, iter=1000:

func1: 0.67 seconds
func2: 0.69 seconds
func3: 0.34 seconds

An equivalent python/numpy function takes about 0.1 seconds, and an equivalent matlab function takes about 0.1 seconds.
[jiahao: edited for formatting. Please use triple backquotes for posting code, otherwise it's unreadable. Also x-ref: julia-users]