Open
Labels: arrays, broadcast, performance
Description
See https://discourse.julialang.org/t/why-is-a-multi-argument-inplace-map-much-faster-in-this-case-than-a-broadcast/91525/6 for the original discussion. The following timings seem broadly reproducible across a range of platforms:
julia> A = rand(10, 1000); B = copy(A); C = zero(A);
julia> using BenchmarkTools
julia> @btime map!(+, $C, $A, $B);
8.947 μs (0 allocations: 0 bytes)
julia> @btime $C .= $A .+ $B;
11.281 μs (0 allocations: 0 bytes)
julia> versioninfo()
Julia Version 1.8.3
Commit 0434deb161e (2022-11-14 20:14 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 64 × AMD EPYC 7742 64-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, znver2)
  Threads: 1 on 64 virtual cores
Environment:
  LD_LIBRARY_PATH = /home/user/lib:/lib::/home/user/.local/lib
  JULIA_DEPOT_PATH = /scratch/user/julia
  JULIA_REVISE_POLL = 1
  JULIA_NUM_PRECOMPILE_TASKS = 1
  JULIA_EDITOR = vi

This difference goes away for nearly square matrices, and is minimal for tall matrices. On some platforms, broadcasting performs better for the tall and square cases. However, map! seems to consistently do better for the wide case.
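To check the shape dependence on another machine, something like the following sketch may help. It times both forms for a wide, a square, and a tall matrix with the same number of elements. The helper name `compare_shapes` and the square and tall sizes (100×100 and 1000×10) are assumptions chosen for illustration; only the 10×1000 case comes from the timings above.

```julia
using BenchmarkTools

# Hypothetical helper (not part of Base or BenchmarkTools): time the in-place
# map! and the in-place broadcast for one m×n shape, returning seconds.
function compare_shapes(m, n)
    A = rand(m, n); B = copy(A); C = zero(A)
    t_map = @belapsed map!(+, $C, $A, $B)
    t_bc  = @belapsed $C .= $A .+ $B
    return (map = t_map, broadcast = t_bc)
end

# Assumed shapes: each holds 10_000 elements, so only the layout differs.
for (label, (m, n)) in ("wide" => (10, 1000), "square" => (100, 100), "tall" => (1000, 10))
    t = compare_shapes(m, n)
    println(rpad(label, 8),
            "map!: ", round(t.map * 1e6; digits = 3), " μs   ",
            "broadcast: ", round(t.broadcast * 1e6; digits = 3), " μs")
end
```

Keeping the element count fixed across the three shapes is a deliberate choice here, so that any gap between the two columns reflects the array layout rather than the amount of work.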