Maybe someone can help me save a few minutes of benchmarking here.
I have a situation where I would like to pass a function fa as an argument to another function fb, where fb will call fa in a hot loop and where fa is determined at runtime.
Are there any performance gotchas here (I'm mainly thinking about the fact that Julia does not specialize on the function type by default)?
I can see the following options:
1) Just pass it as an argument
2) Pass it as an argument but qualify the type (i.e. function fb(fa::F, ...) where F)
3) Write a macro
I'm thinking 1) might be ok since functions like map seem to do it, but I'm not sure here.
In case more context is needed: fb is getting a pcap-reader which holds many captured packets. fb loops over all packets in the pcap-reader. The pcap-reader has a single header which tells me what protocol all the captured packets are, so my fa will be things like process_ipv4_packet, process_ethernet_packet, etc.
Julia does specialize on function types used as arguments if either
- The argument's type is bound to a type parameter in a where clause like this:
  function foo(f::F, xs) where F
      ys = map(f, xs)
      # do stuff with ys
  end
- The argument function is called inside the function
Thanks, I somehow missed/forgot about this. I guess that makes the distinction between 1) and 2) pointless in my case (and I suppose the specialization also makes the macro pointless).
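A minimal sketch of the setup described in the question, to make the conclusion above concrete. The handler bodies, the process_capture helper, and the linktype symbol are hypothetical stand-ins (the real pcap-reader API is not shown in this thread). Because fb calls fa directly in its loop, Julia specializes fb on fa either way; the where clause only makes that explicit.

```julia
# Hypothetical stand-ins for the packet handlers named in the question.
process_ipv4_packet(pkt) = length(pkt)
process_ethernet_packet(pkt) = length(pkt) + 14

# fb calls fa in the hot loop. The `where {F}` parameter forces
# specialization on the concrete type of fa (it would also happen
# without it here, because fa is called inside fb).
function fb(fa::F, packets) where {F}
    total = 0
    for pkt in packets
        total += fa(pkt)
    end
    return total
end

# The handler is chosen at runtime from a header value (here a Symbol,
# which is an assumption). The call to fb acts as a function barrier:
# one dynamic dispatch, then a fully specialized loop.
function process_capture(linktype::Symbol, packets)
    fa = linktype === :ipv4 ? process_ipv4_packet : process_ethernet_packet
    return fb(fa, packets)
end

packets = [zeros(UInt8, 64) for _ in 1:10]
ipv4_total = process_capture(:ipv4, packets)
ethernet_total = process_capture(:ethernet, packets)
```

The cost of picking fa at runtime is a single dynamic dispatch per capture, not one per packet.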
Isn't the where notation redundant if F is not used in the body?
e.g. wouldn't this be the same?
function foo(f::Function, xs)
    ys = map(f, xs)
    # do stuff with ys
end
(also restricts arguments to be functions and not Any)
Nope. As a latency optimisation, Julia's compiler doesn't specialize on function-typed arguments whose only use in a function is to be passed on to another function. So your foo is not specialized on f. As a consequence, the type of ys is not inferred.
julia> function foo(f::Function, xs)
           ys = map(f, xs)
           y = 0
           for i in ys
               y += i
           end
           y
       end
       function bar(f::F, xs) where {F <: Function}
           ys = map(f, xs)
           y = 0
           for i in ys
               y += i
           end
           y
       end
bar (generic function with 1 method)
julia> using BenchmarkTools
julia> v = collect(1:1000000);
julia> @btime foo(identity, v)
33.493 ms (2998951 allocations: 53.39 MiB)
500000500000
julia> @btime bar(identity, v)
578.063 μs (4 allocations: 7.63 MiB)
500000500000
Yeah this is a common gotcha.
Ahhh. yeah. now I start to remember again. It's also described here https://docs.julialang.org/en/v1/manual/performance-tips/#Be-aware-of-when-Julia-avoids-specializing
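The manual section linked above also notes that @code_typed and friends always show specialized code, even when Julia would not normally specialize that call, and suggests checking the method internals instead. Below is a minimal sketch of that check. It assumes Julia 1.10 or later, where the internal, unexported Base.specializations is available. foo and bar are redefined so the block stands alone, and the exact entries printed may differ between Julia versions.

```julia
function foo(f::Function, xs)
    ys = map(f, xs)
    y = 0
    for i in ys
        y += i
    end
    y
end

function bar(f::F, xs) where {F <: Function}
    ys = map(f, xs)
    y = 0
    for i in ys
        y += i
    end
    y
end

v = collect(1:1000)

# v is a non-constant global, so these calls are dispatched at runtime
# and nothing is inlined into a caller.
foo_result = foo(identity, v)
bar_result = bar(identity, v)

# Print the signatures each method has been specialized for so far.
# Base.specializations is internal API (assumed available, Julia 1.10+).
for fn in (foo, bar)
    m = which(fn, (typeof(identity), Vector{Int}))
    println(fn, ":")
    for mi in Base.specializations(m)
        println("    ", mi.specTypes)
    end
end
```

If the entry for foo shows Function in the position where the entry for bar shows typeof(identity), then foo was compiled without specializing on f.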
Is there something with global scope here which makes this seem like a worse issue than it is?
julia> @btime foo(identity, v)
63.639 ms (2998951 allocations: 53.39 MiB)
500000500000
julia> @btime foo(identity, $v)
1.717 ms (3 allocations: 7.63 MiB)
500000500000
julia> @btime bar(identity, $v)
1.710 ms (3 allocations: 7.63 MiB)
500000500000
julia> fa(x) = x
fa (generic function with 1 method)
julia> fb(x) = x
fb (generic function with 1 method)
julia> let
           v = collect(1:100000)
           f = rand((fa, fb))
           @btime foo($f, $v)
           @btime bar($f, $v)
       end;
78.600 μs (3 allocations: 781.31 KiB)
78.000 μs (3 allocations: 781.31 KiB)
Hmmm, maybe not:
julia> function testfun(f1, v, fs...)
           f = rand(fs)
           f1(f, v)
       end
testfun (generic function with 2 methods)
julia> @btime testfun($foo, $v, $fa, $fb)
56.751 ms (2998951 allocations: 53.39 MiB)
500000500000
julia> @btime testfun($bar, $v, $fa, $fb)
1.671 ms (4 allocations: 7.63 MiB)
500000500000
A bit scary that JET doesn't seem to detect the problem:
julia> using JET
julia> @report_opt testfun(foo, v, fa, fb)
No errors detected
julia> @report_opt foo(fa, v)
No errors detected
Just managed to speed up a different part of the code by about 20% thanks to this. I had assumed it was fine because JET didn't report any problems.
Make a bug report!
https://github.com/aviatesk/JET.jl/issues/697
DrChainsaw said:
Is there something with global scope here which makes this seem like a worse issue than it is?
What you're seeing here is a result of inlining. @btime foo(f, v) creates an anonymous function () -> foo(f, v) and times it. When v is interpolated, foo can be inlined, so the anonymous function is compiled as () -> <body of foo(f, v)>, which doesn't have the specialization problem since f is never actually passed as an argument to any function. Inlining does not happen when v is not interpolated because then v is an untyped global, so it's impossible to resolve which method of foo to inline at compile time. (Well, actually, in this case it's not impossible because foo only has a single method, but automatic inlining doesn't use that kind of world-splitting method resolution; that would break function barriers, which is usually detrimental to performance.)
Here's one way to get around this in your timings without having to avoid interpolation:
julia> foowrap(f, xs) = @noinline foo(f, xs)
foowrap (generic function with 1 method)
julia> barwrap(f, xs) = @noinline bar(f, xs)
barwrap (generic function with 1 method)
julia> @btime foowrap(identity, $v)
20.406 ms (2998951 allocations: 53.39 MiB)
500000500000
julia> @btime barwrap(identity, $v)
481.709 μs (2 allocations: 7.63 MiB)
500000500000
That said, the fact that inlining immediately fixes the problem is kind of the point. Avoided specialization is considered a worthwhile tradeoff because this is expected to happen in many/most real-world use cases, so performance usually won't suffer.
I'm not sure I agree that it works out like that in practice. I've been bitten by this often enough that I now reflexively add the specialization parameter every time I write a method that takes a function as an argument. But that's the idea.
Last updated: Apr 04 2025 at 04:42 UTC