Can a mutable struct containing a Symbol have a concrete memory layout? I noticed that different symbols have different sizes:
julia> sizeof(:t)
1
julia> sizeof(:output)
6
thus, a struct like
mutable struct A
s::Symbol
end
cannot have a (not sure about the terminology here) constant memory layout, right?
sure it can
in fact, it has to - otherwise it wouldn't be possible to have different A where you can swap out the symbol
what ends up happening is that for your field a pointer is used internally
that way, the symbol can be swapped out while keeping the size of A fixed
Symbol in particular is a bit special though
julia> mutable struct A
s::Symbol
end
julia> sizeof(A)
8
julia> sizeof(Symbol)
ERROR: Type Symbol does not have a definite size.
Stacktrace:
[1] sizeof(x::Type)
@ Base ./essentials.jl:551
[2] top-level scope
@ REPL[4]:1
suffice it to say, since it does not have a definite size, using it in a mutable struct most likely ends up as a pointer
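A minimal sketch of the point above, assuming a 64-bit or 32-bit build where a reference field takes one pointer-sized slot. SymHolder is a hypothetical name that mirrors A; the check only shows that the struct size does not depend on the length of the symbol stored in it.

```julia
# Hypothetical struct with the same shape as A from the question.
mutable struct SymHolder
    s::Symbol
end

h = SymHolder(:t)

# The field is stored as a reference, so the struct is one pointer wide.
println(sizeof(SymHolder) == sizeof(Ptr{Cvoid}))

# Swapping in a much longer symbol only rebinds the reference.
h.s = :a_much_longer_symbol_name
println(sizeof(SymHolder) == sizeof(Ptr{Cvoid}))

# Symbol is not a plain-bits type, which is why it cannot be stored by value.
println(isbitstype(Symbol))
```

sizeof(:t) and sizeof(:output) from the question report the length of the symbol's name in bytes, which lives behind that reference rather than inside the struct.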
Ah, ok, so it is boxed (but type-stable). Anyway, if one needs an array of those and accesses those fields, that won't be great for performance, as there will be lots of memory accesses, right?
What if one uses the const s::Symbol introduced in 1.8?
it's not Boxed per se, the pointer is just abstracted away
that's a good question :thinking:
Answering my own question: I don't see any difference:
julia> mutable struct B
const s::Symbol
end
v = [B(rand((:a,:b))) for _ in 1:1000]
@btime count(a -> a.s == :a, $v)
451.457 ns (0 allocations: 0 bytes)
485
julia> mutable struct A
s::Symbol
end
v = [A(rand((:a,:b))) for _ in 1:1000]
@btime count(a -> a.s == :a, $v)
451.325 ns (0 allocations: 0 bytes)
492
Counting isbits stuff is much faster though:
julia> v = [ rand(1:2) for _ in 1:10^3 ]
@btime count(x -> x == 1, $v)
55.400 ns (0 allocations: 0 bytes)
504
julia> v = [ rand('A':'B') for _ in 1:10^3 ]
@btime count(x -> x == 'A', $v)
67.148 ns (0 allocations: 0 bytes)
509
it'll probably still be a pointer, since keeping the size of A constant is more important
the symbol can have different sizes after all
yes, because for mutable structs wrapping a symbol there's an additional indirection, since the array is an array of pointers as well
I'm surprised it's that slow. Since symbols are interned, isn't it just an integer comparison of the pointers?
Jakob Nybo Nissen said:
I'm surprised it's that slow. Since symbols are interned, isn't it just an integer comparison of the pointers?
I always assumed that was the case (that it'd just be a pointer comparison, and hence equivalent to comparing Ints or @enums).
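A small sketch of what interning means here: two symbols with the same name, built in different ways, are the same object. It only shows identity; whether the compiler reduces the comparison to a single pointer compare in a given loop is not something this snippet checks.

```julia
a = :output
b = Symbol("out", "put")   # built at runtime from two strings

# Interning: equal names give the identical object, so === holds.
println(a === b)

# objectid agrees for identical objects.
println(objectid(a) == objectid(b))

# == on symbols gives the same answer as ===.
println(a == b)
```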
@Christopher Rackauckas, I guess this is another reason to change DiffEq retcodes, aside from :success being a fail.
I wonder if all the time is being spent fetching out-of-cache stuff from the heap, and the symbol comparison is insignificant relative to the cache misses
I guess this is another reason to change DiffEq retcodes, aside from :success being a fail.
It's mostly a correctness thing. Mis-spelled :succcess is too common :sad:
A bunch of BioJulia packages, including TranscodingStreams, also use symbol literals in performance-sensitive code. Perhaps they should be replaced with an enum
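A rough sketch of the enum idea, under the assumption that a two-valued return code is enough. RetCode, Success and Failure are made-up names for illustration, not the actual DiffEq or TranscodingStreams API. It shows the two points raised above: an enum is a plain-bits type, and a misspelled member fails loudly instead of silently comparing false.

```julia
# Hypothetical return-code enum; the names are illustrative only.
@enum RetCode Success Failure

codes = [rand((Success, Failure)) for _ in 1:1000]

# Enums are isbits (Int32-backed by default), so the Vector stores them inline.
println(isbitstype(RetCode))
println(sizeof(RetCode))

n_ok = count(==(Success), codes)
println(n_ok)

# A misspelled symbol is still a valid symbol, so the comparison is just false.
println(:succcess == :success)

# A misspelled enum member is an undefined name and throws when evaluated.
caught = try
    Succcess
    false
catch err
    err isa UndefVarError
end
println(caught)
```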
Jakob Nybo Nissen said:
I'm surprised it's that slow. Since symbols are interned, isn't it just an integer comparison of the pointers?
it'll have trouble since the array of mutable structs containing a symbol ends up as an array of pointers to a pointer - two derefs kill cache locality, would be my guess - and that probably also kills SIMD
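One way to probe that guess is to compare a plain Vector{Symbol} (one level of references) against a vector of mutable wrappers (references to objects that hold references). This is only a sketch: Wrapped is a hypothetical struct with the same shape as A, and the snippet checks that both loops count the same thing without claiming any timings.

```julia
# Hypothetical wrapper with the same shape as A.
mutable struct Wrapped
    s::Symbol
end

syms    = [rand((:a, :b)) for _ in 1:1000]   # array of symbol references
wrapped = [Wrapped(s) for s in syms]         # array of references to wrappers

# Same predicate, one fewer hop to reach the symbol in the first case.
n_direct  = count(==(:a), syms)
n_wrapped = count(w -> w.s == :a, wrapped)

println(n_direct == n_wrapped)
```

Wrapping each count call in @btime from BenchmarkTools, as in the measurements earlier in the thread, would show how much of the gap comes from the extra hop on a given machine.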
Last updated: Nov 06 2024 at 04:40 UTC