This is a cross-post from #performance-helpdesk on Slack, but I'm tantalisingly close to having implemented a zero-heap-allocation function. There's just one bit which is allocating, and I'd dearly like to get rid of it.
With --track-allocation=user I see:
        - function turboshake(output::Type, # <:Unsigned or NTuple{n, <:Unsigned}
        -                     message::AbstractVector{<:Union{UInt64, UInt32, UInt16, UInt8}},
        -                     delimsufix::UInt8=0x80, capacity::Val = Val{CAPACITY}())
        0     state, finalblk = ingest(EMPTY_STATE, capacity, message)
        0     state = pad(state, capacity, finalblk, delimsufix)
      720     squeeze(output, state, capacity)
        - end
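As a cross-check on the per-line counts above, a single call can also be measured with `@allocated` from inside a function, after a warm-up call so that compilation is not counted. The sketch below is only a minimal illustration under assumptions: `toy_squeeze` and `measure_allocs` are hypothetical names, and `toy_squeeze` is a stand-in, not the `squeeze` from the package.

```julia
# Hypothetical stand-in for `squeeze`: builds a UInt128 from two lanes of a
# 25-lane state tuple. Not the implementation from KangarooTwelve.jl.
toy_squeeze(::Type{UInt128}, state::NTuple{25, UInt64}) =
    (UInt128(state[2]) << 64) | UInt128(state[1])

# Measure inside a function so that non-constant globals do not add allocations.
function measure_allocs(state::NTuple{25, UInt64})
    toy_squeeze(UInt128, state)                     # warm-up: compile first
    return @allocated toy_squeeze(UInt128, state)   # bytes allocated by this call
end

const STATE = ntuple(i -> UInt64(i), Val(25))   # 25 UInt64 lanes holding 1..25
measure_allocs(STATE)                            # first call compiles measure_allocs
println("bytes allocated: ", measure_allocs(STATE))
```

If a measurement like this reports zero bytes while the caller's line still shows a count, the allocation is more likely to come from how the call is made than from the callee's body.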
Analysing a turboshake(UInt128, ...) call with Cthulhu and PProf, I see that this is fully inferred, and for some reason the output UInt128 is heap-allocated instead of stack-allocated. There's also an NTuple{25, UInt64} (which would correspond to state) that I also expect should be on the stack.
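One documented behaviour that may be worth ruling out in a situation like this: the Julia manual's performance tips note that Julia can decline to specialize a method on a `Type` argument that is only passed through to another function, and that `@code_typed` and similar tools still show the specialized code. Whether that applies to the code in this thread is not established; the sketch below only contrasts the two signatures, using hypothetical names (`toy_squeeze`, `passthrough`, `specialized`) that are not from the package.

```julia
# Hypothetical stand-in for `squeeze`; not the code from KangarooTwelve.jl.
# Truncates or widens the first lane of the state to the requested type T.
toy_squeeze(::Type{T}, state::NTuple{25, UInt64}) where {T<:Unsigned} =
    state[1] % T

# `output` is only forwarded, so Julia may choose not to specialize on it.
passthrough(output::Type, state) = toy_squeeze(output, state)

# Naming the type parameter in the signature forces a specialization for
# each concrete output type.
specialized(::Type{T}, state) where {T} = toy_squeeze(T, state)

const LANES = ntuple(i -> UInt64(i), Val(25))   # 25 UInt64 lanes holding 1..25
println(passthrough(UInt128, LANES) === specialized(UInt128, LANES))
```

Both methods return the same value; any difference would show up only in allocation measurements taken from inside a function.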
If I can interest anybody in taking a look, you can see the rest of the code here: https://github.com/tecosaur/KangarooTwelve.jl/blob/30f178a0b43be89ed68c23746c529b2b637ab51a/src/KangarooTwelve.jl
I did a bunch of fiddling with the code, and the alloc mysteriously disappeared?
Timothy has marked this topic as resolved.
Last updated: Dec 28 2024 at 04:38 UTC