Stream: helpdesk (published)

Topic: Exterminating a strange heap allocation


Timothy (Oct 01 2023 at 09:14):

This is a cross-post from #performance-helpdesk on Slack, but I'm tantalisingly close to having implemented a zero-heap-allocation function. There's just one bit which is allocating, and I'd dearly like to get rid of it.
With --track-allocation=user I see:

  - function turboshake(output::Type, # <:Unsigned or NTuple{n, <:Unsigned}
  -                     message::AbstractVector{<:Union{UInt64, UInt32, UInt16, UInt8}},
  -                     delimsufix::UInt8=0x80, capacity::Val = Val{CAPACITY}())
  0     state, finalblk = ingest(EMPTY_STATE, capacity, message)
  0     state = pad(state, capacity, finalblk, delimsufix)
720     squeeze(output, state, capacity)
  - end
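
A per-line count like the one above can be cross-checked from the REPL with `@allocated`. The following is only a minimal sketch: `combine` and `allocated_bytes` are hypothetical names invented for illustration and are not part of KangarooTwelve.jl. The measurement is wrapped in a function and preceded by a warm-up call, on the assumption that this keeps compilation and global-scope overhead out of the number.

```julia
# Hypothetical stand-in for a small, fully-inferred function returning a UInt128.
# This is NOT the turboshake implementation.
combine(hi::UInt64, lo::UInt64) = (UInt128(hi) << 64) | UInt128(lo)

# Measure inside a function so that global-scope overhead is not counted.
function allocated_bytes(hi::UInt64, lo::UInt64)
    combine(hi, lo)                    # warm-up call, so compilation is not measured
    return @allocated combine(hi, lo)  # bytes heap-allocated while evaluating this call
end

println(allocated_bytes(0x0123456789abcdef, 0xfedcba9876543210))
```

If the UInt128 result stays on the stack, I'd expect this to report no heap allocation; a non-zero figure would point at the same kind of problem as the 720 bytes attributed to the squeeze line.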

Analysing a turboshake(UInt128, ...) call with Cthulhu and PProf, I see that it is fully inferred, yet for some reason the output UInt128 is heap-allocated instead of stack-allocated. There's also an NTuple{25, UInt64} (which would correspond to state) that I would also expect to be on the stack.

cthulhu.png
pprof.png

If I can interest anybody in taking a look, you can see the rest of the code here: https://github.com/tecosaur/KangarooTwelve.jl/blob/30f178a0b43be89ed68c23746c529b2b637ab51a/src/KangarooTwelve.jl


Last updated: Oct 02 2023 at 04:34 UTC