Stream: helpdesk (published)

Topic: `Array{Union{Missing, T}}` memory usage


James Wrigley (Aug 04 2022 at 14:47):

From the docs, my understanding is that an Array of optionally missing values actually stores two arrays: one for the data and one recording whether each value is missing. If I've got that right, I don't understand why sizeof() reports the same size in memory for both of these:

julia> x = rand(UInt16, 2, 2)
2×2 Matrix{UInt16}:
 61892   4489
 19311  27433

julia> y = Array{Union{Missing, UInt16}}(missing, 2, 2)
2×2 Matrix{Union{Missing, UInt16}}:
 missing  missing
 missing  missing

julia> sizeof(x)
8

julia> sizeof(y)
8

Shouldn't sizeof(y) == 12? That's 8 bytes for the data plus 4 bytes for the mask (one byte per element).

James Wrigley (Aug 04 2022 at 14:53):

Ah hah, turns out I should be using Base.summarysize(): https://discourse.julialang.org/t/sizeof-vector-union-missing-t-and-hypothetical-size/15271/14

julia> Base.summarysize(x)
48

julia> Base.summarysize(y)
52
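
For anyone checking the arithmetic, here is a minimal sketch of the comparison. It assumes one tag byte per element for a small isbits Union like Union{Missing, UInt16}; the names data_bytes and tag_bytes are just for illustration. In the session above the difference between the two summarysize results is 52 - 48 = 4, which matches the 4 tag bytes, but the absolute numbers include the array header and may vary between Julia versions.

```julia
x = rand(UInt16, 2, 2)
y = Array{Union{Missing, UInt16}}(missing, 2, 2)

# sizeof reports only the data bytes: one UInt16 slot per element.
data_bytes = length(y) * sizeof(UInt16)
@assert sizeof(x) == data_bytes
@assert sizeof(y) == data_bytes

# Assumption: one extra tag byte per element records which Union member is stored.
tag_bytes = length(y)
println("expected data + tag bytes: ", data_bytes + tag_bytes)

# Base.summarysize also counts the array header, so compare the two arrays
# rather than reading either number on its own.
println("summarysize difference: ", Base.summarysize(y) - Base.summarysize(x))
```

Comparing the difference rather than the raw values keeps the header overhead out of the picture.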

Last updated: Oct 02 2023 at 04:34 UTC