Stream: helpdesk (published)

Topic: Convenient initialisation of vector of `NamedTuple`s


view this post on Zulip Nils (Jan 23 2024 at 10:08):

I'm porting some C++ code to Julia where there's a lot of this pattern:

List out(num_folds);
for (int k = 0; k < num_folds; k++) {
    (...)
    List tmp_uv = initialize_uv(...);
    List fold_k = List::create(Named("u") = tmp_uv["u"],
                               Named("v") = tmp_uv["v"],
                               Named("lambda_L_max") = tmp_uv["lambda_L_max"],
                               Named("fold_mask") = fold_mask);
    out[k] = fold_k;
}
return out;

To me, a NamedTuple seemed like the most comparable Julia construct, so I do:

out = Vector{NamedTuple{(:u, :v, :λ_L_max, :fold_mask), Tuple{Float64, Float64, Float64, Matrix{Bool}}}}(undef, num_folds)

for k ∈ 1:num_folds
    (; u, v, λ_L_max) = initialize_uv(...)
    out[k] = (; u, v, λ_L_max, fold_mask)
end

return out

which looks a lot better, with the exception of the initialisation of out. Is there anything more convenient to initialise this vector, or a better pattern altogether?

view this post on Zulip Timothy (Jan 23 2024 at 10:11):

You might prefer using Vector{@NamedTuple{u::Float64, v::Float64, ...}} instead.
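A minimal sketch of that suggestion, assuming the field names and types from the original post. The alias name `Fold` and the value of `num_folds` are made up for illustration.

```julia
# Alias for the element type, spelled with the @NamedTuple macro.
# `Fold` is a hypothetical name, not something from the original code.
const Fold = @NamedTuple{u::Float64, v::Float64, λ_L_max::Float64, fold_mask::Matrix{Bool}}

# The macro form denotes the same type as the long-hand spelling.
same_type = Fold === NamedTuple{(:u, :v, :λ_L_max, :fold_mask),
                                Tuple{Float64, Float64, Float64, Matrix{Bool}}}

num_folds = 3                          # hypothetical value
out = Vector{Fold}(undef, num_folds)   # uninitialised vector with that eltype
```

Giving the type a name once also keeps the `Vector{...}(undef, n)` line short wherever it is needed again.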

view this post on Zulip Mason Protter (Jan 23 2024 at 10:11):

beat me to it :sweat_smile:

view this post on Zulip Mason Protter (Jan 23 2024 at 10:17):

If you don't want to declare the eltype of the vector, one thing you could do if you like is write

using MicroCollections, BangBang
out = EmptyVector()
for k ∈ 1:num_folds
    (; u, v, λ_L_max) = initialize_uv(...)
    out = push!!(out, (; u, v, λ_L_max, fold_mask))
end

return out

view this post on Zulip Nils (Jan 23 2024 at 10:17):

Thanks, that saves 12 characters in this case! Still not exactly concise but I guess there can't really be a way around specifying the names and types of all elements of a NamedTuple (the clue is almost in the name...)

What are the pros/cons of this over using a Dict?

view this post on Zulip Mason Protter (Jan 23 2024 at 10:18):

EmptyVector is like a more efficient version of Union{}[]

view this post on Zulip Mason Protter (Jan 23 2024 at 10:20):

and push!!(coll, x) is like

if x isa eltype(coll)
    push!(coll, x)
else
    coll = #make a new version of coll that has a wider eltype
    push!(coll, x)
end

view this post on Zulip Mason Protter (Jan 23 2024 at 10:21):

so you always need to do coll = push!!(coll, x) instead of just push!!(coll, x)
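To make the "widen or mutate" idea concrete, here is a rough sketch in plain Base Julia. The helper `widening_push` and the wrapper `collect_folds` are hypothetical, and using `typejoin` as the widening rule is an assumption. This is not how BangBang implements `push!!`; it only illustrates why the result has to be rebound.

```julia
# Hypothetical helper sketching the idea; NOT BangBang's actual implementation.
function widening_push(coll::Vector, x)
    if x isa eltype(coll)
        push!(coll, x)                          # x fits: mutate in place, return coll
    else
        T = typejoin(eltype(coll), typeof(x))   # assumed widening rule
        widened = Vector{T}(coll)               # copy into a vector with the wider eltype
        push!(widened, x)                       # return the new vector
    end
end

# Hypothetical usage: start from an empty vector and let the eltype be discovered.
function collect_folds(num_folds)
    out = Union{}[]
    for k in 1:num_folds
        # Rebinding is required: the first call returns a different vector.
        out = widening_push(out, (; k, score = k / 2))
    end
    return out
end

folds = collect_folds(3)
```

Only the first push has to allocate a new vector here; later pushes match the eltype and mutate in place.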

view this post on Zulip Mason Protter (Jan 23 2024 at 10:27):

Okay, on second thought, I'm now unsure why I didn't just suggest you use map

view this post on Zulip Mason Protter (Jan 23 2024 at 10:28):

all you need to write is

map(1:num_folds) do k
    (; u, v, λ_L_max) = initialize_uv(...)
    (; u, v, λ_L_max, fold_mask)
end

view this post on Zulip Timothy (Jan 23 2024 at 10:34):

Do you mean this?

map(1:num_folds) do k
    (; u, v, λ_L_max) = initialize_uv(...)
    (; u, v, λ_L_max, fold_mask)
end

view this post on Zulip Timothy (Jan 23 2024 at 10:35):

On that note, I wonder if this compiles equivalently? (assuming initialize_uv doesn't have any extra info you want to ignore)

map(1:num_folds) do k
    (; initialize_uv(...)..., fold_mask)
end
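A small runnable sketch comparing the two `map` variants. The body of `initialize_uv`, the contents of `fold_mask`, and the value of `num_folds` are stand-ins invented for illustration. The sketch only checks that both forms produce the same values and element type; it says nothing about whether they compile to the same code.

```julia
# Hypothetical stand-in for the real initialize_uv, which is not shown in the thread.
initialize_uv(k) = (; u = 1.0 * k, v = 2.0 * k, λ_L_max = 0.5)

fold_mask = fill(true, 2, 2)   # hypothetical Matrix{Bool}
num_folds = 3                  # hypothetical value

# Variant 1: destructure the fields, then rebuild the NamedTuple.
destructured = map(1:num_folds) do k
    (; u, v, λ_L_max) = initialize_uv(k)
    (; u, v, λ_L_max, fold_mask)
end

# Variant 2: splat the returned NamedTuple and append fold_mask.
splatted = map(1:num_folds) do k
    (; initialize_uv(k)..., fold_mask)
end

same_values = destructured == splatted
same_eltype = eltype(destructured) === eltype(splatted)
```

In both cases `map` determines the element type from the returned values, so the vector never needs an explicit type annotation. The property destructuring in variant 1 needs Julia 1.7 or later.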

view this post on Zulip Mason Protter (Jan 23 2024 at 11:04):

> Do you mean this?

Whoops, yeah I copy-pasted the wrong code section :face_palm:, I've edited my post

view this post on Zulip aplavin (Jan 23 2024 at 17:55):

> What are the pros/cons of this over using a Dict?

NamedTuple is typically much more lightweight and performant in such scenarios, so it's indeed the natural choice.
For even more efficiency, or if you need whole-column access, use a StructArray (from StructArrays) instead of a regular Vector.
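One way to see the difference, as a sketch using only Base Julia with made-up values: a NamedTuple records each field's type in its own type, while a Dict with mixed values falls back to `Any` for its value type.

```julia
# Made-up values, for illustration only.
nt = (; u = 1.0, v = 2.0, fold_mask = fill(true, 2, 2))
d = Dict(:u => 1.0, :v => 2.0, :fold_mask => fill(true, 2, 2))

# The NamedTuple type is concrete and knows the type of every field.
nt_is_concrete = isconcretetype(typeof(nt))
u_type = fieldtype(typeof(nt), :u)

# The Dict has a single value type for all entries, here Any.
dict_value_type = valtype(d)
```

With the field types known, the compiler can specialise code that reads `nt.u`, whereas `d[:u]` is only known to be `Any` until run time.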

And anyway, in this scenario map is the most straightforward solution, as suggested above. Manual out[k] = ... or push!!() is for more complicated processing that doesn't fit map semantics.


Last updated: Nov 22 2024 at 04:41 UTC