Is there a method for efficiently inserting a large number of elements into a dictionary?
I think your best bet is merge!:
julia> d2 = Dict(900:1100 .=> rand.());
julia> @btime push!(d1, $d2...) setup=(d1 = Dict(1:1000 .=> rand.()));
456.000 μs (10716 allocations: 496.19 KiB)
julia> @btime merge!(d1, $d2) setup=(d1 = Dict(1:1000 .=> rand.()));
2.711 μs (0 allocations: 0 bytes)
Unless I'm missing something here, I think the problem is that this requires you to build a second dictionary. I'm not entirely sure what happens on dictionary creation, but it is probably doing unnecessary work building that second dictionary. What I'm looking for is a mass insertion of unhashed elements without constructing an extra Dict.
I think looping through d[k] = v is your best bet. The time to insert a single element is ~10 ns, and I'm not sure there's any way to escape linear scaling in the number of inserted elements.
julia> function mergepairs!(d, p)
           for (k, v) in p
               d[k] = v
           end
           return d  # mirror merge!, which returns the mutated dict
       end
mergepairs! (generic function with 1 method)
julia> d3 = 900:1100 .=> rand.();
julia> @btime mergepairs!(d1, d3) setup=(d1 = Dict(1:1000 .=> rand.()));
2.256 μs (0 allocations: 0 bytes)
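A nice side effect of this loop (a sketch, not from the benchmarks above): mergepairs! accepts any iterator of key/value pairs, so you can feed it a zip of existing arrays or a generator and never materialize an intermediate Dict or Pair vector at all.

```julia
# Same loop as above: insert every pair from any iterator of pairs.
function mergepairs!(d, p)
    for (k, v) in p
        d[k] = v
    end
    return d
end

d = Dict(1:1000 .=> rand.())

# zip two existing arrays -- no intermediate Dict or Pair vector is built
ks = 900:1100
vs = rand(length(ks))
mergepairs!(d, zip(ks, vs))
```

After the call, d holds 1100 entries: keys 1:899 keep their old values and keys 900:1100 take the values from vs.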
Yes, I'm pretty sure there is no optimised routine implemented in Base for bulk insertion. And I'm not sure what that would look like... maybe doing better when multiple keys hash to the same slot, but it's not clear to me that the resulting book-keeping would beat sequential insertion.
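One optimisation that does exist in Base (not mentioned above, so treat this as a suggested sketch): if you know roughly how many entries the Dict will end up holding, sizehint! reserves capacity up front, so the sequential inserts don't trigger intermediate rehashes as the table grows.

```julia
# Reserve capacity before a large sequential insert (assumption: table
# growth/rehashing is a meaningful part of the cost for big inserts).
d = Dict{Int,Float64}()
sizehint!(d, 10_000)               # reserve space for ~10_000 entries

for (k, v) in (i => rand() for i in 1:10_000)
    d[k] = v                       # no rehash needed along the way
end
```

This still scales linearly in the number of inserted elements, but it trims the constant factor by paying the allocation cost once.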
Last updated: Nov 06 2024 at 04:40 UTC