Stream: helpdesk (published)

Topic: Traversing Nested Dictionary


Davi Sales Barreira (Oct 18 2021 at 14:22):

Hey everyone, I was wondering if there is an efficient ("canonical") way to traverse a dictionary with a JSON-like structure. For example, I have a dictionary containing an array of dictionaries that sometimes don't have the same keys. I then need to iterate over these dictionaries and check whether a specific key has a certain value. For example:

for cell in cells
  if cell["key1"]["key2"]["key3"] == true
     # do stuff
  end
end

The issue is that sometimes the cell won't have "key1", "key2" or "key3", so a simple get(cell, "key1", false) won't work. Of course, I could always write a bunch of if statements, but I was wondering if there was a smarter and more direct way of doing this.

Richard Reeve (Oct 18 2021 at 15:40):

Well, it's hardly elegant, but you can chain gets by putting empty Dicts in instead of false:

if get(get(get(cell, "key1", Dict()), "key2", Dict()), "key3", nothing) == "value"
  # Do stuff
end

Or, perhaps marginally less inelegant:

if cell |> x -> get(x, "key1", Dict()) |> x -> get(x, "key2", Dict()) |> x -> get(x, "key3", nothing) == "value"
  # Do something
end

Davi Sales Barreira (Oct 18 2021 at 15:44):

Thanks, the empty Dict seems like a good idea. Putting it in a function and thus allowing an arbitrary number of keys looks like a nice work around.
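
As a rough sketch of that idea (not code from the thread), the chained gets can be folded over a list of keys. The name chained_get and the helper descend are hypothetical, and the isa AbstractDict guard is an added assumption so that a non-dictionary value in the middle of the path yields the default instead of an error:

```julia
# Hypothetical helper: the empty-Dict trick generalized to any number of keys.
function chained_get(cell, keys, default = nothing)
    isempty(keys) && return cell
    # For every key but the last, substitute an empty Dict when a level
    # is missing or is not a dictionary.
    descend(d, k) = d isa AbstractDict ? get(d, k, Dict()) : Dict()
    inner = foldl(descend, keys[1:end-1]; init = cell)
    # Only the last lookup uses the caller's default.
    return inner isa AbstractDict ? get(inner, keys[end], default) : default
end

cell = Dict("key1" => Dict("key2" => Dict("key3" => true)))
found = chained_get(cell, ["key1", "key2", "key3"], false)
```

Like the hand-written chain, this allocates a fresh empty Dict each time a level is missing.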

Andrey Oskin (Oct 18 2021 at 16:00):

You may write your own get function, which supports nested getting.

Andrey Oskin (Oct 18 2021 at 16:01):

That is better than Dict(), since it allocates less and so puts less pressure on the GC.

Davi Sales Barreira (Oct 18 2021 at 16:17):

Can you give an example?

Davi Sales Barreira (Oct 18 2021 at 16:17):

I'm trying to write a get function that parses the nested dict, but I'm not finding a clever way of doing so.

Richard Reeve (Oct 18 2021 at 16:47):

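julia> function nested_get(dict::Dict, keys::AbstractVector, default = nothing)
         if isempty(keys)
           return dict
         end
         # Read the first key without popfirst!, so the caller's vector is not mutated
         key = first(keys)
         if haskey(dict, key)
           # Forward default so it is still returned when a deeper key is missing
           return nested_get(dict[key], keys[2:end], default)
         end
         return default
       end
nested_get (generic function with 2 methods)

julia> function nested_get(non_dict, keys::AbstractVector, default = nothing)
         # Reached a non-dictionary value: it is the result only if no keys remain
         if isempty(keys)
           return non_dict
         end
         return default
       end
nested_get (generic function with 4 methods)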

julia> cells = Dict{String, Any}("a" => 1)
Dict{String, Any} with 1 entry:
  "a" => 1

julia> nested_get(cells, ["key1", "key2", "key3"]) === nothing
true

julia> cells["key1"]=Dict("key2" => Dict("key3"=>1))
Dict{String, Dict{String, Int64}} with 1 entry:
  "key2" => Dict("key3"=>1)

julia> nested_get(cells, ["key1", "key2", "key3"])
1

So it'll return nothing if there's no match at any point, and it'll return the value if there's a full match. Something like that?

Richard Reeve (Oct 18 2021 at 16:51):

It's more complicated than what I suggested originally, but Andrey is right that this is a better solution (assuming that's what he had in mind!).

Andrey Oskin (Oct 18 2021 at 17:11):

Yes, that was the idea.
Instead of nothing you can provide a default value, as in the usual get function.

Andrey Oskin (Oct 18 2021 at 17:11):

Oh!

Andrey Oskin (Oct 18 2021 at 17:11):

You did :-)

Richard Reeve (Oct 18 2021 at 17:11):

Yes, I just updated it to do that!
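
To connect the default argument back to the loop in the original question, here is a minimal sketch. It uses an iterative stand-in for nested_get (an assumption, not the recursive version posted above) so that the block is self-contained, and the cells data is made up for illustration:

```julia
# Iterative stand-in for nested_get: walks the keys in order and
# returns `default` as soon as a level is missing or is not a dictionary.
function nested_get(dict, keys, default = nothing)
    value = dict
    for key in keys
        (value isa AbstractDict && haskey(value, key)) || return default
        value = value[key]
    end
    return value
end

# Made-up sample data: only the first cell has the full key path.
cells = [
    Dict("key1" => Dict("key2" => Dict("key3" => true))),
    Dict("key1" => Dict("other" => 1)),   # "key2" is missing
    Dict("a" => 1),                       # "key1" is missing
]

# With `false` as the default, a missing key simply fails the test.
matching = [cell for cell in cells
            if nested_get(cell, ["key1", "key2", "key3"], false) == true]
```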

Andrey Oskin (Oct 18 2021 at 17:12):

You can use splatting instead of a vector, but this is just a matter of taste.
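
A splatted variant might look roughly like the sketch below. The name nested_get_splat is hypothetical, and passing the default as a keyword argument is an assumption made here so that it cannot be confused with one of the keys:

```julia
# No keys left: whatever we have reached is the result.
nested_get_splat(value; default = nothing) = value

# Dictionary with at least one key left: descend if the key exists.
function nested_get_splat(dict::AbstractDict, key, rest...; default = nothing)
    haskey(dict, key) || return default
    return nested_get_splat(dict[key], rest...; default = default)
end

# Non-dictionary value but keys remain: the path does not exist.
nested_get_splat(non_dict, key, rest...; default = nothing) = default

cells = Dict{String, Any}("key1" => Dict("key2" => Dict("key3" => 1)))
hit = nested_get_splat(cells, "key1", "key2", "key3")
miss = nested_get_splat(cells, "key1", "nope", "key3"; default = 0)
```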

Davi Sales Barreira (Oct 18 2021 at 18:20):

Thanks @Richard Reeve, it works beautifully. I actually prefer the vector of keys to the splatting. :)
It gives a sense that the order is important (which it is), while just splatting gives me the sense that they are somehow parallel.

Davi Sales Barreira (Oct 18 2021 at 18:21):

This function should be in Base, it's very neat, and I constantly need something like it.

Davi Sales Barreira (Oct 18 2021 at 18:22):

And it fits perfectly as another method of get, via multiple dispatch.

Andrey Oskin (Oct 18 2021 at 18:34):

If we do some type piracy, then the vector version gives the funny notation d[["key1", "key2", "key3"]], while splatting gives d["key1", "key2", "key3"].

Honestly, I do not know which one is better.
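
Both notations can be tried without piracy by indexing a small wrapper type instead of Dict itself. This is only a sketch: NestedView is a hypothetical type introduced here, and returning nothing for a missing path is an assumption rather than something settled in the thread:

```julia
# Hypothetical wrapper, so no getindex method is added for types owned by Base.
struct NestedView{D<:AbstractDict}
    dict::D
end

# Vector form: n[["key1", "key2", "key3"]]
function Base.getindex(n::NestedView, keys::AbstractVector)
    value = n.dict
    for key in keys
        (value isa AbstractDict && haskey(value, key)) || return nothing
        value = value[key]
    end
    return value
end

# Splatted form: n["key1", "key2", "key3"], forwarded to the vector method.
Base.getindex(n::NestedView, keys...) = n[collect(keys)]

n = NestedView(Dict{String, Any}("key1" => Dict("key2" => Dict("key3" => 1))))
by_vector = n[["key1", "key2", "key3"]]
by_splat = n["key1", "key2", "key3"]
```

Defining the same methods directly on Dict would be the piracy mentioned above, and the vector form would then clash with dictionaries whose keys are themselves vectors.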


Last updated: Oct 02 2023 at 04:34 UTC