I posted this on Slack helpdesk yesterday but I think Zulip lends itself better to slightly more open conceptual questions and I'll try and simplify the question (at the risk of creating an X-Y problem):
I'm trying to leverage multiple dispatch to write a hierarchy of functions for a package of mine (TreatmentPanels.jl) with the ultimate goal of providing a sort of "mini-DSL" to the user.
Basically the user can specify treatment units (identified by a String/Symbol identifier) and time periods in which they are treated (either TimeType or Int). I'm trying to write a dispatch hierarchy which breaks up more complex combinations of units and time periods into simple one unit -> one time period components with which I then call my "kernel" function.
This all works reasonably well but I'm hitting a wall with the most general case which is this:
julia> typeof(["b" => Date(1991), "d" => Date(1991) => Date(1992)])
Vector{Pair{String, Any}}
so if there are different units with some of them being assigned only one treatment period (which in my mini-DSL means "all periods after this date") and others a start/end range, type inference taps out and turns the right hand side of my Pairs into Any. That means my whole dispatch house of cards falls down, as I can't define a "fallback" method for Any on the right hand side because I need to distinguish between the Date(1991) and Date(1991) => Date(1992) case.
What do I do? Is this just a bad design/idea in general? Or should I fall back to if right_hand_side isa Date ... "manual" dispatch?
Yeah, that's an interesting problem. So basically you want a single date and a date range to be the same sort of object, right?
Does this hierarchy need to be extensible?
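On the fallback question above: a fallback on the outer Pair does not necessarily lose the distinction. A method that accepts any Pair with a String key can unpack it and call the kernel again, and Julia then dispatches at runtime on the concrete type of the right-hand side. A minimal sketch, where treat and its return values are made-up names for illustration and not TreatmentPanels.jl API:

```julia
using Dates

# Hypothetical kernel methods, one per right-hand-side shape.
treat(unit::String, start::Date) = (unit, start, nothing)                  # open-ended treatment
treat(unit::String, span::Pair{Date,Date}) = (unit, first(span), last(span))

# Fallback for any Pair with a String key, including Pair{String,Any}:
# unpack it and let runtime dispatch pick the right kernel method.
treat(p::Pair{String}) = treat(first(p), last(p))

# Vector of pairs: handle each element separately.
treat(ps::AbstractVector{<:Pair}) = map(treat, ps)

res = treat(["b" => Date(1991), "d" => Date(1991) => Date(1992)])
```

This pays for one dynamic dispatch per element, which should not matter much for a handful of user-supplied specifications.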
I don't feel like I really understand what you're trying to do here, but the first thing that comes to mind is that you probably simply can't do anything like that with array construction syntax. Arrays will be forced to broaden the type and of course you will wind up with Any, that's just how arrays work in Julia. If you want to use array construction syntax you'll have to pipe stuff through functions which decide on the appropriate types first. At most you can use a macro to make this look like array construction syntax.
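A sketch of the "pipe it through a function" idea, assuming a hypothetical helper called narrow (not part of Base or any package). It rebuilds each pair so the right-hand side regains its concrete type, then collects the result into a vector typed with the union of the element types actually present:

```julia
using Dates

# Hypothetical helper, named here for illustration only.
function narrow(v::AbstractVector{<:Pair})
    # Re-creating each pair recovers the concrete type of its right-hand side.
    rebuilt = Tuple(first(p) => last(p) for p in v)
    # Union of the element types that actually occur.
    T = Union{map(typeof, rebuilt)...}
    return T[rebuilt...]
end

raw = ["b" => Date(1991), "d" => Date(1991) => Date(1992)]
narrowed = narrow(raw)
```

The helper itself is not type stable, since the element type depends on runtime values, so it is best called once at the API boundary before handing the result to typed methods.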
Yes I think ultimately my issue boils down to
julia> [1=>2, 1]
2-element Vector{Any}:
1 => 2
1
when I need
julia> Union{Pair{Int,Int}, Int}[1=>2, 1]
2-element Vector{Union{Int64, Pair{Int64, Int64}}}:
1 => 2
1
Because, Mason, to your question, the underlying function I want to dispatch to works differently for String => Date vs String => Pair{Date, Date}.
But maybe I just need to change the API to have multiple arguments for the multiple unit case.
And maybe I should have been clearer: my "mini-DSL" is supposed to allow the user to specify:
"A" => Date(2000) # single unit treated for all periods from 2000 onwards
"A" => Date(2000) => Date(2005) # single unit treated from 2000-2005
"A" => [Date(2000) => Date(2005), Date(2008) => Date(2010), Date(2015)] # single unit treated from 2000-2005, 2008-2010, and all periods from 2015 onwards
["A", "B"] => Date(2000) # two units treated from 2000 onwards
You can't make that type stable (or even promote to the narrowest union type) using array construction syntax. You either have to use a tuple, make them arguments, or make a macro.
But currently I write the "multiple units, different treatment timings" case as:
["A" => [Date(2000) => Date(2005), Date(2008)], "B" => [Date(2001) => Date(2006), Date(2010)]]
You may also be interested in IntervalSets.jl which has nice interval types with the convenient .. syntax.
Yes, one can quarrel about the "represent from-to as a pair of dates" choice, but the issue is the same if I use date ranges. I'll think about whether Intervals make sense. With dates things can get a bit tricky because often the interval needs to be specified (monthly, daily, weekly...), so currently I just use start and end points and capture any dates in between, irrespective of the interval between them.
I'll probably have to consider writing the multiple unit case as separate arguments. My function currently takes an input matrix which it mutates and then the unit/time period specification as the second argument.
I suppose if I change
my_fun(X, ["A" => [2000 => 2005, 2008], "B" => [2001 => 2006, 2010]])
which doesn't work because of the resulting Vector{Any}, to
my_fun(X, "A" => [2000 => 2005, 2008], "B" => [2001 => 2006, 2010])
I can get away with it, just not really sure how to write that (and whether slurping those multiple args will give me the same problem?)
No, that will not have the same problem, that will result in a Tuple which has type slots for each component. The only problem with that is if you have a very large number of arguments it can get very expensive to compile. You can also use a tuple if you'd like, it is equivalent.
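A rough sketch of what the slurping version could look like. The names period and periods are hypothetical helpers, and this my_fun ignores X and just returns descriptions, whereas the real function would mutate the matrix:

```julia
# Hypothetical kernel: one unit, one period specification.
period(unit, start::Int) = "$(unit): from $(start) onwards"
period(unit, span::Pair{Int,Int}) = "$(unit): $(first(span)) to $(last(span))"

# Hypothetical helper: treat a lone spec like a one-element collection.
periods(x::AbstractVector) = x
periods(x) = (x,)

# Varargs entry point: specs arrives as a Tuple with one type slot per argument.
function my_fun(X, specs::Pair...)
    out = String[]
    for spec in specs
        for p in periods(last(spec))
            push!(out, period(first(spec), p))   # dispatches on the type of p
        end
    end
    return out
end

out = my_fun(nothing, "A" => [2000 => 2005, 2008], "B" => 2001)
```

The inner [2000 => 2005, 2008] is still a Vector{Any} here, so the per-period call is resolved at runtime; only the outer level gets a precise type from the slurped tuple.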
But you still have arrays which mix Pair{Int,Int} with Int, I'd change those to a tuple as well. They seem like they are guaranteed short, so there should be no problem using a tuple there.
For example
julia> ("A" => (2000=>2005, 2008), "B" => (2001=>2006, 2010)) |> typeof
Tuple{Pair{String, Tuple{Pair{Int64, Int64}, Int64}}, Pair{String, Tuple{Pair{Int64, Int64}, Int64}}}
There are annoyingly many foot-guns when it comes to actually iterating over tuples in Julia however. That's potentially a reason to convert them to (properly typed) arrays later, but I'm still not really clear on what you're trying to do so I'm not sure if that's appropriate here.
I suppose an issue with the tuples is that in my use case, functionally [2000=>2005, 2008] and [2000=>2007, 2009=>2010, 2012] are equivalent, so dispatching on a Vector{Union{Date, Pair{Date, Date}}} is a useful thing to do, while in the tuple design those two things would have a different type?
Yes they would be different, so you'd probably need some kind of post-processing step.
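One possible shape for such a post-processing step, as a sketch only: TreatmentSpan and to_spans are hypothetical names, and the idea is simply to funnel tuples of any length into a single vector type that methods can then dispatch on.

```julia
using Dates

# Hypothetical alias for "a start date, or a start => end pair".
const TreatmentSpan = Union{Date, Pair{Date,Date}}

# Tuples of any length, and lone specs, all end up as Vector{TreatmentSpan}.
to_spans(ps::Tuple{Vararg{TreatmentSpan}}) = TreatmentSpan[ps...]
to_spans(p::TreatmentSpan) = TreatmentSpan[p]

a = to_spans((Date(2000) => Date(2005), Date(2008)))
b = to_spans((Date(2000) => Date(2007), Date(2009) => Date(2010), Date(2012)))
```

Both a and b have the same type even though the input tuples do not, so downstream methods only need to handle Vector{TreatmentSpan}.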
It's starting to sound to me a bit like what you want is to create an appropriate struct for the arguments. There are still potential type promotion issues with it, but if you can create one with appropriate union types you can still pull it off:
struct Arg
name::String
inner::Vector{Union{Date,Pair{Date,Date}}}
end
or whatever (I think that's probably not quite what you're looking for but might give you an idea). You could then make your argument a Vector{Arg} or whatever it is you want.
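To make that a bit more concrete, here is a sketch with two convenience constructors added on top of the struct. The constructors are assumptions for illustration. The default constructor already calls convert on its arguments, so a Vector{Any} is accepted as long as every element is a Date or a Pair{Date,Date}:

```julia
using Dates

struct Arg
    name::String
    inner::Vector{Union{Date,Pair{Date,Date}}}
end

# Hypothetical convenience constructors (not from the discussion above):
# wrap a lone spec in a one-element vector, and accept `name => spec` pairs.
Arg(name, spec::Union{Date,Pair{Date,Date}}) = Arg(name, [spec])
Arg(p::Pair) = Arg(first(p), last(p))

# The literal below is a Vector{Pair{String,Any}}, but broadcasting the
# constructor over it gives a concretely typed Vector{Arg}.
args = Arg.(["A" => [Date(2000) => Date(2005), Date(2008)], "B" => Date(2001)])
```

An element of any other type makes the conversion throw, which doubles as input validation at the API boundary.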
Last updated: Nov 06 2024 at 04:40 UTC