Hello Coders:
I have two tables. Table 1:
T1.A = [1,2,3,4,5]
T1.B = ["Fri", "Sat", "Wed", "Fri", "Mon"]
And Table 2:
T2.A = ["Monday", "Tuesday", "Wednesday",
"Thursday", "Friday", "Saturday", "Sunday"]
T2.B = [0.25, 0.35, 0.50, 0.15, 0.30, 0.75, 0.65]
How can I create a new column in T1 as T1.C that
is populated with T2.B values where T1.B matches?
The general Query.jl approach might be:
T1.C = @from i in T1 begin
@join j in T2 on i.a T1.B equals startswith(T2.A, i, n=1:3)
@select {T1.A,T1.B,T2.A,T2.B}
@collect Dataframe
Is there a way to execute this using DataFrames
and or some higher-order function?
Thank you,
Best way to proceed is to build temporary column and join on it.
DataFrames documentation do not have information about joining with random conditions. And it is considered bad practice anyway, since it generates nested loops.
Thank you for this @Andrey Oskin .
Did you examine the Query.jl SQL-LINQ
code I provided?
Nope, but I suppose they went cross-join + filter road. It's not much you can do, if you have no information about keys.
Standard way to join is sorting over keys with the following double cursor iteration.
So it is O(n log(n))
If you do not know anything in advance, then you have to take all possible pairs and it is O(n^2)
operations (assuming that dataframes has roughly the same size).
So it is either speed or convenience.
So, you can do something like this
using DataFrames
using Chain
T1 = DataFrame("A" => [1,2,3,4,5], "B" => ["Fri", "Sat", "Wed", "Fri", "Mon"])
T2 = DataFrame("A" => ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"], "B" => [0.25, 0.35, 0.50, 0.15, 0.30, 0.75, 0.65])
@chain T2 begin
transform(:A => (y -> (x -> x[1:3]).(y)) => :A3)
innerjoin(T1, on = [:A3 => :B], makeunique = true)
select(["A_1", "A3", "A", "B"])
rename("A_1" => "T1A", "A3" => "T1B", "A" => "T2A", "B" => "T2B")
end
5×4 DataFrame
Row │ T1A T1B T2A T2B
│ Int64 String String Float64
─────┼───────────────────────────────────
1 │ 5 Mon Monday 0.25
2 │ 3 Wed Wednesday 0.5
3 │ 1 Fri Friday 0.3
4 │ 4 Fri Friday 0.3
5 │ 2 Sat Saturday 0.75
Andrey Oskin said:
Nope, but I suppose they went cross-join + filter road. It's not much you can do, if you have no information about keys.
Yes -- this was my code. However, the startswith() method does not work
in its current placement. I am glad to some level it could convey the issue.
Yes -- this is excellent @Andrey Oskin .
Thank you.
@Andrey Oskin
In this example, what does the @transform line do?
And how did A mutate to A_1?
Last updated: Nov 06 2024 at 04:40 UTC