Is there a way to filter different sized DataFrames and keep only the rows of both DataFrames where there are matching values in a specific column? E.g.
Given
df1_test = DataFrame(
:col1 => ["low", "med", "high"],
:col2 => ["a1", "b1", "c1"]
)
df2_test = DataFrame(
:col1 => ["med"],
:col2 => ["b2"]
)
I would like to return only the "med"
row of each DataFrame since only "med"
of :col1
are matching
df1_new = DataFrame(
:col1 => ["med"],
:col2 => ["b1"]
)
df2_new = DataFrame(
:col1 => ["med"],
:col2 => ["b2"]
)
subset(d, :col1 => ByRow(==("med")))
or do you mean innerjoin(l,r; on=:col1)
I am looking for a more generic approach. One that will work if "low"
and "med"
are matching, for example. innerjoin
might do that, but this gives me an error
innerjoin(df1_test, df2_test; on=:col1)
ArgumentError: Duplicate variable names: :col2. Pass makeunique=true to make them unique using a suffix automatically.
var"#make_unique!#1"(::Bool, ::typeof(DataFrames.make_unique!), ::Vector{Symbol}, ::Vector{Symbol})@utils.jl:83
#make_unique#2@utils.jl:105[inlined]
#Index#5@index.jl:27[inlined]
var"#DataFrame#188"(::Bool, ::Bool, ::Type{DataFrames.DataFrame}, ::Vector{AbstractVector}, ::Vector{Symbol})@dataframe.jl:339
compose_inner_table(::DataFrames.DataFrameJoiner, ::Bool, ::typeof(identity), ::typeof(identity))@composer.jl:135
var"#_join#572"(::Symbol, ::Symbol, ::Bool, ::Nothing, ::Tuple{Bool, Bool}, ::typeof(identity), ::typeof(identity), ::Symbol, ::typeof(DataFrames._join), ::DataFrames.DataFrame, ::DataFrames.DataFrame)@composer.jl:400
#innerjoin#573@composer.jl:576[inlined]
top-level scope@Local: 1[inlined]
You can do something like:
julia> matching = df1_test.col1 ∩ df2_test.col1
1-element Vector{String}:
"med"
julia> df1_new = subset(df1_test, :col1 => ByRow(in(matching)))
1×2 DataFrame
Row │ col1 col2
│ String String
─────┼────────────────
1 │ med b1
julia> df2_new = subset(df2_test, :col1 => ByRow(in(matching)))
1×2 DataFrame
Row │ col1 col2
│ String String
─────┼────────────────
1 │ med b2
You can pass makeunique=true
like it says
Thank you! Both of those solutions work!
Dale Black has marked this topic as resolved.
Last updated: Nov 06 2024 at 04:40 UTC