Fredrik Bagge Carlson
Thanks for sharing.
I wanted to extend the question a little
and was curious if the detectOutliers()
method could be used to remove rows
from a DataFrame based on row values
of another DataFrame.
An initial approach:
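function mysetdiff(y, x)
    # Elements of y that do not appear in x, in their original order.
    # The result is grown with push! because length(y) - length(x) is only
    # the right size when every element of x occurs exactly once in y.
    exclude = Set(x)
    res = eltype(y)[]
    for el in y
        el ∈ exclude && continue
        push!(res, el)
    end
    return res
end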
Or if possible:
i = (Vector of Row Value Indices)
df2 = df1[Not(i), :]   # for a single index i: df1[[1:(i-1); (i+1):end], :]
How can I delete rows from DF1
based on rows of DF2 where the
elements of DF2 are outliers in
DF1?
I believe Query.jl has the least
computationally expensive method
to approach this with @filter or
@join, would you agree?
@Andrey Oskin
Best,
(deleted)
Not quite sure what you want, but it looks like a typical use case for antijoin
https://dataframes.juliadata.org/stable/man/joins/#Database-Style-Joins
But you need a key to join them, of course.
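A minimal sketch of what that could look like, assuming both tables share an `id` key column. The column names and values here are made up for illustration, and `df2` is assumed to already hold the rows flagged as outliers:

```julia
using DataFrames

# Hypothetical data: df1 holds all observations,
# df2 holds the rows that were flagged as outliers.
df1 = DataFrame(id = 1:5, value = [1.0, 2.0, 100.0, 3.0, -50.0])
df2 = DataFrame(id = [3, 5], value = [100.0, -50.0])

# Keep only the rows of df1 whose key has no match in df2.
cleaned = antijoin(df1, df2, on = :id)
```

If there is no dedicated key, joining on every column with `on = names(df1)` may be an option, as long as both tables have the same column names.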
Last updated: Nov 06 2024 at 04:40 UTC