Stream: helpdesk (published)

Topic: Outlier Removal


qu bit (Mar 18 2021 at 04:56):

Use filter together with percentile to remove
outliers with the 1.5 * IQR rule:

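using DataFrames, StatsBase   # percentile and iqr come from StatsBase

begin
    first_perc = percentile(DF.B, 25)   # first quartile (Q1)
    last_perc = percentile(DF.B, 75)    # third quartile (Q3)
    IQR_value = iqr(DF.B)               # Q3 - Q1
    # keep rows whose B lies strictly inside the 1.5*IQR fences
    DF_NO = filter(x -> x.B > first_perc - 1.5*IQR_value &&
                 x.B < last_perc + 1.5*IQR_value, DF)
end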

You can then compare the :std statistic from describe()
before and after filtering to see how much the spread
changed.
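
As a minimal sketch of that before/after check, the snippet below runs the same IQR filter on a small made-up column and compares the standard deviations. The toy data and the names q1, q3 and fence are assumptions for illustration only; they are not from the original post.

```julia
using DataFrames, StatsBase, Statistics

# Toy data (hypothetical): nine ordinary values plus one obvious outlier
DF = DataFrame(B = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 100.0])

q1 = percentile(DF.B, 25)      # first quartile
q3 = percentile(DF.B, 75)      # third quartile
fence = 1.5 * iqr(DF.B)        # width of the IQR fence

# Column-wise form of the same filter: keep rows with B inside the fences
DF_NO = filter(:B => b -> q1 - fence < b < q3 + fence, DF)

# describe() with :std reports only the standard deviation per column
println(describe(DF, :std))
println(describe(DF_NO, :std))

# The same comparison directly on the column vectors
println((before = std(DF.B), after = std(DF_NO.B)))
```

A large drop in :std only shows that extreme values were removed; whether those rows should be dropped is a separate judgment.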

Fredrik Bagge Carlson (Mar 18 2021 at 05:30):

Have a look at
https://github.com/jbytecode/LinRegOutliers

qu bit (Mar 24 2021 at 21:40):

Fredrik Bagge Carlson

Thanks for sharing.

I wanted to extend the question a little:
could the detectOutliers() method be used
to remove rows by index from a DataFrame
where outliers were detected (i.e. via
Cook's distance)?

Best,

@Andrey Oskin

qu bit (Mar 25 2021 at 00:10):

Fredrik Bagge Carlson

Could you provide an example of how this could
be applied to the question I posted?

Thank you,
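
One way such row removal could look is sketched below. It does not call LinRegOutliers; instead it computes Cook's distance by hand for a simple linear fit, so that the index-based removal step is explicit. The toy data, the 4/n cutoff (a common rule of thumb) and all variable names are assumptions for illustration.

```julia
using DataFrames, LinearAlgebra

# Hypothetical data: y is roughly 2x, except the last row, which is far off
df = DataFrame(x = collect(1.0:8.0),
               y = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.0, 40.0])

n = nrow(df)
X = hcat(ones(n), df.x)              # design matrix with an intercept column
p = size(X, 2)                       # number of fitted coefficients

beta = X \ df.y                      # ordinary least squares fit
resid = df.y .- X * beta             # residuals
s2 = sum(abs2, resid) / (n - p)      # residual variance estimate
h = diag(X * inv(X' * X) * X')       # leverages (diagonal of the hat matrix)

# Cook's distance for each observation
cooks_d = (resid .^ 2 ./ (p * s2)) .* (h ./ (1 .- h) .^ 2)

# Row indices above the cutoff (4/n is an assumed rule of thumb)
outlier_idx = findall(>(4 / n), cooks_d)

# Drop those rows by index; Not() is re-exported by DataFrames
df_clean = df[Not(outlier_idx), :]
```

If a detection routine returns a vector of row indices, the last line should work the same way with that vector in place of outlier_idx; whether a given LinRegOutliers function returns indices in that form would need to be checked against its documentation.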


Last updated: Oct 02 2023 at 04:34 UTC