Stream: helpdesk (published)

Topic: Outlier Removal Using IQR


qu bit (Mar 15 2021 at 14:21):

Hello Everyone,

using GLM, Lathe, MLBase

I have a dataframe similar to:

DF.A = [1,2,3,4,5,6,7,8]
DF.B = [2,45,53,42,51,34,900,55,36]

I would like to remove the outliers from
DF.B and am using (for the 1st quartile):

    first_percentile = percentile(DF.B, 25)
    iqr_value = iqr(DF.B)
    FirstOut = DF[DF.B  .> (first_percentile - 1.5*iqr_value),:]

Similarly, I am using (for the 4th quartile):

    fourth_percentile = percentile(DF.B, 75)
    iqr_value = iqr(DF.B)
    FourthOut = DF[DF.B  .< (fourth_percentile + 1.5*iqr_value),:]

For both dataframes, these approaches have not removed the
outliers. Could someone explain how I could adjust my code to
produce a resultant dataframe without outliers in the specified
column (and associated rows)?

Thanks,
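
A possible adjustment, offered as a sketch rather than a definitive fix: each filter above applies only one fence, so FirstOut still contains the high value 900 and FourthOut still contains the low value 2. Combining both conditions in a single mask drops both. The sketch assumes DF is a DataFrames.jl DataFrame and that percentile and iqr come from StatsBase (neither package appears in the using line above). Column A is padded to nine values, also an assumption, so that both columns have the same length. The names q1, q3, lower, upper and NoOutliers are illustrative.

```julia
using DataFrames, StatsBase

# Assumption: A is extended to nine values so both columns have equal length.
DF = DataFrame(A = 1:9, B = [2, 45, 53, 42, 51, 34, 900, 55, 36])

q1 = percentile(DF.B, 25)      # first quartile
q3 = percentile(DF.B, 75)      # third quartile
iqr_value = iqr(DF.B)          # interquartile range, q3 - q1

lower = q1 - 1.5 * iqr_value   # lower fence
upper = q3 + 1.5 * iqr_value   # upper fence

# Keep only the rows whose B value lies strictly inside both fences.
NoOutliers = DF[(DF.B .> lower) .& (DF.B .< upper), :]
```

With the sample data the fences come out at roughly 10.5 and 78.5, so the rows with B = 2 and B = 900 are dropped and seven rows remain.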


Last updated: Oct 02 2023 at 04:34 UTC