Hello Everyone,
using GLM, Lathe, MLBase
I have a dataframe similar to:
DF.A = [1,2,3,4,5,6,7,8,9]
DF.B = [2,45,53,42,51,34,900,55,36]
I would like to remove the outliers from DF.B and am using
the following for the lower bound (first quartile):
first_percentile = percentile(DF.B, 25)
iqr_value = iqr(DF.B)
FirstOut = DF[DF.B .> (first_percentile - 1.5*iqr_value),:]
Similarly, I am using the following for the upper bound (third quartile):
fourth_percentile = percentile(DF.B, 75)
iqr_value = iqr(DF.B)
FourthOut = DF[DF.B .< (fourth_percentile + 1.5*iqr_value),:]
In both resulting dataframes, these approaches have not removed all of
the outliers. Could someone explain how I could adjust my code to
produce a single dataframe without outliers in the specified
column (with the associated rows removed as well)?
Thanks,
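One possible adjustment, offered as a minimal sketch rather than a definitive answer: each filter above enforces only one fence, so `FirstOut` still contains the high outlier and `FourthOut` still contains the low one. Applying both conditions in a single mask drops both. The sketch assumes the column values shown in the post, with `A` taken as 1 through 9 so both columns have the same length. It uses `quantile` from the `Statistics` standard library in place of `percentile` and `iqr`, and the names `lower`, `upper`, and `NoOutliers` are illustrative only.

```julia
using DataFrames, Statistics

# Assumed example data: B as given in the post, A taken as 1:9 to match its length
DF = DataFrame(A = 1:9, B = [2, 45, 53, 42, 51, 34, 900, 55, 36])

# First and third quartiles (25th and 75th percentiles) of B
q1, q3 = quantile(DF.B, [0.25, 0.75])
iqr_value = q3 - q1

# Fences at 1.5 * IQR below the first quartile and above the third quartile
lower = q1 - 1.5 * iqr_value
upper = q3 + 1.5 * iqr_value

# Keep only rows whose B value lies strictly inside both fences
NoOutliers = DF[(DF.B .> lower) .& (DF.B .< upper), :]
```

With this data the fences come out at 10.5 and 78.5, so the rows with `B` equal to 2 and 900 are dropped and the other seven rows are kept.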
Last updated: Oct 02 2023 at 04:34 UTC