# RE: [R] Temporal Analysis of variable x; How to select the outlier threshold in R?

• bogdan romocea <br44114@yahoo.com> wrote:

> You have financial data and want to throw away some
> outliers??
> Why would you ever do this?

I want to select an outlier threshold in order to extract the subset of "x" that shows a significant difference in financial contributions over a two-year span. "x" is a variable holding the dollar-value change in allocations to an account over that two-year period.

> First of all, I'd suggest you pay close attention to
> what the data is
> trying to say. Maybe your distribution is not normal
> after all (see
> tests for normality etc). Maybe you shouldn't force
> assumption upon the data.

A plot from qq.plot(x) or qqnorm(x) indicated that the data were not normally distributed. I also ran shapiro.test(x), which gave a p-value << 0.05.
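
For anyone following along, here is a minimal sketch of those two checks in base R. The vector "x" below is simulated, right-skewed data that I made up for illustration; it is not the financial data discussed in this thread, and the 0.05 cutoff is just the conventional level mentioned above.

```r
# Simulated stand-in for "x" (an assumption, NOT the real data):
# right-skewed values, roughly like dollar amounts
set.seed(1)
x <- rlnorm(200, meanlog = 8, sdlog = 1)

# Theoretical vs. sample quantiles. Drop plot.it = FALSE to draw the
# Q-Q plot, then call qqline(x) to add the reference line.
qq <- qqnorm(x, plot.it = FALSE)

# Shapiro-Wilk test: a small p-value is evidence against normality
sw <- shapiro.test(x)
sw$p.value < 0.05
```

Points that bend away from the qqline(x) reference line in the tails are the visual counterpart of the small p-value.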

To select the outlier threshold, I ended up using the following: outlier_threshold <- quantile(x, 3/4) + 1.5 * IQR(x)
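
A rough sketch of how that threshold can be used to pull out the flagged values. The data are invented for illustration, and the lower fence (first quartile minus 1.5 * IQR) is my own addition for symmetry; the line above only defines the upper one.

```r
# Made-up example data (an assumption): mostly small changes plus a few
# extreme values
set.seed(1)
x <- c(rnorm(95, mean = 0, sd = 100), 2000, 2500, -1800, 3000, 5000)

# Quartiles and interquartile range
q <- quantile(x, c(0.25, 0.75), names = FALSE)
iqr <- IQR(x)

# Upper threshold as in the formula above; the lower one is an addition
upper_threshold <- q[2] + 1.5 * iqr
lower_threshold <- q[1] - 1.5 * iqr

# Subset of x falling outside the thresholds
outliers <- x[x > upper_threshold | x < lower_threshold]
outliers
```

boxplot.stats(x)$out applies a closely related rule (it uses hinges rather than quantile()), so the two can disagree slightly on borderline points.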

-Melanie

> For a financial data set with large variance, I'm trying to find the
> outlier threshold of one variable "x" over a two year period. I
> qqplot(x2001, x2002) and found a normal distribution. The latter part
> of the normal distribution did not look linear though. Is there a
> suitable method in R to find the outlier threshold of this variable
> from 2001 and 2002 in R?
