# Re: [R] R newbie: logical subsets

From: Greg Snow <Greg.Snow_at_intermountainmail.org>
Date: Thu 13 Jul 2006 - 05:20:02 EST

iris2 <- iris
iris2$m <- ave(iris2$Sepal.Length, iris2$Species)
iris2$s <- ave(iris2$Sepal.Length, iris2$Species, FUN=sd)

iris2 <- transform(iris2, z= (Sepal.Length-m)/s)

iris2.2 <- subset(iris2, abs(z) < 2)

aggregate(iris2.2, list(iris2.2$Species), FUN=mean)
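
For the data described in the original question, the same ave() pattern extends to two grouping variables. The following is a minimal sketch, assuming a data frame named trans with columns city, type, and price as in the question; the values are made up for illustration, and trans2 and result are hypothetical names.

```r
# Hypothetical stand-in for the poster's data; column names follow the question.
set.seed(1)
trans <- data.frame(
  city  = rep(c("Seattle", "Tacoma"), each = 40),
  type  = rep(rep(c("A", "B"), each = 20), times = 2),
  price = round(rnorm(80, mean = 100, sd = 10), 2)
)
trans$price[c(1, 45)] <- c(500, 5)  # plant two obvious outliers

# Step 1: attach each row's group mean and sd, grouped by city and type.
trans$m <- ave(trans$price, trans$city, trans$type)
trans$s <- ave(trans$price, trans$city, trans$type, FUN = sd)

# Step 2: keep rows within 2 sd of their own group's mean.
trans2 <- subset(trans, abs(price - m) < 2 * s)

# Step 3: final mean by city and type on the trimmed data.
result <- aggregate(price ~ city + type, data = trans2, FUN = mean)
print(result)
```

Because ave() returns a vector as long as its input, the group statistics line up row by row with the prices, so no lookup into a table of means is needed.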

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow@intermountainmail.org
(801) 408-8111

-----Original Message-----
From: r-help-bounces@stat.math.ethz.ch
[mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Gabor
Grothendieck
Sent: Tuesday, July 11, 2006 1:06 PM
To: Joshua Tokle
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] R newbie: logical subsets

Try this, using the built in anscombe data set:

anscombe[!rowSums(abs(scale(anscombe)) > 2),]
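
The one-liner above packs several steps into a single expression; unpacked, it might read as follows. The intermediate names (z, far, n_far, keep, trimmed) are introduced here for illustration only. Note that scale() standardizes each column over the whole data set rather than within groups, and a row is dropped if any of its columns is flagged.

```r
# anscombe ships with R: 11 rows, columns x1..x4 and y1..y4.
z       <- scale(anscombe)   # centre each column and divide by its sd
far     <- abs(z) > 2        # TRUE where a value lies more than 2 sd from its column mean
n_far   <- rowSums(far)      # number of flagged columns in each row
keep    <- !n_far            # a count of 0 becomes TRUE, any positive count FALSE
trimmed <- anscombe[keep, ]

# The unpacked version selects the same rows as the one-liner.
stopifnot(identical(trimmed, anscombe[!rowSums(abs(scale(anscombe)) > 2), ]))
```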

On 7/11/06, Joshua Tokle <jtokle@math.washington.edu> wrote:

> Hello!  I'm a newcomer to R hoping to replace some convoluted database
> code with an R script.  Unfortunately, I haven't been able to figure
> out how to implement the following logic.
>
> Essentially, we have a database of transactions that are coded with a
> geographic locale and a type.  These are being loaded into a
> data.frame with named variables city, type, and price.  E.g.,
> trans$city and all that.
>
> We want to calculate mean prices by city and type, AFTER excluding
> outliers.  That is, we want to calculate the mean price in 3 steps:
>
> 1. calculate a mean and standard deviation by city and type over all
>    transactions
> 2. create a subset of the original data frame, excluding transactions
>    that differ from the relevant mean by more than 2 standard deviations
> 3. calculate a final mean by city and type based on this subset.
>
> I'm stuck on step 2.  I would like to do something like the following:
>
> fs <- list(factor(trans$city), factor(trans$type))
> means <- tapply(trans$price, fs, mean)
> stdevs <- tapply(trans$price, fs, sd)
>
> filter <- abs(trans$price - means[trans$city, trans$type]) <
>             2*stdevs[trans$city, trans$type]
>
> sub <- subset(trans, filter)
>
> The above code doesn't work.  What's the correct way to do this?
>
> Thanks,
> Josh
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> http://www.R-project.org/posting-guide.html
>
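
One likely reason the quoted attempt fails: indexing a matrix with two separate vectors, as in means[trans$city, trans$type], returns every row index crossed with every column index rather than one value per transaction. A possible fix that keeps the tapply() approach is to index with a two-column matrix instead. This is a sketch on made-up data; idx, keep, trimmed, and final are hypothetical names (filter and sub are avoided because they mask existing R functions).

```r
# Hypothetical data with the columns named in the question.
trans <- data.frame(
  city  = rep(c("X", "Y"), each = 16),
  type  = rep(rep(c("A", "B"), each = 8), times = 2),
  price = c(10, 11,  9, 10, 11, 10,  9, 100,  # X/A, last value is an outlier
            30, 31, 29, 30, 31, 30, 29, 30,   # X/B
            20, 21, 19, 20, 21, 20, 19, 20,   # Y/A
            40, 41, 39, 40, 41, 40, 39, 40)   # Y/B
)

fs     <- list(trans$city, trans$type)
means  <- tapply(trans$price, fs, mean)  # city-by-type matrix of means
stdevs <- tapply(trans$price, fs, sd)    # city-by-type matrix of sds

# A two-column character matrix picks one cell per row of trans,
# matching the first column to row names and the second to column names.
idx <- cbind(as.character(trans$city), as.character(trans$type))

keep    <- abs(trans$price - means[idx]) < 2 * stdevs[idx]
trimmed <- trans[keep, ]

# Final means by city and type on the trimmed data.
final <- tapply(trimmed$price, list(trimmed$city, trimmed$type), mean)
print(final)
```

The as.character() calls matter when city and type are stored as factors: cbind() on factors would otherwise produce their integer codes rather than the labels.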
