Re: [R] R newbie: logical subsets

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Wed 12 Jul 2006 - 05:06:23 EST

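Try this, using the built-in anscombe data set. It keeps only the rows in
which no column lies more than 2 standard deviations from its column mean: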

anscombe[!rowSums(abs(scale(anscombe)) > 2),]
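
For readers new to R, the one-liner above can be unpacked into separate steps. This is only a sketch of the same computation; the intermediate names (z, far, n_far, kept) are made up here for illustration and do not appear in the original post.

```r
# Standardise each column: subtract the column mean, divide by the column sd.
z <- scale(anscombe)

# TRUE wherever a value lies more than 2 standard deviations from its column mean.
far <- abs(z) > 2

# Count the flagged values in each row; a count of 0 means no outliers in that row.
n_far <- rowSums(far)

# !n_far is TRUE only where the count is 0, so this keeps the outlier-free rows.
kept <- anscombe[!n_far, ]
```

Note that this standardises each column over the whole data set, not within groups, so it would still need adapting for per-city, per-type cutoffs.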

On 7/11/06, Joshua Tokle <jtokle@math.washington.edu> wrote:
> Hello! I'm a newcomer to R hoping to replace some convoluted database
> code with an R script. Unfortunately, I haven't been able to figure out
> how to implement the following logic.
>
> Essentially, we have a database of transactions that are coded with a
> geographic locale and a type. These are being loaded into a data.frame
> with named variables city, type, and price. E.g., trans$city and all
> that.
>
> We want to calculate mean prices by city and type, AFTER excluding
> outliers. That is, we want to calculate the mean price in 3 steps:
>
> 1. calculate a mean and standard deviation by city and type over all
> transactions
> 2. create a subset of the original data frame, excluding transactions that
> differ from the relevant mean by more than 2 standard deviations
> 3. calculate a final mean by city and type based on this subset.
>
> I'm stuck on step 2. I would like to do something like the following:
>
> fs <- list(factor(trans$city), factor(trans$type))
> means <- tapply(trans$price, fs, mean)
> stdevs <- tapply(trans$price, fs, sd)
>
> filter <- abs(trans$price - means[trans$city, trans$type]) <
> 2*stdevs[trans$city, trans$type]
>
> sub <- subset(trans, filter)
>
> The above code doesn't work. What's the correct way to do this?
>
> Thanks,
> Josh
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
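
The three steps in the quoted question can probably be handled with ave(), which returns a group statistic repeated for every row of its group. The sketch below is one possible approach, not the poster's method: the sample data is invented, and the names grp_mean, grp_sd, keep, and final are hypothetical. The quoted attempt fails because means[trans$city, trans$type] indexes the matrix with two vectors and so returns an n-by-n block rather than one value per row; a two-column index matrix gives the per-row lookup instead, as shown near the end.

```r
# Invented sample data standing in for the real 'trans' data frame.
base <- c(10, 11, 12, 10, 11, 12, 10, 11, 12, 11)
trans <- data.frame(
  city  = rep(c("A", "B"), each = 20),
  type  = rep(rep(c("x", "y"), each = 10), times = 2),
  price = c(base, base + 5, base + 10, base + 15)
)
trans$price[10] <- 100  # one obvious outlier in group A/x

# Step 1: group mean and sd, one value per row of trans.
grp_mean <- ave(trans$price, trans$city, trans$type, FUN = mean)
grp_sd   <- ave(trans$price, trans$city, trans$type, FUN = sd)

# Step 2: keep rows within 2 standard deviations of their own group mean.
keep <- abs(trans$price - grp_mean) < 2 * grp_sd
sub  <- trans[keep, ]

# Step 3: final means by city and type on the filtered data.
final <- tapply(sub$price, list(sub$city, sub$type), mean)

# Alternative per-row lookup from the tapply() matrix, using a two-column
# character index matrix (row name, column name) instead of two vectors.
means  <- tapply(trans$price, list(trans$city, trans$type), mean)
lookup <- means[cbind(as.character(trans$city), as.character(trans$type))]
```

A group containing a single transaction has an sd of NA, so keep would be NA for that row; such groups would need separate handling.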



Received on Wed Jul 12 05:13:20 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 12 Jul 2006 - 06:17:25 EST.
