From: Duncan Murdoch <murdoch_at_stats.uwo.ca>

Date: Sun 09 Jul 2006 - 07:19:12 EST

>> On 7/8/2006 3:44 PM, justin rapp wrote:

>>> I apologize for my constant questions but I am new to R and trying to
>>> gain an appreciation for its capabilities. The following task is easy
>>> in Excel and I was hoping somebody could give me a quick explanation
>>> for how it can be achieved in R so I can avoid having to switch
>>> between the two applications.
>>>
>>> How do I find the summary statistics in one vector of the data frame by
>>> levels in another of the vectors?
>>>
>>> For example, I have the following headings for my data.frame:
>>> Conference
>>> Year Drafted
>>> Height
>>> Weight
>>> Ratio
>>>
>>> I would like to compute the mean Height, Weight, and Ratio as well
>>> as their variances for each of the years under Year
>>> Drafted (1980-2000). What is the most efficient way of doing this?
>>
>> I think the quickest is
>>
>> by(mydf, mydf$Year, summary)
>>
>> but this won't give you the variance. You'll need your own little
>> function to calculate mean and variance, e.g.
>>
>> mysummary <- function(df) apply(df, 2,
>>     function(x) c(mean=mean(x), variance=var(x)))
>>
>> by(mydf, mydf$Year, mysummary)
>>
>> If you don't like the format of the output, you can play around with the
>> mysummary function. It will be applied to each subset of the
>> data.frame, and the results will be put together into a list with one
>> entry per level of mydf$Year.
>>
>> Duncan

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Sun Jul 09 08:33:51 2006


On 7/8/2006 4:55 PM, justin rapp wrote:

> When I attempt to use the mysummary function, I obtain the following error:
>
> Error in var(x) : missing observations in cov/cor

var() gives that error if it sees NA values. You can get it to remove them by using

var(x, na.rm = TRUE)

instead of var(x). Whether that makes sense depends on the context of your problem.

Duncan Murdoch
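To illustrate the na.rm suggestion in the reply above, here is a small sketch with made-up data. Note that mean() also returns NA when its input contains missing values, so a summary function would probably want na.rm = TRUE in both calls. The variant of mysummary below is an assumption about how one might adapt it, not the code from the thread, and it presumes every column is numeric.

```r
# Made-up vector with one missing value.
x <- c(1, 2, NA, 4)

mean(x)                # NA: mean() propagates missing values by default
var(x, na.rm = TRUE)   # variance of the non-missing values 1, 2, 4

# Variant of the thread's mysummary that drops NAs for both statistics.
# Assumes every column of df is numeric.
mysummary <- function(df) {
  sapply(df, function(x) {
    c(mean = mean(x, na.rm = TRUE), variance = var(x, na.rm = TRUE))
  })
}

# Hypothetical two-column example with one NA in each column.
out <- mysummary(data.frame(Height = c(72, NA, 76), Weight = c(200, 220, NA)))
out
```

As the reply says, whether silently dropping the missing rows is appropriate depends on why the values are missing.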

> When I use:
> by(data.logistic,data.logistic$Ydrafted,summary)
>
> I receive no errors. I cut and pasted your mysummary function directly
> into my r console. Should I have made any adjustments to the code?
>
> jdr
>
> On 7/8/06, Duncan Murdoch <murdoch@stats.uwo.ca> wrote:
>> On 7/8/2006 3:44 PM, justin rapp wrote:


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.1.8, at Sun 09 Jul 2006 - 10:16:48 EST.
