[R] Re: R-help Digest, Vol 24, Issue 28

From: John Maindonald <john.maindonald_at_anu.edu.au>
Date: Tue 01 Mar 2005 - 08:30:03 EST

You've omitted a comma. races2000 is a data frame, which for purposes of extracting rows behaves like a 2-dimensional object. The following works fine:

   hills2000 <- races2000[races2000$type == 'hill', ]
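
The same pattern can be tried without the DAAG package. What follows is only a minimal sketch: the data frame `races` and its values are made up for illustration and are not the real races2000 data.

```r
# Hypothetical stand-in for races2000; names and values are invented.
races <- data.frame(
  dist  = c(6.0, 5.5, 10.0, 3.2),
  climb = c(2500, 800, 0, 1200),
  type  = c("hill", "hill", "other", "uphill"),
  stringsAsFactors = FALSE
)

# Logical vector with one element per row
is_hill <- races$type == "hill"

# With the comma, the logical vector selects rows; all columns are kept
hills <- races[is_hill, ]
print(hills)

# subset() is another base R idiom for selecting rows by a condition
hills2 <- subset(races, type == "hill")
print(hills2)
```

The position before the comma indexes rows and the position after it indexes columns, so leaving the second position empty keeps every column.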

Additionally, you might like to ponder

   > type <- races2000[names(races2000)=="type"]
   > type[1:4]
   Error in "[.data.frame"(type, 1:4) : undefined columns selected

   > length(type)                  # type is a data frame with 1 column
   [1] 1
   > vtype <- unlist(type)         # Extract the vector that is the one
                                   # data frame (list) element
   > vtype[1:4]                      # Try also length(vtype)
   type.type1 type.type2 type.type3 type.type4
     "uphill"    "other"    "other"    "relay"
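
The difference between a one-column data frame and the vector inside it can also be seen with `[[ ]]` or `$`, which extract the column directly. This is a small sketch on a made-up data frame (`races` and its values are assumptions, not the real races2000 data):

```r
# Hypothetical toy data frame; values are invented for illustration.
races <- data.frame(
  dist = c(6.0, 5.5, 10.0),
  type = c("hill", "other", "relay"),
  stringsAsFactors = FALSE
)

# Single brackets without a comma return a data frame (a list of columns)
one_col <- races["type"]
print(class(one_col))    # "data.frame"
print(length(one_col))   # 1, i.e. one column

# Double brackets (or $) return the column itself as a vector
vec <- races[["type"]]
print(length(vec))       # 3, i.e. one element per row
print(vec[1:2])
```

Unlike `unlist()`, `[[ ]]` and `$` do not attach names such as type.type1 to the elements of the result.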

Your syntax (without the comma) does give a result, provided that the dimensions match (the condition must have the same number of elements as races2000 has columns), but it is probably not the result that you want! See further pp. 320-321 of the DAAG book.
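
To see what the comma-free form does when the lengths happen to match, here is a hedged sketch on a made-up three-column data frame (`races` and `keep` are hypothetical names, not part of DAAG):

```r
# Hypothetical toy data frame with three columns; values are invented.
races <- data.frame(
  dist  = c(6.0, 5.5, 10.0, 3.2),
  climb = c(2500, 800, 0, 1200),
  type  = c("hill", "hill", "other", "uphill"),
  stringsAsFactors = FALSE
)

# One logical element per COLUMN, not per row
keep <- c(TRUE, FALSE, TRUE)

# Without a comma, the data frame is indexed as a list of columns
cols <- races[keep]
print(names(cols))   # "dist" "type"
print(nrow(cols))    # all rows are still present

# The explicit two-dimensional form gives the same columns
same <- races[, keep]
print(identical(cols, same))
```

So a row condition used without the comma is interpreted as a column selector, which is why it fails or misbehaves unless its length matches the number of columns. The exact error message for a mismatched length differs between R versions.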

John Maindonald             email: john.maindonald@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 28 Feb 2005, at 10:07 PM, r-help-request@stat.math.ethz.ch wrote:

> From: Clint Harshaw <charshaw@presby.edu>
> Date: 28 February 2005 1:08:36 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] subsetting data set dimenion problem
>
>
> (See DAAG book, p. 173, ex. 3)
>
> I'm a new user of R, and I'm following the DAAG text. I want to create
> a subset of the races2000 data frame, but get errors because of a
> mismatch of values in some columns:
>
> > library(DAAG)
> > attach(races2000)
> > hills2000 <- races2000[races2000$type == 'hill']
> Error in as.matrix.data.frame(x) : dim<- : dims [product 770] do not
> match the length of object [771]
>
> However, if I follow the solution given, and remove redundant columns
> 1 through 6 and column 11 (which I won't need, since I know they are
> going to have the same value), I don't get the error:
>
> > hills2000 <- races2000[races2000$type == 'hill', -c(1:6,11)]
> > hills2000
>               dist climb      time     timef
> Tiso Carnethy 6.00  2500 0.7822222 0.9191667
> [...]
> Cornalees     5.50   800 0.6183333        NA
> [...]
>
> What is causing the error with my original subsetting? I speculated it
> was related to the NA values, but there is an NA in the resulting
> hills2000, corresponding to the Cornalees hill race.
>
> Thanks,
> Clint
> --
> Clint Harshaw, PhD
> Department of Mathematics
> Presbyterian College
> Clinton SC 29325
>
> EMail: charshaw@presby.edu
> Phone: 864.833.8995
> Fax: 864.938.3769
> Office: Harrington-Peachtree Rm 412
>




R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Tue Mar 01 08:57:04 2005
