Re: [R] Help with R

From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>
Date: Thu 05 May 2005 - 20:34:43 EST

"Angus Repper" <arepper@hotmail.com> writes:

> Hello
>
> I am a long-time SAS user, but am new to R. I specifically am looking for
> information pertaining to generating graphics for web output. I would like
> to create dynamic graphics (in the form of generalized reports) for my web
> site that is written with php and mysql. Is 'R' capable of doing
> this?

Yes, people have done that. I'm not the one to ask for the details, but it comes up on the mailing lists from time to time (hint: we have archives...). I gather that the hardest part is to get the bitmapped graphics to look right.
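
A minimal sketch of the bitmap route, for illustration only: a script that a web page could invoke to write a PNG to disk. The helper name render_report_plot, the output path, and the data are hypothetical. png() and dev.off() are the standard grDevices calls, but whether png() is available depends on how R was built on the server.

```r
# Hypothetical helper: draw a simple report plot into a PNG file.
render_report_plot <- function(values, path, width = 640, height = 480) {
  png(filename = path, width = width, height = height)  # open the bitmap device
  on.exit(dev.off())                                     # close it even on error
  plot(values, type = "b", main = "Report", xlab = "Index", ylab = "Value")
  invisible(path)
}

# Made-up data; a real report would pull its values from the database.
out <- render_report_plot(c(3, 1, 4, 1, 5, 9),
                          file.path(tempdir(), "report.png"))
```

The web page would then serve the file at the returned path. Font and anti-aliasing choices are where the "look right" problem usually shows up, and they are not addressed here.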

> I
> heard that 'R' does not do a very good job at handling large datasets, is
> this true?

Yes, with qualifications: R stores entire data sets in memory, which is a disadvantage for procedures that can be implemented using sequential file access. However, these days PCs routinely ship with more RAM than we had on our hard disks five years ago. The benefit of R is that it allows nonsequential or multipass procedures to be specified simply: what R writes as x - mean(x) would in SAS be PROC MEANS followed by a DATA step (there are various other options, I'm sure, but none involving a single DATA step).
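
To make the multipass point concrete, here is a small sketch in R. The vectors x and g are made-up data. The per-group variant uses ave() from the stats package, which returns group means aligned with the input.

```r
x <- c(2, 4, 6, 8)
g <- c("a", "a", "b", "b")   # hypothetical grouping variable

# Two passes over x in one expression: mean(x) reads all of x,
# then the subtraction reads it again.
centered <- x - mean(x)

# The same idea within groups: ave(x, g) gives each element its group mean.
centered_by_group <- x - ave(x, g)

print(centered)
print(centered_by_group)
```

This only works because x is held in memory as a whole, which is exactly the trade-off described above.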

For some statistical procedures, SAS also needs to store data in memory, which makes the comparison more of a toss-up. R generally has a somewhat cavalier attitude towards conserving memory, so it often runs into memory limitations more quickly, but carefully coded routines like the lmer function can handle considerably larger data sets than PROC MIXED through the use of sparse-matrix techniques.

Both systems are victims of the curse of the rectangular data set to some extent. Prototypically, you record the sex of a rat along with every single measurement on it, as if the rat could change sex at millisecond resolution. This probably applies to all current statistical systems, but there is some hope that R's more flexible data structures can be leveraged to better handle multilevel data. (Cue Probabilistic Relational Models in the style of Getoor et al., which Peter Green brought up at the recent gR meeting.)
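
The rat example can be sketched as follows. This is only an illustration of splitting a rectangular data set into one table per level using ordinary data frames. It is not the Probabilistic Relational Models approach mentioned above, and all names and values are made up.

```r
# Rectangular layout: the rat-level attribute 'sex' is repeated on every row.
flat <- data.frame(
  rat    = c("r1", "r1", "r1", "r2", "r2"),
  sex    = c("F", "F", "F", "M", "M"),
  weight = c(210, 214, 219, 305, 311)
)

# Two-level layout: one table per level, linked by the 'rat' key.
rats         <- unique(flat[, c("rat", "sex")])   # one row per rat
measurements <- flat[, c("rat", "weight")]        # one row per measurement

# The rectangular form can be rebuilt on demand when a procedure needs it.
rebuilt <- merge(measurements, rats, by = "rat")
```

Here sex is stored once per rat, and the repetition only reappears when the two tables are merged for a procedure that expects a rectangle.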

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Thu May 05 20:40:43 2005