Re: [R] Normalization and missing values

From: Chris Bergstresser <chris_at_subtlety.com>
Date: Thu 14 Apr 2005 - 12:05:46 EST

    I'd just like to thank everyone who wrote in response to my questions -- it's been greatly helpful, and appreciated.

Jonathan Baron wrote:
> On 04/13/05 11:36, Chris Bergstresser wrote:
>> First, I didn't see a function in R which does normalization -- did
>> I miss it? What's the best way to do it?
>
> Look at scale(). Might be what you mean.

    Yeah; I should have remembered that. I did search the help files for "normalization" and "normalize", but neither term turns up anything. That seems somewhat odd to me, since it's exactly what "scale" is doing.
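
For reference, a minimal sketch of what scale() does when a column contains missing values. The matrix x below is made up for illustration. By default scale() centers each column on its mean and divides by its standard deviation, computing both from the non-missing entries and leaving the NAs in place.

```r
# Toy matrix (illustrative values only); column "a" has one missing entry.
x <- cbind(a = c(1, 2, 3, NA),
           b = c(10, 20, 30, 40))

# Default: center = TRUE, scale = TRUE (column-wise z-scores).
z <- scale(x)

# Column means and standard deviations of the result, ignoring NAs.
col_means <- colMeans(z, na.rm = TRUE)
col_sds   <- apply(z, 2, sd, na.rm = TRUE)

print(z)
print(col_means)  # approximately 0 for each column
print(col_sds)    # 1 for each column
```

The missing cell stays NA in the output, so scale() normalizes but does not deal with the missing data itself.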

> But, in general, the "right" way
> to deal with missing data depends on the assumptions you make.
> As a novice, I found the following article to be helpful:
>
> Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of
> the state of the art. Psychological Methods, 7, 147-177.

    This article is great; thanks for providing it. The authors recommend using either "ML Estimation" or "Multiple Imputation" to handle the missing data. They don't talk much about which is better in which situations, however.

    I don't think my data are particularly sensitive to the method I use -- I've got about 1,100 cases, with 85 variables, and there are only about 1,000 missing values overall, spread pretty evenly across the data file.
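
To put those numbers in perspective, here is a rough sketch that simulates a data set of the same shape (1,100 rows, 85 columns, 1,000 cells set to NA at random) and measures the missingness. The matrix d is synthetic and only stands in for the real data; the "spread evenly" assumption is modelled as uniformly random cell positions.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

n_cases <- 1100
n_vars  <- 85
n_miss  <- 1000

# Synthetic stand-in for the real data file.
d <- matrix(rnorm(n_cases * n_vars), nrow = n_cases, ncol = n_vars)

# Knock out 1,000 distinct cells chosen uniformly at random.
d[sample(length(d), n_miss)] <- NA

frac_missing  <- mean(is.na(d))           # 1000 / 93500, about 1.07% of cells
frac_complete <- mean(complete.cases(d))  # share of rows with no NA at all
na_per_var    <- colSums(is.na(d))        # missing count per variable

print(frac_missing)
print(frac_complete)
print(summary(na_per_var))
```

Although only about 1% of the cells are missing, a much larger share of the rows can contain at least one NA when the gaps are scattered, which is worth checking before relying on anything that drops incomplete cases.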

    Are there any recommendations for specific packages? "transcan()" and "aregImpute()" look promising; based on the documentation (as far as I can understand it), I'm assuming they both provide Multiple Imputation?
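
As a hedged illustration of what a multiple-imputation call might look like, here is a small sketch using aregImpute() from the Hmisc package on synthetic data. The data frame d, its variable names, and the choice of five imputations are all assumptions made for the example; it is not a recommendation for the data described above, and the Hmisc help pages remain the authority on the arguments.

```r
library(Hmisc)  # assumes the Hmisc package is installed

set.seed(1)  # arbitrary seed, for reproducibility only
n <- 200

# Synthetic data: x3 depends on x1 and x2, then some values are removed.
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$x3 <- d$x1 + d$x2 + rnorm(n)
d$x3[sample(n, 20)] <- NA
d$x1[sample(n, 20)] <- NA

# Draw 5 imputations for every missing value in the listed variables.
imp <- aregImpute(~ x1 + x2 + x3, data = d, n.impute = 5, pr = FALSE)

# Imputed values are stored per variable: one row per originally missing
# observation, one column per imputation.
print(dim(imp$imputed$x3))

# Build one completed copy of the variables using the first imputation.
completed <- impute.transcan(imp, imputation = 1, data = d,
                             list.out = TRUE, pr = FALSE, check = FALSE)
print(sum(is.na(completed$x3)))
```

In a real analysis the model would be fitted on each imputed data set and the results combined, rather than using a single completed copy as the last step does here.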


Received on Thu Apr 14 12:13:40 2005