From: Robert A LaBudde <ral_at_lcfltd.com>

Date: Mon, 28 May 2007 00:12:37 -0400

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: ral_at_lcfltd.com

R-help_at_stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Mon 28 May 2007 - 04:19:49 GMT

Thanks, Gabor.

I have to say I wouldn't have figured this out easily.

I'd summarize your comments by:

- Remember to use arrays of logicals as indices.
- Remember %in% for combination matches.
- Remember which() to get indices.
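
A minimal sketch of these three idioms, assuming a made-up data frame `df` whose row names are out of order (the data and the names `hit`, `pos`, `kept`, and `sorted` are illustrative only, not taken from plasma):

```r
# Toy data frame with unordered row names (hypothetical values).
df <- data.frame(x = c(10, 20, 30, 40, 50),
                 row.names = c("3", "17", "1", "23", "2"))

# Logical index: TRUE where the row name is 17 or 23.
# %in% compares after coercing the numbers to character.
hit <- rownames(df) %in% c(17, 23)

# which() turns the logical vector into row positions.
pos <- which(hit)

# Negating the logical vector drops the matching rows.
kept <- df[!hit, , drop = FALSE]

# order() on the numeric row names sorts the rows by name.
sorted <- df[order(as.numeric(rownames(df))), , drop = FALSE]

print(pos)
print(rownames(kept))
print(rownames(sorted))
```

The logical vector is the common building block here: it can be used directly as an index, negated to exclude rows, or passed to which() when the positions themselves are needed.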

It is the small tasks which appear most difficult to figure out in R.

At 10:29 PM 5/27/2007, Gabor wrote:

>On 5/27/07, Robert A. LaBudde <ral@lcfltd.com> wrote:

>>As I was working through elementary examples, I was using dataset
>>"plasma" of package "HSAUR".
>>
>>In performing a logistic regression of the data, and making the
>>diagnostic plots (R-2.5.0)
>>
>>data(plasma,package='HSAUR')
>>plasma_1 <- glm(ESR ~ fibrinogen * globulin, data=plasma, family=binomial())
>>layout(matrix(1:4,nrow=2))
>>plot(plasma_1)
>>
>>I find that data points corresponding to rownames 17 and 23 are
>>outliers and high leverage.
>>
>>I would then like to perform a fit without these two rows.
>>
>>In principle this should be easy, using an update() with subset=-c(17,23).
>>
>>The problem is that the rownames in this dataset are not ordered,
>>and, in fact, the relevant rows are 30 and 31, not 17 and 23.
>>
>>This brings up the following (elementary?) questions:
>>
>>1. How do you reference rows in "subset=" for which you know the
>>rownames, but not the row numbers?
>
>Use a logical vector:
>
> rownames(plasma) %in% c(17, 23)
>
>>
>>2. How do you discover the rows corresponding to particular
>>rownames? (Using plasma[rownames(plasma)==17,] shows the data, but
>>NOT the row number!) (Probably the same answer as in Q. 1 above.)
>
> which(rownames(plasma) %in% c(17, 23)) # 30, 31
>
>>
>>3. How do you sort (order) the rows of an existing data frame so that
>>the rownames are in order?
>
>
> plasma[order(as.numeric(rownames(plasma))), ]
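
For the refit itself, the logical vector from the answer to question 1 can be negated and passed to update(). The sketch below assumes simulated data in place of plasma so that it runs without HSAUR installed; the data frame `d`, its columns, and the model are hypothetical stand-ins, not the original analysis:

```r
# Simulated stand-in for a data set with shuffled row names.
set.seed(1)
n <- 32
d <- data.frame(x = rnorm(n), row.names = as.character(sample(n)))
d$y <- rbinom(n, 1, plogis(0.5 * d$x))

# Initial logistic regression on all rows.
fit <- glm(y ~ x, data = d, family = binomial())

# Refit without the rows *named* 17 and 23, wherever they sit.
# The ! negates the logical vector so the matching rows are excluded.
fit2 <- update(fit, subset = !(rownames(d) %in% c(17, 23)))

print(nobs(fit))
print(nobs(fit2))
```

Using the negated logical vector avoids having to look up the row positions first, which is what makes subset=-c(17,23) go wrong when the row names are not in order.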

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: ral_at_lcfltd.com

Least Cost Formulations, Ltd. URL: http://lcfltd.com/ 824 Timberlake Drive Tel: 757-467-0954 Virginia Beach, VA 23464-3239 Fax: 757-467-2947

"Vere scire est per causas scire"

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Mon 28 May 2007 - 04:31:24 GMT.
