Re: [R] How to reference or sort rownames in a data frame

From: Gabor Grothendieck <>
Date: Sun, 27 May 2007 22:29:09 -0400

On 5/27/07, Robert A. LaBudde <> wrote:
> As I was working through elementary examples, I was using dataset
> "plasma" of package "HSAUR".
> In performing a logistic regression of the data, and making the
> diagnostic plots (R-2.5.0)
> data(plasma,package='HSAUR')
> plasma_1<- glm(ESR ~ fibrinogen * globulin, data=plasma, family=binomial())
> layout(matrix(1:4,nrow=2))
> plot(plasma_1)
> I find that data points corresponding to rownames 17 and 23 are
> outliers and high leverage.
> I would then like to perform a fit without these two rows.
> In principle this should be easy, using an update() with subset=-c(17,23).
> The problem is that the rownames in this dataset are not ordered,
> and, in fact, the relevant rows are 30 and 31, not 17 and 23.
> This brings up the following (elementary?) questions:
> 1. How do you reference rows in "subset=" for which you know the
> rownames, but not the row numbers?

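Use a logical vector. The one below is TRUE for those two rows, so
negate it with ! when the aim is to drop them via subset=: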

   rownames(plasma) %in% c(17, 23)
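
A minimal sketch of how this could feed into update(), assuming a small made-up data frame (df, with hypothetical x and y columns) and lm() in place of plasma and the glm() fit, so that it runs without HSAUR installed:

```r
# Toy stand-in for plasma (made-up values); row names are deliberately unordered.
df <- data.frame(
  x = c(1.2, 2.5, 3.1, 4.8, 5.0, 6.3),
  y = c(2.1, 2.9, 4.2, 5.1, 5.8, 7.4),
  row.names = c("5", "17", "2", "23", "9", "1")
)
fit <- lm(y ~ x, data = df)

# TRUE for the rows to keep, i.e. every row except those named 17 and 23.
# %in% compares the numbers against the character row names after coercion.
keep <- !(rownames(df) %in% c(17, 23))

# Refit on the retained rows only.
fit2 <- update(fit, subset = keep)
nobs(fit2)
```

The same pattern should carry over to the original fit as update(plasma_1, subset = !(rownames(plasma) %in% c(17, 23))).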

> 2. How do you discover the rows corresponding to particular
> rownames? (Using plasma[rownames(plasma)==17,] shows the data, but
> NOT the row number!) (Probably the same answer as in Q. 1 above.)

  which(rownames(plasma) %in% c(17, 23)) # 30, 31
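
As a small illustration on made-up data (df is hypothetical, not plasma), which() reports positions in data-frame order, whereas match() reports them in the order the names were asked for, which may be handy when that order matters:

```r
# Toy stand-in for plasma (made-up values); row names are deliberately unordered.
df <- data.frame(
  x = c(1.2, 2.5, 3.1, 4.8, 5.0, 6.3),
  y = c(2.1, 2.9, 4.2, 5.1, 5.8, 7.4),
  row.names = c("5", "17", "2", "23", "9", "1")
)

# Positions of the named rows, in the order they occur in the data frame.
pos_which <- which(rownames(df) %in% c(17, 23))

# Positions in the order the names are requested (here 23 first, then 17).
pos_match <- match(c("23", "17"), rownames(df))

pos_which
pos_match
```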

> 3. How do you sort (order) the rows of an existing data frame so that
> the rownames are in order?

  plasma[order(as.numeric(rownames(plasma))), ]

Received on Mon 28 May 2007 - 02:34:00 GMT
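
The ordering idiom from the reply to question 3 can be tried on made-up data (df is hypothetical, not plasma). Row names are stored as character strings, so ordering them directly compares them as text; the as.numeric() conversion is what yields numeric order. This sketch assumes every row name can be converted to a number:

```r
# Toy stand-in for plasma (made-up values); row names are deliberately unordered.
df <- data.frame(
  x = c(1.2, 2.5, 3.1, 4.8, 5.0, 6.3),
  y = c(2.1, 2.9, 4.2, 5.1, 5.8, 7.4),
  row.names = c("5", "17", "2", "23", "9", "1")
)

# Text ordering of the row names: "17" sorts ahead of "2".
by_char <- df[order(rownames(df)), ]

# Numeric ordering of the row names.
by_num <- df[order(as.numeric(rownames(df))), ]

rownames(by_num)
```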

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 28 May 2007 - 04:31:24 GMT.
