Re: [R] Query about extracting subsets from a table

From: Marc Schwartz
Date: Tue 23 Jan 2007 - 17:48:33 GMT

On Tue, 2007-01-23 at 09:28 -0800, lalitha viswanath wrote:
> Hi
> I am trying to process tabular data as follows:
> Data in the input file is of the form
> genome1 genome2 tree-dist log10escore
> Genome1 and genome2 are alphabetic.
> Tree-dist and log10escore are numeric.
> I wish to extract only those rows from this table
> where the log10escore is less than -3.
> data <-read.table(filename);
> data$log10escore = data$log10escore[ data$log10escore
> < -3];
> I would like to use this pruned list of escores to get
> the corresponding genomenames and treedist.
> I did not find anything useful in the FAQs and Notes
> on R for this part of the data extraction.
> As I am just beginning programming in R, I would
> appreciate your input about this.
> Thanks
> Lalitha

A help search on "subset" would lead you to ?subset, where you could do something like:

DF <- subset(YourData, log10escore < -3)
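
For subset() to find log10escore, the data frame needs that column name. Here is a minimal sketch, assuming the first line of the input file is a header like the one shown in the question. The rows below are invented for illustration, and the inline text stands in for the real file:

```r
# Inline stand-in for the input file; the rows are invented for illustration.
# Assumption: the first line of the real file is a header like this one.
txt <- "genome1 genome2 tree-dist log10escore
ecoli bsub 0.42 -5.2
ecoli hpyl 0.77 -1.0
bsub hpyl 0.63 -3.5"

# header = TRUE takes the column names from the first line. With the default
# check.names = TRUE, "tree-dist" is rewritten as the valid name "tree.dist".
YourData <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
names(YourData)

# Keep only the rows where log10escore is below -3.
# Note the space in "< -3": without it, "<-3" would be parsed as an assignment.
DF <- subset(YourData, log10escore < -3)
DF
```

Without header = TRUE the columns would be called V1 to V4, and the subset() call would need those names instead.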

If you just wanted the values of the two other columns, you could also use:

DF <- subset(YourData, log10escore < -3,
             select = c(genomenames, treedist))
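
As a rough illustration of the select argument, here is the same call on a small made-up data frame. The values are invented, and genomenames and treedist are only the placeholder column names used above, so substitute whatever your columns are actually called:

```r
# Toy data, invented for illustration; column names follow the placeholders above.
YourData <- data.frame(
  genomenames = c("ecoli", "bsub", "hpyl"),
  treedist    = c(0.42, 0.77, 0.63),
  log10escore = c(-5.2, -1.0, -3.5),
  stringsAsFactors = FALSE
)

# Rows are filtered by the condition, and only the named columns are kept.
DF <- subset(YourData, log10escore < -3,
             select = c(genomenames, treedist))

# select also accepts negated names, so this drops the score column instead.
DF2 <- subset(YourData, log10escore < -3, select = -log10escore)

names(DF)
names(DF2)
```

Both calls return a data frame with the same two columns, so the choice is mostly a matter of which is shorter to type.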

One additional alternative is to use which(). This will return the _indices_ of the values that match the criteria. For example:

  Ind <- which(YourData$log10escore < -3)

In that case, you could then use:

  YourData$genomenames[Ind]
  YourData$treedist[Ind]

These would return vectors of the two columns meeting the criteria.
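
A small sketch of the which() approach on made-up data may help here. The values are invented and the column names are the same placeholders as above. One score is deliberately set to NA to show where which() differs from indexing with the logical vector directly:

```r
# Toy data, invented for illustration; the last score is deliberately missing.
YourData <- data.frame(
  genomenames = c("ecoli", "bsub", "hpyl", "mtub"),
  treedist    = c(0.42, 0.77, 0.63, 0.91),
  log10escore = c(-5.2, -1.0, -3.5, NA),
  stringsAsFactors = FALSE
)

# which() returns the integer positions where the condition is TRUE.
# Positions where the comparison is NA are left out.
Ind <- which(YourData$log10escore < -3)
Ind

# Use the positions to pull the matching values from the other columns.
genomes <- YourData$genomenames[Ind]
dists   <- YourData$treedist[Ind]

# The same positions can also select whole rows.
rows <- YourData[Ind, ]

# Indexing with the logical vector itself keeps an NA entry for the
# missing score, which is usually not what you want.
with_na <- YourData$genomenames[YourData$log10escore < -3]
```

If the scores contain no missing values, the logical vector and which() give the same result.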

Which approach you take depends upon what else you may want to do with the data.

See ?which for more information.

HTH,

Marc Schwartz

Received on Wed Jan 24 04:54:11 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 23 Jan 2007 - 18:30:27 GMT.
