Re: [R] Output of order() incorrectly ordered?

From: Paul Hiemstra <p.hiemstra_at_geo.uu.nl>
Date: Tue, 25 Mar 2008 11:06:15 +0100

Hi Shirley,

You can use the function sort_df() from the reshape package to sort an entire data.frame based on one column.
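
For reference, here is a minimal sketch of the same one-column sort in base R, so it runs without installing reshape. The toy rows are copied from the sample in the question, and V1 to V4 are the default column names read.table() assigns. The quote/comment.char settings and the class check are assumptions about what might be worth ruling out, not something established in this thread.

```r
# Toy data in the same layout as the question (tab-separated, no header).
txt <- paste(
  "0.083044\t375.276\t680220\tmajority",
  "5.50816e-09\t2.48914e-05\t26377\tconformation",
  "0.000169618\t0.766505\t1546938\tinteraction",
  "1.01517e-13\t4.58754e-10\t699918\telastase",
  sep = "\n"
)

# Assumption: quote = "" and comment.char = "" stop read.table() from
# treating quote characters or '#' inside the text column specially.
df <- read.table(text = txt, sep = "\t", quote = "", comment.char = "",
                 stringsAsFactors = FALSE)

# order() only sorts numerically if the key column really is numeric.
stopifnot(is.numeric(df$V1))

# Reorder all rows by the first column.
df_ordered <- df[order(df$V1), ]
print(df_ordered)
```

With the reshape package installed, sort_df(df, vars = "V1") should produce the same row order; it is left out of the block so the sketch runs on a plain R installation.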

cheers,
Paul

Shirley Wu wrote:
> Hello,
>
> I have a data frame consisting of four columns and would like to sort
> based on the first column and then write the sorted data frame to a
> file.
>
> > df <- read.table("file.txt", sep="\t")
> where file.txt is simply a tab-delimited file containing 4 columns of
> data (first 2 numeric, second 2 character). I then do,
>
> > df_ordered <- df[order(df$V1), ]
>
> OR, I assume equivalently,
>
> > df_ordered <- df[ do.call(order, df), ]
>
> and then,
>
> > write.table(df_ordered, file="newfile.txt", ...)
>
> The input data file looks like this:
>
> 0.083044 375.276 680220 majority
> 5.50816e-09 2.48914e-05 26377 conformation
> 0.000169618 0.766505 1546938 interaction
> 3.90425e-05 0.176433 1655338 vitamin
> 0.0378182 170.9 1510941 array
> 3.00359e-07 0.00135732 69421 oligo(dT)-cellulose
> 1.01517e-13 4.58754e-10 699918 elastase
> ...
>
> I'd like the output file to look the same except sorted by the first
> column. The output of the commands above gives me something that is
> sorted in some places but not sorted in others:
>
> [sorted section]
> ...
> 1.87276e-07 0.000846299 1142090 vitamin K
> 1.89026e-07 0.000854207 917889 leader peptide
> 1.90884e-07 0.000862605 31206 s
> 0.00536062 24.2246 1706420 prevent
> 5.42648e-05 0.245223 1513041 measured
> 5.42648e-05 0.245223 1513040 measured
> 0.019939 90.1044 12578 fly
> 0.00135512 6.12377 61688 GPI
> 0.00124421 5.62257 681915 content
> 0.0128271 57.9655 681916 estimated
> ...
> [sorted section]
> ...
> [unsorted section]
> ...
> [etc]
>
> I'm not sure if this is a problem with the input data or with order()
> or what. I am only doing this in R because many of my numeric values
> are expressed in exponential notation and UNIX sort does not handle
> this to my knowledge, but this behavior baffles me. I am pretty new
> to R so it's possible I'm missing something.
>
> Any insight would be greatly appreciated!
>
>
> Thanks,
> -Shirley
> graduate student
> Stanford University
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
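
A note on the two calls in the quoted message: they are not quite equivalent. order(df$V1) sorts on the first column only, while do.call(order, df) passes every column to order() and uses the later columns to break ties. The sketch below uses made-up values (the data frame `d` and the vector `x` are hypothetical) to show the difference. It also shows what happens when the key is stored as text rather than as numbers, which is one possible, unconfirmed explanation for output that looks only partly sorted.

```r
# Hypothetical data with a tie in V1.
d <- data.frame(V1 = c(2, 1, 1), V2 = c(10, 30, 20))

by_first <- order(d$V1)        # ties in V1 keep their original order
by_all   <- do.call(order, d)  # ties in V1 are broken by V2

print(by_first)
print(by_all)

# The same kind of values stored as text are compared as strings,
# not as numeric values.
x <- c("0.019939", "5.42648e-05", "0.00135512")
print(order(x))
print(order(as.numeric(x)))
```

If class(df$V1) turns out to be "factor" or "character" rather than "numeric", it may be worth looking for a stray non-numeric entry in that column of the input file.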

-- 
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +31302535773
Fax:   +31302531145
http://intamap.geo.uu.nl/~paul

Received on Tue 25 Mar 2008 - 10:15:01 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 25 Mar 2008 - 11:30:24 GMT.
