[R] Problems of data processing

From: Florent Bonneu <bonneu_at_cict.fr>
Date: Mon 16 Jan 2006 - 22:09:31 EST


I have two data-processing problems with a large data set (50000 rows). A small sample looks like this:

Num <- c(1,2,3,4,4,4,5,5)
Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14", "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06")

Place <- c("x1","x1","x3","x4","x4","x4","x5","x5")
X <- c(1,NA,2,3,3,3,6,6)  # NA = missing coordinate
Y <- c(1,NA,9,7,7,7,8,8)

toto <- data.frame(Num,Date,Place,X,Y)

The first problem is to keep one row for each Num, namely the one with the "minimum" (earliest) date. I managed to do it with loops, but I would like a solution without loops, which should be faster on my large data set.
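One loop-free way to do this in base R might be to sort by Num and by the parsed date, then drop every row whose Num has already been seen. The sketch below rebuilds the sample with NA in place of the empty coordinates, and it assumes the dates are month/day/two-digit-year with a 24-hour clock; the names `when`, `sorted` and `earliest` are illustrative, not from the original post.

```r
# Sample data from the post; NA stands in for the empty coordinates.
toto <- data.frame(
  Num   = c(1, 2, 3, 4, 4, 4, 5, 5),
  Date  = c("1/1/04 0:48", "1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14",
            "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06"),
  Place = c("x1", "x1", "x3", "x4", "x4", "x4", "x5", "x5"),
  X     = c(1, NA, 2, 3, 3, 3, 6, 6),
  Y     = c(1, NA, 9, 7, 7, 7, 8, 8),
  stringsAsFactors = FALSE
)

# Assumption: month/day/year order; use "%d/%m/%y %H:%M" if the day comes first.
when <- as.POSIXct(toto$Date, format = "%m/%d/%y %H:%M", tz = "UTC")

# Sort by Num, then by time, so the earliest row of each Num comes first.
sorted <- toto[order(toto$Num, when), ]

# duplicated() flags every occurrence of a Num after the first one.
earliest <- sorted[!duplicated(sorted$Num), ]
rownames(earliest) <- NULL
```

Comparing the parsed times rather than the raw strings matters here, because sorting the strings alphabetically would place "1/1/04 10:00" before "1/1/04 9:05".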

The second problem is to fill in the missing coordinates. For example, for the same place "x1", Num=2 has no X and Y, but we have this information for Num=1.
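A possible loop-free sketch for the coordinates is to build a small lookup table of places whose coordinates are known and then use match() to copy them into the rows where they are missing. It assumes that missing values are coded as NA and that each Place has exactly one (X, Y) pair, so the first known pair can be reused; the names `known` and `idx` are illustrative.

```r
# Sample data from the post; NA stands in for the empty coordinates.
toto <- data.frame(
  Num   = c(1, 2, 3, 4, 4, 4, 5, 5),
  Place = c("x1", "x1", "x3", "x4", "x4", "x4", "x5", "x5"),
  X     = c(1, NA, 2, 3, 3, 3, 6, 6),
  Y     = c(1, NA, 9, 7, 7, 7, 8, 8),
  stringsAsFactors = FALSE
)

# Lookup table: one row per Place that has both coordinates.
known <- toto[!is.na(toto$X) & !is.na(toto$Y), c("Place", "X", "Y")]
known <- known[!duplicated(known$Place), ]

# For every row, the position of its Place in the lookup table (NA if unknown).
idx <- match(toto$Place, known$Place)

# Fill only the missing entries; existing values are left untouched.
toto$X <- ifelse(is.na(toto$X), known$X[idx], toto$X)
toto$Y <- ifelse(is.na(toto$Y), known$Y[idx], toto$Y)
```

A place with no known coordinates anywhere in the data simply stays NA.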

The result for the example should look like this:

Num <- c(1,2,3,4,5)
Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14", "1/1/04 9:05")
Place <- c("x1","x1","x3","x4","x5")
X <- c(1,1,2,3,6)
Y <- c(1,1,9,7,8)

toto <- data.frame(Num,Date,Place,X,Y)

Does anybody know how to do this?
Thanks.

Florent Bonneu
Laboratoire de Statistique et Probabilités
bureau 148 bât. 1R2
Université Toulouse 3
118 route de Narbonne - 31062 Toulouse cedex 9
bonneu@cict.fr



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Mon Jan 16 22:37:10 2006
