Re: [R] Problems of data processing

From: Jacques VESLOT <jacques.veslot_at_cirad.fr>
Date: Tue 17 Jan 2006 - 16:12:00 EST

OK! So try this:
merge(toto[1:3], unique(na.omit(toto[3:5])), by="Place", all.x=TRUE)
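
For reference, here is a minimal self-contained sketch of that merge on the sample data from the thread. It assumes the missing coordinates are coded as NA; the names `coords` and `filled` are only illustrative and do not come from the original posts.

```r
# Sample data from the thread, with NA marking the missing coordinates.
toto <- data.frame(
  Num   = c(1, 2, 3, 4, 4, 4, 5, 5),
  Date  = c("1/1/04 0:48", "1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14",
            "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06"),
  Place = c("x1", "x1", "x3", "x4", "x4", "x4", "x5", "x5"),
  X     = c(1, NA, 2, 3, 3, 3, 6, 6),
  Y     = c(1, NA, 9, 7, 7, 7, 8, 8),
  stringsAsFactors = FALSE
)

# Lookup table: one row per Place, built from the rows whose
# coordinates are known (hypothetical name 'coords').
coords <- unique(na.omit(toto[c("Place", "X", "Y")]))

# Left join: every row of toto keeps its Num, Date and Place and
# receives X and Y from the lookup table.
filled <- merge(toto[c("Num", "Date", "Place")], coords,
                by = "Place", all.x = TRUE)

# merge() sorts by the key column, so restore the order by Num.
filled <- filled[order(filled$Num), ]
print(filled)
```

Note that if one Place appeared with two different coordinate pairs, the lookup table would hold two rows for it and the merge would duplicate the matching lines, so it is worth checking that Place is unique in the lookup table first.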

Florent Bonneu wrote:

> Indeed,
> X <- c(1,NA,2,3,3,3,6,6)
> Y <- c(1,NA,9,7,7,7,8,8)
>
> I want to obtain one line for each Num. It's not a problem if there
> are several lines for the same place, because my identifier is Num. I
> just want to fill in X and Y from another line for the same place when
> they are missing. For example, "Num=2" is at the place "x1", like
> "Num=1", but we don't have the coordinates X and Y for "Num=2". The
> coordinates are known for "Num=1", so I want to copy them into the
> "Num=2" line for my columns X and Y.
>
>
>
> Jacques VESLOT wrote:
>
>> There is something wrong in the X and Y definitions, but this could work:
>>
>> do.call("rbind", lapply(split(toto, toto$Num),
>> function(x) x[which.min(as.POSIXct(strptime(x$Date,
>> "%d/%m/%y %H:%M"))), ]))
>>
>> I don't understand the second query; do you want to keep the first
>> line when there are several lines for the same place?
>>
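
As an alternative to the split/lapply approach, the earliest line per Num can also be obtained by sorting and dropping duplicates. This is only a sketch on the sample data, assuming dates in day/month/year format and a fixed UTC time zone; the names `when`, `sorted` and `earliest` are illustrative and do not come from the original posts.

```r
# Sample data from the thread; Date is kept as character.
toto <- data.frame(
  Num   = c(1, 2, 3, 4, 4, 4, 5, 5),
  Date  = c("1/1/04 0:48", "1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14",
            "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06"),
  Place = c("x1", "x1", "x3", "x4", "x4", "x4", "x5", "x5"),
  X     = c(1, NA, 2, 3, 3, 3, 6, 6),
  Y     = c(1, NA, 9, 7, 7, 7, 8, 8),
  stringsAsFactors = FALSE
)

# Parse the dates once for the whole data frame
# (assumption: day/month/year, time zone fixed to UTC).
when <- as.POSIXct(strptime(toto$Date, "%d/%m/%y %H:%M", tz = "UTC"))

# Sort by Num, then by time, and keep the first line of each Num.
sorted   <- toto[order(toto$Num, when), ]
earliest <- sorted[!duplicated(sorted$Num), ]
print(earliest)
```

Because the dates are parsed only once and no per-group function is called, this may scale better on a data frame with 50000 rows, though that has not been measured here.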
>>
>> Florent Bonneu wrote:
>>
>>> I have two data-processing problems with my large data set
>>> (50000 rows). For example, a sample is as follows:
>>>
>>> Num <- c(1,2,3,4,4,4,5,5)
>>> Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14",
>>> "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06")
>>> Place <- c("x1","x1","x3","x4","x4","x4","x5","x5")
>>> X <- c(1,"",2,3,3,3,6,6)
>>> Y <- c(1,"",9,7,7,7,8,8)
>>>
>>> toto <- data.frame(Num,Date,Place,X,Y)
>>>
>>> The first problem is to keep one line for each Num, the one with the
>>> “minimum” (earliest) date. I managed to do it with loops, but I would
>>> like a solution without loops, which would be better for my large
>>> data set.
>>>
>>> The other one is to fill in the missing coordinates. For
>>> example, for the same place “x1”, Num=2 doesn't have X and Y, but
>>> we have this information for Num=1.
>>>
>>> For the example, the resulting data set should look like this:
>>>
>>> Num <- c(1,2,3,4,5)
>>> Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14",
>>> "1/1/04 9:05")
>>> Place <- c("x1","x1","x3","x4","x5")
>>> X <- c(1,1,2,3,6)
>>> Y <- c(1,1,9,7,8)
>>>
>>> toto <- data.frame(Num,Date,Place,X,Y)
>>> Does anybody know how to do this?
>>> Thanks.
>>>
>>> Florent Bonneu
>>> Laboratoire de Statistique et Probabilités
>>> bureau 148 bât. 1R2
>>> Université Toulouse 3
>>> 118 route de Narbonne - 31062 Toulouse cedex 9
>>> bonneu@cict.fr <mailto:bonneu@cict.fr>
>>>
>>> ______________________________________________
>>> R-help@stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide!
>>> http://www.R-project.org/posting-guide.html
>>>
>>>
>>>
>>
>>
>>
>



Received on Tue Jan 17 16:20:53 2006
