[R] fuzzy merge

From: ravi <rv15i_at_yahoo.se>
Date: Wed, 09 Apr 2008 08:53:00 +0000 (GMT)


Hi,
I would like to merge two data frames, but with a fuzzy criterion rather than exact matching. Let me explain. My first data frame looks like this:

ID1   time1              dt
1     2008-01-02 13:11   10
2     2008-01-02 14:20   20
3     2008-01-02 15:42   30
4     2008-01-02 16:45   40
5     2008-01-02 17:42   50
6     2008-01-02 20:40   60


My second data frame:

ID2   time2              d1
101   2008-01-02 14:29   75
102   2008-01-02 17:55   105
103   2008-02-07 20:01   8



I want the merging to be done such that time2 lies in the range between time1 and (time1 + 15 min), that is, time1 <= time2 <= time1 + 15 min. My merged data frame should therefore be:

ID1   time1              time2
2     2008-01-02 14:20   2008-01-02 14:29
5     2008-01-02 17:42   2008-01-02 17:55
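
To make the criterion concrete, here is a minimal sketch in base R that reproduces the expected result above on the example data. It assumes the times are parsed as POSIXct (the UTC time zone is an assumption) and it forms every pair of rows before filtering, so the intermediate table has nrow(d1) * nrow(d2) rows. The names pairs and lag_min are only illustrative.

```r
# Rebuild the two example data frames, with the times parsed as POSIXct.
# The time zone is an assumption; the post does not state one.
fmt <- "%Y-%m-%d %H:%M"
d1 <- data.frame(
  ID1   = 1:6,
  time1 = as.POSIXct(c("2008-01-02 13:11", "2008-01-02 14:20",
                       "2008-01-02 15:42", "2008-01-02 16:45",
                       "2008-01-02 17:42", "2008-01-02 20:40"),
                     format = fmt, tz = "UTC"),
  dt    = c(10, 20, 30, 40, 50, 60)
)
d2 <- data.frame(
  ID2   = 101:103,
  time2 = as.POSIXct(c("2008-01-02 14:29", "2008-01-02 17:55",
                       "2008-02-07 20:01"),
                     format = fmt, tz = "UTC"),
  d1    = c(75, 105, 8)
)

# by = NULL makes merge() return the Cartesian product of the rows.
pairs <- merge(d1, d2, by = NULL)

# Keep only the pairs with time1 <= time2 <= time1 + 15 minutes.
lag_min <- as.numeric(difftime(pairs$time2, pairs$time1, units = "mins"))
d3 <- pairs[lag_min >= 0 & lag_min <= 15, c("ID1", "time1", "time2")]
print(d3)
```

With a few thousand rows on each side the product runs to millions of rows, which may or may not be acceptable depending on available memory.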


My data frames have thousands of records. If the two data frames are d1 and d2,

d3 <- merge(d1, d2, by.x = "time1", by.y = "time2")
will work only for exact matching. One possible option is to match on the date and hour only (by dropping the minutes). But this is only a partial solution, as I am not interested in data where the time difference is more than 15 minutes.
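
A small sketch of why matching on the date and hour is only partial. The timestamps here are hypothetical, not taken from my data, and are chosen to show both failure modes: a pair in the same hour that is more than 15 minutes apart, and a pair within 15 minutes that straddles an hour boundary. hour_key is an illustrative helper name.

```r
# Hypothetical timestamps, chosen only to illustrate the two failure modes.
fmt <- "%Y-%m-%d %H:%M"
t1 <- as.POSIXct(c("2008-01-02 14:01", "2008-01-02 13:58"),
                 format = fmt, tz = "UTC")
t2 <- as.POSIXct(c("2008-01-02 14:59", "2008-01-02 14:05"),
                 format = fmt, tz = "UTC")

# Truncate to date + hour by formatting without the minutes.
hour_key <- function(x) format(x, "%Y-%m-%d %H")
same_hour <- hour_key(t1) == hour_key(t2)

# The criterion actually wanted: time1 <= time2 <= time1 + 15 minutes.
lag_min <- as.numeric(difftime(t2, t1, units = "mins"))
within_15 <- lag_min >= 0 & lag_min <= 15

print(data.frame(t1, t2, same_hour, within_15))
```

The first pair shares an hour but is 58 minutes apart; the second is 7 minutes apart but falls in different hours, so an hour-level key gets both wrong.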

How can I make the merge work for fuzzy matching? Would it be easier to convert the times into date classes? Or is it better to treat them as strings and use regular expressions for the matching?
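
As a rough sketch of what the date-class route might look like: once the times are POSIXct they are numeric underneath, so a sorted lookup with findInterval() can find, for each time2, the latest time1 that does not exceed it, without forming all pairs. This assumes it is enough to match each time2 to its nearest preceding time1 only; if several time1 values fall inside the same 15-minute window, the earlier ones are missed. The variable names are illustrative.

```r
fmt <- "%Y-%m-%d %H:%M"
ID1   <- 1:6
time1 <- as.POSIXct(c("2008-01-02 13:11", "2008-01-02 14:20",
                      "2008-01-02 15:42", "2008-01-02 16:45",
                      "2008-01-02 17:42", "2008-01-02 20:40"),
                    format = fmt, tz = "UTC")
time2 <- as.POSIXct(c("2008-01-02 14:29", "2008-01-02 17:55",
                      "2008-02-07 20:01"),
                    format = fmt, tz = "UTC")

# findInterval() needs sorted breakpoints, so sort time1 first.
ord <- order(time1)
s1  <- as.numeric(time1[ord])   # seconds since the epoch
s2  <- as.numeric(time2)

# idx[i] counts the time1 values <= time2[i]; 0 means none precede it.
idx <- findInterval(s2, s1)
hit <- idx > 0

# Lag in minutes to the nearest preceding time1 (NA where there is none).
lag_min <- rep(NA_real_, length(s2))
lag_min[hit] <- (s2[hit] - s1[idx[hit]]) / 60

keep <- hit & lag_min <= 15
matched_id1 <- ID1[ord][idx[keep]]
print(data.frame(ID1 = matched_id1,
                 time1 = time1[ord][idx[keep]],
                 time2 = time2[keep]))
```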

I would appreciate any help that I can get. Thanking You,
Ravi



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed 09 Apr 2008 - 10:51:21 GMT
