Re: [R] how to merge within range?

From: David Winsemius <dwinsemius_at_comcast.net>
Date: Sat, 14 May 2011 14:47:08 -0400

On May 14, 2011, at 2:27 PM, William Dunlap wrote:

> You could use findInterval() along with a trick with c(rbind(...)):
>
>> i <- findInterval(x=df.1$time, vec=c(rbind(df.2$from, df.2$to)))
>> i
> [1] 1 1 1 2 3 3 3 5 5 6

That's nice. I was working on a slightly different "trick"

findInterval( df.1[,1],t(df.2[,1:2]))
  [1] 1 1 1 2 3 3 3 5 5 6

I was then trying to get the right indices with (.)'%%' 2 and (.) '%/ %' 2

> The even-valued outputs would map to NA's, the odds
> to value[(i+1)/2], but you can use the c(rbind(...)) trick again:
>
>> c(rbind(df.2$value, NA))[i]
> [1] 1 1 1 NA 3 3 3 5 5 NA

I'd like to understand that. Maybe, maybe... ah, got it. At first I didn't realize those were the final answers since they looked like indices. My t(.) trick doesn't generalize as well.

My earlier suggestion tht two merges woul do it was based on my erroneous interpretation of the example, since I thought the task was to match on the end points of the intervals.

>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: r-help-bounces_at_r-project.org
>> [mailto:r-help-bounces_at_r-project.org] On Behalf Of René Mayer
>> Sent: Saturday, May 14, 2011 11:06 AM
>> To: David Winsemius
>> Cc: r-help_at_r-project.org
>> Subject: Re: [R] how to merge within range?
>>
>> thanks David and Ian,
>> let me make a better example as the first one was flawed
>>
>> df.1=data.frame(round((1:10)*100+rnorm(10)), value=NA)
>> names(df.1) = c("time", "value")
>> df.1
>> time value
>> 1 101 NA
>> 2 199 NA
>> 3 301 NA
>> 4 401 NA
>> 5 501 NA
>> 6 601 NA
>> 7 700 NA
>> 8 800 NA
>> 9 900 NA
>> 10 1000 NA
>>
>> # from and to define ranges within time,
>> # note that from and to may not match the numbers given in time
>> df.2=data.frame(from=c(99,500,799),to=c(303,702,950), value=c(1,3,5))
>> df.2
>> from to value
>> 1 99 303 1
>> 2 500 702 3
>> 3 799 950 5
>>
>> what I want is:
>> time value
>> 1 101 1
>> 2 199 1
>> 3 301 1
>> 4 401 NA
>> 5 501 3
>> 6 601 3
>> 7 700 3
>> 8 800 5
>> 9 900 5
>> 10 1000 NA
>>
>> @David I don't know what you mean by 2 merges,
>> René
>>
>>
>>
>>
>>
>> Zitat von "David Winsemius" <dwinsemius_at_comcast.net>:
>>
>>>
>>> On May 14, 2011, at 9:16 AM, Ian Gow wrote:
>>>
>>>> If I assume that the third column in data.frame.2 is named
>> "val" then in
>>>> SQL terms it _seems_ you want
>>>>
>>>> SELECT a.time, b.val FROM data.frame.1 AS a LEFT JOIN
>> data.frame.2 AS b ON
>>>> a.time BETWEEN b.start AND b.end;
>>>>
>>>> Not sure how to do that elegantly using R subsetting/merge,
>>>
>>> Huh? It's just two merge()'s (... once you fix the error in
>> the example.)
>>>
>>> --
>>> David
>>>
>>>> but you might
>>>> try a package that allows you to use SQL, such as sqldf.
>>>>
>>>>
>>>> On 5/14/11 8:03 AM, "David Winsemius"
>> <dwinsemius_at_comcast.net> wrote:
>>>>
>>>>>
>>>>> On May 14, 2011, at 8:12 AM, René Mayer wrote:
>>>>>
>>>>>> Hello,
>>>>>> how can one merge
>>>>>
>>>>> And what happened when you typed:
>>>>>
>>>>> ?merge
>>>>>
>>>>>> two data frames when in the second data frame one column
>> defines the
>>>>>> start values
>>>>>> and another defines the end value of the to be merged range.
>>>>>> data.frame.1
>>>>>> time ...
>>>>>> 13
>>>>>> 24
>>>>>> 35
>>>>>> 46
>>>>>> 55
>>>>>> ...
>>>>>> data.frame.2
>>>>>> start end
>>>>>> 24 37 ?h? ?
>>>>>> ...
>>>>>>
>>>>>> should result in this
>>>>>> 13 NA
>>>>>> 24 ?h?
>>>>>> 35 ?h?
>>>>>> 46 NA
>>>>>> 55
>>>>>> ?
>>>>>
>>>>> And _why_ would that be?
>>>>>
>>>>>
>>>>>> thanks,
>>>>>> René
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help_at_r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained,
>> reproducible code.
>>>>>
>>>>> David Winsemius, MD
>>>>> West Hartford, CT
>>>>>
>>>>> ______________________________________________
>>>>> R-help_at_r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>
>>> David Winsemius, MD
>>> West Hartford, CT
>>>
>>>
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

David Winsemius, MD
West Hartford, CT



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sat 14 May 2011 - 18:50:06 GMT

This quarter's messages: by month, or sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

All messages

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 14 May 2011 - 19:40:07 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive