Re: [R] lookup not working properly

From: Dimitri Liakhovitski <dimitri.liakhovitski_at_gmail.com>
Date: Tue, 12 Apr 2011 12:01:03 -0400

Thank you, Sarah. This seems to be working:

a=c("ba ba","ca ca","da da", "lake lake, a", "lake lake, b","lake of",
  "lama ca, a","lama ca, b","ma ma")
b=c("ba ba","ca ca","OTHER", "lake lake, a", "lake lake, b","lake of",
  "lama ca, a","lama ca, b","OTHER")
myref<-data.frame(a=a, b=b)

myref$a<-as.character(myref$a)
myref$b<-as.character(myref$b)
(myref);str(myref)

for.mydata<-c(rep("ba ba",3),rep("ca ca",3),rep("da da",3),rep("lake lake, a",3),
  rep("lake lake, b",3),rep("lake of",3),rep("lama ca, a",3),rep("lama ca, b",3),rep("ma ma",3))
temp<-data.frame(d=for.mydata)
temp$d<-as.character(temp$d)
(temp);str(temp)

# temp$b<-myref[temp$d,2]
# (temp)
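
A note on why the commented-out lookup misbehaves: indexing data frame rows by a character vector looks the values up in the row names (here "1", "2", ...), not in column a, and indexing by a factor uses the factor's integer codes, which follow the sorted level order. The sketch below illustrates this on a tiny hypothetical table; the names ref, keys, by_char, by_factor and by_match are made up for illustration and do not appear in the thread.

```r
# Hypothetical miniature of myref: column a holds the keys, b the values
ref <- data.frame(a = c("x", "y", "z"),
                  b = c("X", "Y", "OTHER"),
                  stringsAsFactors = FALSE)

# Character row index: matched against row names "1", "2", "3", so nothing is found
by_char <- ref[c("y", "z"), "b"]

# Factor row index: levels are sorted ("x", "z"), so the codes are 2 and 1,
# and rows 2 and 1 of ref are returned regardless of what column a contains
keys <- factor(c("z", "x"))
by_factor <- ref[keys, "b"]

# Looking the keys up in column a explicitly gives the intended values
by_match <- ref$b[match(as.character(keys), ref$a)]

print(by_char)
print(by_factor)
print(by_match)
```

Here by_factor silently returns values from the wrong rows, which is the kind of mismatch reported in the original question.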

newdata <- merge(myref, temp, by.x="a", by.y="d", all.x=FALSE, all.y=TRUE)
(newdata)
dim(newdata)
(myref)
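
A possible alternative, not proposed in the thread: merge() sorts its result by the key column by default, so if the original row order of the data matters, a lookup with match() may be more convenient. The sketch below uses small hypothetical data frames (ref and dat, with values borrowed from the example above) and also lists any keys that found no match, such as a value with a misplaced space.

```r
# Hypothetical reference table and data, in the same shape as myref and temp
ref <- data.frame(a = c("ba ba", "da da", "lama ca, b"),
                  b = c("ba ba", "OTHER", "lama ca, b"),
                  stringsAsFactors = FALSE)
dat <- data.frame(d = c("lama ca, b", "ba ba", "lama ca ,b", "da da"),
                  stringsAsFactors = FALSE)

# match() returns, for each dat$d, the row of ref whose a equals it (or NA)
dat$b <- ref$b[match(dat$d, ref$a)]

# Keys with no counterpart in ref$a show up as NA in the new column
unmatched <- unique(dat$d[is.na(dat$b)])

print(dat)
print(unmatched)
```

The rows of dat stay in their original order, and unmatched gives a quick check for typos in the keys before relying on the result.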

Dimitri

On Tue, Apr 12, 2011 at 11:42 AM, Sarah Goslee <sarah.goslee_at_gmail.com> wrote:
> Dimitri,
>
> It isn't clear to me exactly what you are trying to do, but this might
> be closer.
> Note the stringsAsFactors argument I added to data.frame: I don't think you
> are likely to want factors for this application. Also, it's a bad idea to
> create a variable named c since that is the name of a function.
>
> # my reference data frame:
> myref<-data.frame(a=c("ba ba","ca ca","da da", "lake lake, a",
>   "lake lake, b","lake of","lama ca, a","lama ca, b","ma ma"),
>   b=c("ba ba","ca ca","OTHER", "lake lake, a", "lake lake, b",
>   "lake of","lama ca, a","lama ca, b","OTHER"), stringsAsFactors=FALSE)
>
> # my data:
> temp<-data.frame(c=c(rep("ba ba",3),rep("ca ca",3),rep("da da",3),
>  rep("lake lake, a",3),rep("lake lake, b",3),rep("lake of",3),
>  rep("lama ca, a",3),rep("lama ca ,b",3),rep("ma ma",3)),
>  stringsAsFactors=FALSE)
>
> newdata <- merge(myref, temp, by.x="a", by.y="c", all.x=FALSE, all.y=TRUE)
>
> Sarah
>
> On Tue, Apr 12, 2011 at 11:17 AM, Dimitri Liakhovitski
> <dimitri.liakhovitski_at_gmail.com> wrote:
>> Hello!
>>
>> Below is my example. "myref" is my reference data frame with columns a and b.
>> "temp" is my data with column c analogous to column a in "myref".
>> I am trying to create a new variable b - in "temp" - that matches
>> values from b in "myref" to values in c. If you look at the resulting
>> data frame (temp - at the bottom), you'll notice that rows 19-24 are
>> incorrect.
>> How could one fix it?
>> Thanks a lot!
>>
>> # my reference data frame:
>> a=c("ba ba","ca ca","da da", "lake lake, a", "lake lake, b","lake of",
>>   "lama ca, a","lama ca, b","ma ma")
>> b=c("ba ba","ca ca","OTHER", "lake lake, a", "lake lake, b","lake of",
>>   "lama ca, a","lama ca, b","OTHER")
>> myref<-data.frame(a=a, b=b)
>> (myref)
>>
>> # my data:
>> c<-c(rep("ba ba",3),rep("ca ca",3),rep("da da",3),rep("lake lake, a",3),
>>  rep("lake lake, b",3),rep("lake of",3),rep("lama ca, a",3),
>>  rep("lama ca ,b",3),rep("ma ma",3))
>> temp<-data.frame(c=c)
>> (temp)
>>
>> ### Matching:
>> temp$b<-myref[temp$c,"b"]
>> (temp)
>>
>> --
>> Dimitri Liakhovitski
>> Ninah Consulting
>> www.ninah.com
>>
>
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>

-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 12 Apr 2011 - 16:03:03 GMT
