Re: [R] Creating binary variable depending on strings of two dataframes

From: Pete Pete <noxyport_at_gmail.com>
Date: Fri, 06 May 2011 08:35:02 -0700 (PDT)

Gabor Grothendieck wrote:
>
> On Tue, Dec 7, 2010 at 11:30 AM, Pete Pete <noxyport_at_gmail.com>
> wrote:

>>
>> Hi,
>> consider the following two dataframes:
>> x1=c("232","3454","3455","342","13")
>> x2=c("1","1","1","0","0")
>> data1=data.frame(x1,x2)
>>
>> y1=c("232","232","3454","3454","3455","342","13","13","13","13")
>> y2=c("E1","F3","F5","E1","E2","H4","F8","G3","E1","H2")
>> data2=data.frame(y1,y2)
>>
>> I need a new column in dataframe data1 (x3), which is either 0 or 1
>> depending on whether y2 equals "E1" in any row of data2 where y1 = x1.
>> The result for data1 should look like this:
>>     x1 x2 x3
>> 1  232  1  1
>> 2 3454  1  1
>> 3 3455  1  0
>> 4  342  0  0
>> 5   13  0  1
>>
>> I think a SQL command could help me but I am too inexperienced with it to
>> get there.
>>

>
> Try this:
>
>> library(sqldf)
>> sqldf("select x1, x2, max(y2 = 'E1') x3 from data1 d1 left join data2 d2
>> on (x1 = y1) group by x1, x2 order by d1.rowid")

>     x1 x2 x3
> 1  232  1  1
> 2 3454  1  1
> 3 3455  1  0
> 4  342  0  0
> 5   13  0  1
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

That works well, but I need to automate it a bit more. Consider the following example:

list1=c("A01","B04","A64","G84","F19")

x1=c("232","3454","3455","342","13")
x2=c("1","1","1","0","0")
data1=data.frame(x1,x2)

y1=c("232","232","3454","3454","3455","342","13","13","13","13")
y2=c("E13","B04","F19","A64","E22","H44","F68","G84","F19","A01")
data2=data.frame(y1,y2)

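I now want to write a loop that adds a new binary variable to data1 for every value in list1: the variable is 1 if that value occurs in y2 for a row of data2 where y1 = x1, and 0 otherwise. The result should look like this: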

x1	x2	A01	B04	A64	G84	F19
232	1	0	1	0	0	0
3454	1	0	0	1	0	1
3455	1	0	0	0	0	0
342	0	0	0	0	0	0
13	0	1	0	0	1	1

Thanks!
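
For reference, here is a minimal base R sketch of one way such a loop could look. It is not the sqldf approach from the reply above; it swaps in %in% matching instead, and it assumes the data frames are built with stringsAsFactors = FALSE so that x1, y1 and y2 are compared as plain character strings. The names code and ids_with_code are made up for illustration.

```r
# Data from the post; stringsAsFactors = FALSE keeps the IDs as character
list1 <- c("A01", "B04", "A64", "G84", "F19")

data1 <- data.frame(x1 = c("232", "3454", "3455", "342", "13"),
                    x2 = c("1", "1", "1", "0", "0"),
                    stringsAsFactors = FALSE)

data2 <- data.frame(y1 = c("232", "232", "3454", "3454", "3455",
                           "342", "13", "13", "13", "13"),
                    y2 = c("E13", "B04", "F19", "A64", "E22",
                           "H44", "F68", "G84", "F19", "A01"),
                    stringsAsFactors = FALSE)

# For each value in list1, add a 0/1 column to data1:
# 1 if some row of data2 has y1 equal to this x1 and y2 equal to the value
for (code in list1) {
  ids_with_code <- data2$y1[data2$y2 == code]   # IDs that carry this code
  data1[[code]] <- as.integer(data1$x1 %in% ids_with_code)
}

print(data1)
```

With the example data this should give the same 0/1 pattern as the table above. A value in list1 that never occurs in y2 simply yields a column of zeros.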

--
View this message in context: http://r.789695.n4.nabble.com/Creating-binary-variable-depending-on-strings-of-two-dataframes-tp3076724p3503334.html
Sent from the R help mailing list archive at Nabble.com.

Received on Fri 06 May 2011 - 17:04:48 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 06 May 2011 - 18:10:05 GMT.