Re: [R] merge with origin information in new variable names

From: Phil Spector <spector_at_stat.berkeley.edu>
Date: Mon, 25 Apr 2011 10:25:56 -0700 (PDT)

Eric -

     As others have said, you should change the names of the variables in the data frames before you merge them. Here's one implementation of that idea:

    DF.wave.1 <- data.frame(id=1:10,var.A=sample(letters[1:4],10,TRUE))
    DF.wave.2 <- data.frame(id=1:10,var.M=sample(letters[5:8],10,TRUE))
    DF.wave.3 <- data.frame(id=1:10,var.A=sample(letters[5:8],10,TRUE))

    nms = paste('wave',1:3,sep='.')
    dfs = list(DF.wave.1,DF.wave.2,DF.wave.3)
    names(dfs) = nms

    changenm = function(nm){

        df = dfs[[nm]]
        wh = names(df) != 'id'
        names(df)[wh] = paste(names(df)[wh],nm,sep='.')
        df

    }

    Reduce(function(x,y)merge(x,y,by='id'),lapply(names(dfs),changenm))
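
The same idea can be packaged as a function that takes the named list as an argument instead of reading the global dfs. This is only a sketch: merge_with_origin and its 'by' argument are names made up for illustration, and the seed is arbitrary, there only to make the sample data reproducible. The last two lines show why 'suffixes' alone does not do the job: merge() applies it only to non-key columns whose names clash.

```r
# Hypothetical helper (not part of base R or of the code above):
# rename every non-key column, then merge the list pairwise.
merge_with_origin <- function(dfs, by = "id") {
    stopifnot(!is.null(names(dfs)))
    # Append each list name to every column except the key column(s).
    renamed <- Map(function(df, nm) {
        wh <- !(names(df) %in% by)
        names(df)[wh] <- paste(names(df)[wh], nm, sep = ".")
        df
    }, dfs, names(dfs))
    # merge() takes two data frames at a time, so fold it over the list.
    Reduce(function(x, y) merge(x, y, by = by), renamed)
}

set.seed(1)  # arbitrary seed, assumption for reproducibility only
DF.wave.1 <- data.frame(id = 1:10, var.A = sample(letters[1:4], 10, TRUE))
DF.wave.2 <- data.frame(id = 1:10, var.M = sample(letters[5:8], 10, TRUE))
DF.wave.3 <- data.frame(id = 1:10, var.A = sample(letters[5:8], 10, TRUE))

DF.wave.all <- merge_with_origin(
    list(wave.1 = DF.wave.1, wave.2 = DF.wave.2, wave.3 = DF.wave.3)
)
names(DF.wave.all)
# "id" "var.A.wave.1" "var.M.wave.2" "var.A.wave.3"

# For comparison: var.A and var.M do not clash, so 'suffixes' is ignored.
two <- merge(DF.wave.1, DF.wave.2, by = "id", suffixes = c(".wave.1", ".wave.2"))
names(two)
# "id" "var.A" "var.M"
```

Passing the list in explicitly keeps the renaming step independent of whatever happens to be in the workspace.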

On Mon, 25 Apr 2011, Eric Fail wrote:

> Is there anyone out there who can suggest a way to solve this problem?
>
> Thanks,
> Esben
>
> On Sun, Apr 24, 2011 at 8:53 PM, Jeff Newmiller
> <jdnewmil_at_dcn.davis.ca.us> wrote:
>> merge() only lets you combine two tables at a time. It does have a
>> "suffixes" argument that is intended to address your concern, but it
>> applies only to variable names that would conflict.
>>
>> In your example, the id variables are all sequenced exactly the same, so you
>> could actually use cbind rather than merge.
>>
>> However, whether you use merge or cbind, I think the most direct route to
>> your desired result is to rename the data columns before you combine them,
>> using the names function on the left hand side of an assignment with a
>> vector of new names on the right.
>>
>> Eric Fail <eric.fail_at_gmx.com> wrote:
>>>
>>> Dear R-list,
>>>
>>> Here is my simple question,
>>>
>>> I have n data frames that I would like to merge, but I can't figure out
>>> how to add information about the origin of the variable(s).
>>>
>>> Here is my problem,
>>>
>>> DF.wave.1 <- data.frame(id=1:10,var.A=sample(letters[1:4],10,TRUE))
>>> DF.wave.2 <- data.frame(id=1:10,var.M=sample(letters[5:8],10,TRUE))
>>> DF.wave.3 <- data.frame(id=1:10,var.A=sample(letters[5:8],10,TRUE))
>>>
>>> Now I would like to merge the three data frames into one, but append a
>>> suffix to the individual variable names indicating their origin.
>>>
>>> DF.wave.all <- merge(DF.wave.1,DF.wave.2,DF.wave.3,by="id", [what to do
>>> here])
>>>
>>> In other words, I would like it to look like this.
>>>
>>> DF.wave.all
>>>    id var.A.wave.1 var.M.wave.2 var.A.wave.3
>>> 1   1            c            h            j
>>> 2   2            c            e            j
>>> 3   3            c            g            k
>>> 4   4            c            e            j
>>> 5   5            c            g            i
>>> 6   6            d            e            k
>>> 7   7            c            h            k
>>> 8   8            b            g            j
>>> 9   9            b            f            i
>>> 10 10            d            h            i
>>>
>>>
>>> Is there a command I can use directly in merge? 'suffixes' isn't really
>>> handy here.
>>>
>>> Thanks,
>>> Eric
>>> ________________________________
>>> R-help_at_r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html and provide commented, minimal,
>>> self-contained, reproducible code.
>>
>



Received on Mon 25 Apr 2011 - 17:28:56 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 25 Apr 2011 - 17:50:32 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.
