Re: [R] Problem with binding data-frames

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Tue, 19 Jun 2007 11:19:23 +0100 (BST)

On Tue, 19 Jun 2007, Junnila, Jouni wrote:

> Hi,
>
> Yes, I'm aware that the problem is that I have differing number of
> columns in the different datasets. My question still remains. Is there
> some way I can allow column numbers to be different, or is there some
> other way combining these datasets?

Not with rbind. You can with merge, if you have a suitable index column. Using R 2.5.1 beta:

> Day1 <- data.frame(x=1:2, id=1:2)
> Day2 <- data.frame(x=3:4, y=10:11, id=3:4)
> rbind(Day1, Day2)
Error in match.names(clabs, names(xi)) : names do not match previous names
> merge(Day1, Day2, all=TRUE)
  x id  y
1 1  1 NA
2 2  2 NA
3 3  3 10
4 4  4 11
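
If no suitable index column is available, another route is to pad each data frame with NA columns until all of them share the same names, and only then call rbind. The following is a minimal sketch of that idea, not a base R facility: the helper name rbind_fill is made up for illustration, and it assumes that columns with the same name hold compatible types.

```r
# Hypothetical helper (not part of base R): row-bind data frames whose
# column sets differ, padding the absent columns with NA.
rbind_fill <- function(...) {
  dfs <- list(...)
  # Union of all column names, in order of first appearance
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    # Add each column this data frame lacks, filled with NA
    for (col in setdiff(all_cols, names(d))) {
      d[[col]] <- rep(NA, nrow(d))
    }
    d[all_cols]  # put the columns in the same order everywhere
  })
  do.call(rbind, padded)
}

Day1 <- data.frame(x = 1:2, id = 1:2)
Day2 <- data.frame(x = 3:4, y = 10:11, id = 3:4)
combined <- rbind_fill(Day1, Day2)
print(combined)
```

Because the padding is a no-op when the column names already agree, the same call could be used for every group of datasets without knowing in advance which groups have extra columns.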

>
> Thanks,
> -Jouni
>
> On Mon, 18 Jun 2007, Petr Klasterecky wrote:
>
>> Junnila, Jouni wrote:
>>> Hello,
>>>
>>> I'm having a problem concerning r-binding datasets.
>>>
>>> I have six datasets, from six different plates, and two different
>>> days. I want to combine these datasets together for analysis. Datasets
>>> from day 2 have all the same columns as datasets from day 1. However,
>>> in addition, there are a few more columns in day 2. Thus, using rbind
>>> for this results in an error, because the objects are not the same
>>> length.
>>>
>>> Error in paste(nmi[nii == 0L], collapse = ", ") :
>>> object "nii" not found
>>> In addition: Warning message:
>>> longer object length
>>> is not a multiple of shorter object length in: clabs == nmi
>>
>> Hi,
>>
>> 1. the error has nothing to do with differing lengths of your objects
>> - that's what the following warning is about. The error occurred
>> because your indexing object 'nii' does not exist where R is looking
>> for it.
>
> It's because the dataframes have differing number of columns, and that
> has not been allowed for in the error message in that version of R.
>
>> 2. using rbind on dataframes is bad practice, since the input is
>> converted to matrices if possible. Use merge() instead.
>
> Not so: rbind on data frames does no such conversion, and is not
> problematic provided they have the same column names (and hence the same
> number of columns). You may have missed in ?rbind
>
> The functions 'cbind' and 'rbind' are S3 generic, with methods for
> data frames. The data frame method will be used if at least one
> argument is a data frame and the rest are vectors or matrices.
> ...
>
> and a later description of the data frame method for 'rbind'.
>
>>
>> Petr
>>
>>>
>>>
>>> What I need is to combine all six together, and give for example an
>>> NA value in day 1 for those columns which can only be found in day 2.
>>> Is this somehow possible?
>>>
>>> I have several of these six-dataset groups, and only a few of them
>>> have the problem described above, and I cannot know in advance which.
>>> With most of the groups writing
>>> rbind(data1,data2,data3,data4,data5,data6)
>>> works easily, but these few problematic groups need also to be
>>> combined...
>>> Any help greatly appreciated!
>>>
>>> -Jouni
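
On the quoted point about rbind and matrices: a small check along the following lines may help to see that the data frame method keeps each column's type, whereas binding matrices does coerce. It is only an illustrative sketch; the column names plate and value and the data in them are made up.

```r
# Two data frames with identical column names but mixed column types
# (the columns 'plate' and 'value' are invented for this illustration)
a <- data.frame(plate = c("P1", "P2"), value = c(0.5, 1.5),
                stringsAsFactors = FALSE)
b <- data.frame(plate = "P3", value = 2.5, stringsAsFactors = FALSE)

# The data frame method of rbind is dispatched: the result is still a
# data frame and 'value' stays numeric
ab <- rbind(a, b)
print(is.data.frame(ab))
print(sapply(ab, class))

# For contrast, converting to matrices first coerces everything to character
m <- rbind(as.matrix(a), as.matrix(b))
print(mode(m))
```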

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 19 Jun 2007 - 10:42:30 GMT