Re: [R] Re: [Rd] corrupt data frame: columns will be truncated or padded with NAs in:, digits = digits)

From: Prof Brian Ripley
Date: Tue 15 Feb 2005 - 00:23:48 EST

On Mon, 14 Feb 2005, Gregor GORJANC wrote:

> Sending this also to r-help so anyone can read it also there and maybe also
> help me with my puzzle if this trivial and I don't see it.

Please don't, and especially do not after having removed the context. So I have removed R-help from the follow-up.

> Prof Brian Ripley wrote:
> [... removed some ...]

The question I answered has been removed here, which is discourteous both to your helper and to your readers.

>> You add a column, not replace part of a non-existent column. Isn't that
>> obvious, given what you wrote?

Not if you subsequently remove what you wrote, of course.

> # OK. If I do
> tmp <- data.frame(y1=1:4, f1=factor(c("A", "B", "C", "D")))
> tmp[1:2, "y2"] <- 2
> tmp
> # I am changing nonexistent column y2 in data frame tmp.
> # If I do
> tmp <- data.frame(y1=1:4, f1=factor(c("A", "B", "C", "D")))
> tmp$y2 <- NA
> tmp[1:2, "y2"] <- 2
> tmp
> # I am changing existent column. I understand now the difference. However,
> # it is weird for me that this is OK (if column y2 does not yet exist)
> tmp["y2"] <- 2
> # but this is not
> tmp[1:2, "y2"] <- 2

What is `weird' is your insistence that this makes sense. Columns in a data frame are required to be the same length. How is the new column supposed to be made up to the correct length? That is possible for a numeric column by padding with NAs, but not sensible for a raw column or a data frame column or ....
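
As a minimal sketch of the point above (not part of the original exchange), one way to stay within the equal-length rule is to create the full-length column yourself, stating the padding value explicitly, and only then replace part of it. The choice of NA_real_ as padding is an assumption that suits a numeric column; other column types would need their own choice, and a raw column has no NA to pad with.

```r
# Sketch: add a full-length column first, then fill in part of it.
tmp <- data.frame(y1 = 1:4, f1 = factor(c("A", "B", "C", "D")))

# Every column must have nrow(tmp) entries, so choose the padding explicitly.
# NA_real_ is an assumed choice, appropriate because y2 is meant to be numeric.
tmp$y2 <- NA_real_        # the single value is recycled to all 4 rows

# y2 now exists, so this replaces part of an existing column.
tmp[1:2, "y2"] <- 2       # y2 is now 2, 2, NA, NA

print(tmp)
```

Doing the padding by hand keeps the decision about what fills the remaining rows with the user rather than leaving it to the replacement method.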

>> There is a lot of basic documentation on data manipulation in R/S, and a
>> whole chapter in MASS4. Somehow most other people don't seem to find this
>> a problem.
> I just ordered MASS4 last week and I am eager to get it in my hands. In
> meanwhile I read quite some documentation and what I more or less saw is
> tmp <- data.frame(y1=1:4, f1=factor(c("A", "B", "C", "D")))
> tmp$y2 <- 1:4
> tmp$y3 <- 2*tmp$y1
> ...
> ...
> i.e. everybody is adding full column to data frame. But I would like to add
> just one part.

But you cannot do so and not get a corrupt data frame. All you can hope for is that a whole column is added, with something arbitrary appended to your input to make it up to the required length.
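
To make "corrupt" concrete, here is a hedged sketch of a check that every column has as many entries as the data frame has rows. The helper name columns_consistent is hypothetical and not part of R, and the corrupt object is assembled by hand with structure() purely for illustration; it is not claimed to be what any particular R version produces from the assignment discussed above.

```r
# Hypothetical helper (not part of R): TRUE if every column has nrow(df) entries.
columns_consistent <- function(df) {
  # unclass() exposes the underlying list; NROW() also copes with matrix columns.
  all(vapply(unclass(df), NROW, integer(1)) == nrow(df))
}

good <- data.frame(y1 = 1:4, f1 = factor(c("A", "B", "C", "D")))

# A deliberately malformed object: 4 row names, but y2 has only 2 values.
bad <- structure(list(y1 = 1:4, y2 = c(2, 2)),
                 class = "data.frame", row.names = 1:4)

ok_good <- columns_consistent(good)   # TRUE
ok_bad  <- columns_consistent(bad)    # FALSE: y2 is shorter than the row count
```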

Brian D. Ripley,        
Professor of Applied Statistics,
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Mon Feb 14 23:29:59 2005
