Re: [R] problems creating data frames

From: John Kane <jrkrideau_at_yahoo.ca>
Date: Sat, 15 Mar 2008 12:27:20 -0400 (EDT)


Another approach might be to start with a list, pad the shorter vectors with NA up to the length of the longest one, and then convert the list to a data.frame. Here is something that Henrique Dallazuanna and Phil Spector suggested to me a couple of days ago as a way of solving a similar problem.

It seems to have the advantage that it will handle characters and factors as well.

worrylist = list(Behavior_Therapy = c(6, 7, 6, 5, 5, 5, 7, 8, 9, 6, 6, 7),
                 Atenolol = c(4, 8, 10, 3, 6, 5, 6, 3, 7, 5, 4, 6),
                 Placebo = c(0, 7, 0, 7, 0, 7))

# length of the longest vector in the list
mlen = max(sapply(worrylist, length))

# pad every shorter vector with NA up to mlen
eqlens = lapply(worrylist, function(x) if (length(x) < mlen)
                c(x, rep(NA, mlen - length(x))) else x)

# build the data frame from the equal-length vectors and print it
kitbag <- do.call(data.frame, eqlens) ; kitbag
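
A slightly more compact variant of the same idea, offered only as a sketch: assigning to length(x) extends a vector and fills the new positions with NA, so the if/else can be dropped. The name pad_to_df is made up for illustration and is not an existing R function; lengths() assumes a reasonably recent version of R.

```r
# pad_to_df is a hypothetical helper name, not part of base R.
# It pads every vector in a list with NA up to the longest length,
# then builds a data frame from the padded list.
pad_to_df <- function(lst) {
  mlen <- max(lengths(lst))
  padded <- lapply(lst, function(x) {
    length(x) <- mlen   # extending a vector fills the new slots with NA
    x
  })
  as.data.frame(padded, stringsAsFactors = FALSE)
}

worrylist <- list(Behavior_Therapy = c(6, 7, 6, 5, 5, 5, 7, 8, 9, 6, 6, 7),
                  Atenolol = c(4, 8, 10, 3, 6, 5, 6, 3, 7, 5, 4, 6),
                  Placebo = c(0, 7, 0, 7, 0, 7))

kitbag <- pad_to_df(worrylist)
dim(kitbag)
```

With the data above, Placebo should end up with its six original values followed by six NAs.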

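As for the names, I think you're running into data.frame's name checking: by default (check.names = TRUE) it runs the column names through make.names(), which replaces characters such as "+", "/" and "-" with dots and then makes the results unique. As a workaround, why not just call the vectors x, y, a and so on, and then do names(mydata) <- c("S+/P+", "S+/P-", and so on)?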
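
As far as I can tell the renaming comes from data.frame's default check.names = TRUE, so another option, sketched here with the data from the original question, is to switch that check off when the data frame is created. Note that the resulting names are not syntactically valid R names, so they need backticks or [[ ]] whenever the columns are referred to later.

```r
# check.names = FALSE asks data.frame() to keep the names exactly as given
# instead of passing them through make.names().
rat.data <- data.frame("S+/P+" = c(25, 23, 18, 16, 12, 19, 20, 21),
                       "S+/P-" = c(18, 17, 16, 11, 14, 15, 21, 12),
                       "S-/P+" = c(20, 12, 15, 13, 8, 17, 17, 18),
                       "S-/P-" = c(12, 15, 17, 10, 18, 10, 9, 14),
                       check.names = FALSE)
names(rat.data)

# Non-syntactic names have to be quoted when the columns are used.
mean(rat.data[["S+/P+"]])
rat.data$`S-/P-`
```

The trade-off is convenience later on: formulas and $ access become more awkward with names like these, which is one reason renaming after the fact with short working names can still be the easier route.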

> This was just discussed this week:
>
> https://stat.ethz.ch/pipermail/r-help/2008-March/157082.html
>
> On Fri, Mar 14, 2008 at 5:01 PM, Will Holcomb <wholcomb_at_gmail.com> wrote:
> > I am having two problems creating data frames for which I have solutions,
> > but they really seem like kludges and I assume I just don't understand the
> > proper R way of doing things.
> >
> > The first situation is I have a set of uneven data vectors. When I try to
> > use them to create a data frame I would like the bottoms of them padded
> > with NAs, without explicitly specifying that. When I do:
> >
> > anxiety.data = data.frame(Behavior_Therapy = c(6, 7, 6, 5, 5, 5, 7, 8, 9, 6, 6, 7),
> >                           Atenolol = c(4, 8, 10, 3, 6, 5, 6, 3, 7, 5, 4, 6),
> >                           Placebo = c(0, 7, 0, 7, 0, 7))
> >
> > It duplicates the values for Placebo twice. I can correct this by doing:
> >
> > anxiety.data = data.frame(Behavior_Therapy = c(6, 7, 6, 5, 5, 5, 7, 8, 9, 6, 6, 7),
> >                           Atenolol = c(4, 8, 10, 3, 6, 5, 6, 3, 7, 5, 4, 6),
> >                           Placebo = c(c(0, 7, 0, 7, 0, 7), rep(NA, 6)))
> >
> > But this requires me to look at the length of the vectors and explicitly
> > pad them. Is there a method to say, "create a data frame and any vectors
> > that are too short should be padded with NAs"?
> >
> > My second situation has to do with the names of columns. When I do:
> >
> > rat.data = data.frame("S+/P+" = c(25,23,18,16,12,19,20,21),
> >                       "S+/P-" = c(18,17,16,11,14,15,21,12),
> >                       "S-/P+" = c(20,12,15,13,8,17,17,18),
> >                       "S-/P-" = c(12,15,17,10,18,10,9,14))
> >
> > I end up with the column names: S..P., S..P..1, S..P..2, and S..P..3. If I
> > rename them using:
> >
> > names(rat.data) = c("S+/P+", "S+/P-", "S-/P+", "S-/P-")
> >
> > Then I get them named what I attempted to name them initially. Is there
> > some way to just have them named what I attempted to name them in the first
> > place? I don't really know why they're being renamed since the names are
> > valid, so I'm not really sure what to try to find to correct the naming.
> >
> > Will
> >
> > ______________________________________________
> > R-help_at_r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >



Received on Sat 15 Mar 2008 - 16:31:29 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 15 Mar 2008 - 17:30:22 GMT.