Re: [R] dataframe operation

From: Marc Schwartz <marc_schwartz_at_comcast.net>
Date: Wed 24 Jan 2007 - 20:56:48 GMT

On Wed, 2007-01-24 at 14:16 -0600, Marc Schwartz wrote:
> On Wed, 2007-01-24 at 14:10 -0600, Marc Schwartz wrote:

> > On Wed, 2007-01-24 at 20:27 +0100, Indermaur Lukas wrote:
> > > hi
> > > i have a dataframe "a" which looks like:
> > >
> > > column1, column2, column3
> > > 10,12, 0
> > > NA, 0,1
> > > 12,NA,50
> > >
> > > i want to replace all values in column1 to column3 which do not contain "NA" with values of vector "b" (100,200,300).
> > >
> > > any idea how i can do it?
> > >
> > > i appreciate any hint
> > > regards
> > > lukas
> > >
> >
> > Here is one possibility:
> >
> > > sapply(seq(along = colnames(DF)),
> >          function(x) ifelse(is.na(DF[[x]]), 100 * x, DF[[x]]))
> >      [,1] [,2] [,3]
> > [1,]   10   12    0
> > [2,]  100    0    1
> > [3,]   12  200   50
> >
> >
> > Note that the returned object will be a matrix, so if you need a data
> > frame, just coerce the result with as.data.frame().

>
> OK....that's what I get for pulling the trigger too fast.
>
> Just reverse the logic in the function:
>
> > sapply(seq(along = colnames(DF)),
>          function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))
>      [,1] [,2] [,3]
> [1,]  100  200  300
> [2,]   NA  200  300
> [3,]  100   NA  300
>
>
> I misread the query initially.
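
A minimal, self-contained sketch of the corrected call, for anyone who wants to run it. The construction of DF is an assumption based on the three rows shown in the question, and the names res and res_df are only illustrative.

```r
# Assumed reconstruction of the data frame from the original question
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))

# Replace every non-NA value in column i with 100 * i; NA cells stay NA
res <- sapply(seq_along(DF),
              function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))

# sapply() returns a plain matrix here, so coerce it and restore the names
res_df <- as.data.frame(res)
names(res_df) <- names(DF)
res_df
```

The coercion step loses the original column names (the matrix has none), which is why they are copied back explicitly.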

Here is another possibility, which may be faster depending upon the actual size and dimensions of your initial data frame.

Preallocate a matrix of replacement values:

Mat <- matrix(rep(seq(along = colnames(DF)) * 100, each = nrow(DF)),
              ncol = ncol(DF))

> Mat
     [,1] [,2] [,3]
[1,]  100  200  300
[2,]  100  200  300
[3,]  100  200  300
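
The replacement values above happen to be 100, 200, 300, so they can be generated from the column index. If the vector "b" from the question held arbitrary values, the same matrix could be sketched as follows. The construction of DF is assumed from the question, and b is written out by hand here.

```r
# Assumed reconstruction of the data frame from the original question
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))

# Replacement vector "b" from the question, one value per column
b <- c(100, 200, 300)

# byrow = TRUE recycles b along each row, so column i holds b[i] throughout
Mat <- matrix(b, nrow = nrow(DF), ncol = ncol(DF), byrow = TRUE)
Mat
```

This assumes length(b) equals ncol(DF); with a different length, recycling would silently misalign the values.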


Now do the replacement:

> ifelse(!is.na(DF), Mat, NA)
  column1 column2 column3
1     100     200     300
2      NA     200     300
3     100      NA     300
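
Note that ifelse() returns a matrix here as well. If the result should stay a data frame, one alternative (not part of the solution above, just a sketch) is to assign through a logical matrix index. DF is again an assumed reconstruction, and keep and DF2 are illustrative names.

```r
# Assumed reconstruction of the data frame from the original question
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))

# Replacement values, one per column, laid out as a matrix shaped like DF
b <- c(100, 200, 300)
Mat <- matrix(b, nrow = nrow(DF), ncol = ncol(DF), byrow = TRUE)

keep <- !is.na(DF)       # logical matrix, TRUE where a value is present
DF2 <- DF                # work on a copy so DF itself is left unchanged
DF2[keep] <- Mat[keep]   # overwrite only the non-NA cells
DF2
```

Both keep and Mat are traversed column by column, so each cell receives the value for its own column.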


In doing some testing, the above may be about 10 times faster than using sapply() in my first solution, again depending upon the structure of your DF.
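
To check the timing on your own machine, something along these lines could be used. The test data, its size (n, p) and the seed are arbitrary assumptions, not the data behind the figure quoted above, and the ratio you see will depend on them.

```r
set.seed(1)              # arbitrary seed, only so the run is repeatable
n <- 20000               # assumed number of rows
p <- 20                  # assumed number of columns

# Made-up data frame with roughly 10% NA cells
big <- as.data.frame(matrix(sample(c(NA, 1:9), n * p, replace = TRUE),
                            ncol = p))

# First approach: column-by-column ifelse() via sapply()
t_sapply <- system.time(
  r1 <- sapply(seq_along(big),
               function(x) ifelse(!is.na(big[[x]]), 100 * x, big[[x]]))
)

# Second approach: preallocated matrix and a single ifelse() call
t_matrix <- system.time({
  Mat <- matrix(rep(seq_along(big) * 100, each = nrow(big)), ncol = ncol(big))
  r2 <- ifelse(!is.na(big), Mat, NA)
})

# Both approaches should agree cell for cell (dimnames aside)
stopifnot(isTRUE(all.equal(unname(r1), unname(r2))))

print(t_sapply[["elapsed"]])
print(t_matrix[["elapsed"]])
```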

HTH, Marc



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu Jan 25 14:57:23 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 25 Jan 2007 - 05:30:29 GMT.
