Re: [R] dataframe operation

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Wed 24 Jan 2007 - 21:21:09 GMT

Here is a slight variation on Marc's idea:

isna <- is.na(DF)
DF[] <- replace(100 * col(isna), isna, NA)
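
For anyone wanting to try this, here is a minimal self-contained sketch. The construction of DF below is an assumed reconstruction of the example data from the original question (it was never posted as code in this thread), so adjust it to your real data:

```r
# Assumed reconstruction of the example data from the original question
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))

# Logical matrix marking the NA cells (same dimensions as DF)
isna <- is.na(DF)

# col(isna) holds each cell's column index, so 100 * col(isna) is
# 100, 200, 300 down columns 1, 2, 3. replace() then puts NA back
# wherever the original value was NA. Assigning into DF[] keeps DF
# a data frame with its original column names.
DF[] <- replace(100 * col(isna), isna, NA)

DF
```

Note that this overwrites DF in place, so work on a copy if the original values are still needed. The multiplier 100 * col(isna) only matches vector "b" because b happens to be 100, 200, 300; for an arbitrary b, something like b[col(isna)] should serve the same purpose.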

On 1/24/07, Marc Schwartz <marc_schwartz@comcast.net> wrote:
> On Wed, 2007-01-24 at 14:16 -0600, Marc Schwartz wrote:
> > On Wed, 2007-01-24 at 14:10 -0600, Marc Schwartz wrote:
> > > On Wed, 2007-01-24 at 20:27 +0100, Indermaur Lukas wrote:
> > > > hi
> > > > i have a dataframe "a" which looks like:
> > > >
> > > > column1, column2, column3
> > > > 10, 12, 0
> > > > NA, 0, 1
> > > > 12, NA, 50
> > > >
> > > > i want to replace all values in column1 to column3 that are not "NA" with the corresponding value of vector "b" (100,200,300), i.e. 100 in column1, 200 in column2 and 300 in column3.
> > > >
> > > > any idea how i can do it?
> > > >
> > > > i appreciate any hint
> > > > regards
> > > > lukas
> > > >
> > >
> > > Here is one possibility:
> > >
> > > > sapply(seq(along = colnames(DF)),
> > >          function(x) ifelse(is.na(DF[[x]]), 100 * x, DF[[x]]))
> > >      [,1] [,2] [,3]
> > > [1,]   10   12    0
> > > [2,]  100    0    1
> > > [3,]   12  200   50
> > >
> > >
> > > Note that the returned object will be a matrix, so if you need a data
> > > frame, just coerce the result with as.data.frame().
> >
> > OK....that's what I get for pulling the trigger too fast.
> >
> > Just reverse the logic in the function:
> >
> > > sapply(seq(along = colnames(DF)),
> >          function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))
> >      [,1] [,2] [,3]
> > [1,]  100  200  300
> > [2,]   NA  200  300
> > [3,]  100   NA  300
> >
> >
> > I misread the query initially.
>
> Here is another possibility, which may be faster depending upon the
> actual size and dims of your initial data frame.
>
> Preallocate a matrix of replacement values:
>
> Mat <- matrix(rep(seq(along = colnames(DF)) * 100, each = nrow(DF)),
>               ncol = ncol(DF))
>
> > Mat
>      [,1] [,2] [,3]
> [1,]  100  200  300
> [2,]  100  200  300
> [3,]  100  200  300
>
> Now do the replacement:
>
> > ifelse(!is.na(DF), Mat, NA)
>   column1 column2 column3
> 1     100     200     300
> 2      NA     200     300
> 3     100      NA     300
>
> In doing some testing, the above may be about 10 times faster than using
> sapply() in my first solution, again depending upon the structure of
> your DF.
>
> HTH,
>
> Marc
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
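
On the timing remark quoted above: a rough way to check it on your own machine might look like the sketch below. The data frame "big" is a hypothetical test case made up for timing only, and the numbers will depend on the size and structure of your data, so treat this as an illustration and not as a benchmark result.

```r
# Hypothetical larger data frame, only for timing purposes
set.seed(1)
big <- as.data.frame(matrix(sample(c(NA, 1:10), 1e5 * 10, replace = TRUE),
                            ncol = 10))

# Column-by-column sapply() version (the corrected first solution)
t_sapply <- system.time(
  res1 <- sapply(seq(along = colnames(big)),
                 function(x) ifelse(!is.na(big[[x]]), 100 * x, big[[x]]))
)

# Preallocated-matrix version (the second solution)
t_matrix <- system.time({
  Mat <- matrix(rep(seq(along = colnames(big)) * 100, each = nrow(big)),
                ncol = ncol(big))
  res2 <- ifelse(!is.na(big), Mat, NA)
})

# Both give the same values; only the dimnames differ, so drop them
same_values <- isTRUE(all.equal(unname(res1), unname(res2)))

# Elapsed seconds for each approach
c(sapply = t_sapply[["elapsed"]], matrix = t_matrix[["elapsed"]])
```

A single system.time() call is noisy, so repeating each timing a few times before drawing conclusions is advisable.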

Received on Thu Jan 25 15:34:49 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 25 Jan 2007 - 15:30:30 GMT.
