Re: [R] dataframe operation

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Thu 25 Jan 2007 - 14:32:00 GMT

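In conversing offline with Indermaur, it seems that the elements of b are supposed to correspond to the rows rather than the columns. In that case there is a simpler solution. Provided b has one element per row, 0 * DF leaves each NA in place and turns every other (finite) value into 0, and adding b then recycles it down each column: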

0 * DF + b
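
A minimal reproducible sketch of this one-liner follows. The construction of DF and b is a reading of the original post, not something stated in this message. It assumes b has exactly one element per row and that DF contains no Inf or NaN values, since 0 * Inf gives NaN rather than 0.

```r
# Data frame reconstructed from the original post (its exact form is an assumption)
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))
b <- c(100, 200, 300)   # assumed: one replacement value per row of DF

# 0 * DF is 0 where a value is present and NA where it is missing.
# Adding b recycles it down each column, so row i receives b[i].
res <- 0 * DF + b
res
```

Every non-NA entry in row i should come back as b[i], with the NA pattern of DF unchanged.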

On 1/24/07, Gabor Grothendieck <ggrothendieck@gmail.com> wrote:
> Here is a slight variation on Marc's idea:
>
> isna <- is.na(DF)
> DF[] <- replace(100 * col(isna), isna, NA)
>
> On 1/24/07, Marc Schwartz <marc_schwartz@comcast.net> wrote:
> > On Wed, 2007-01-24 at 14:16 -0600, Marc Schwartz wrote:
> > > On Wed, 2007-01-24 at 14:10 -0600, Marc Schwartz wrote:
> > > > On Wed, 2007-01-24 at 20:27 +0100, Indermaur Lukas wrote:
> > > > > hi
> > > > > i have a dataframe "a" which looks like:
> > > > >
> > > > > column1, column2, column3
> > > > > 10,12, 0
> > > > > NA, 0,1
> > > > > 12,NA,50
> > > > >
> > > > > i want to replace all values in column1 to column3 which do not contain "NA" with values of vector "b" (100,200,300).
> > > > >
> > > > > any idea how i can do it?
> > > > >
> > > > > i appreciate any hint
> > > > > regards
> > > > > lukas
> > > > >
> > > >
> > > > Here is one possibility:
> > > >
> > > > > sapply(seq(along = colnames(DF)),
> > > >          function(x) ifelse(is.na(DF[[x]]), 100 * x, DF[[x]]))
> > > >      [,1] [,2] [,3]
> > > > [1,]   10   12    0
> > > > [2,]  100    0    1
> > > > [3,]   12  200   50
> > > >
> > > >
> > > > Note that the returned object will be a matrix, so if you need a data
> > > > frame, just coerce the result with as.data.frame().
> > >
> > > OK....that's what I get for pulling the trigger too fast.
> > >
> > > Just reverse the logic in the function:
> > >
> > > > sapply(seq(along = colnames(DF)),
> > >          function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))
> > >      [,1] [,2] [,3]
> > > [1,]  100  200  300
> > > [2,]   NA  200  300
> > > [3,]  100   NA  300
> > >
> > >
> > > I misread the query initially.
> >
> > Here is another possibility, which may be faster depending upon the
> > actual size and dims of your initial data frame.
> >
> > Preallocate a matrix of replacement values:
> >
> > Mat <- matrix(rep(seq(along = colnames(DF)) * 100, each = nrow(DF)),
> >               ncol = ncol(DF))
> >
> > > Mat
> >      [,1] [,2] [,3]
> > [1,]  100  200  300
> > [2,]  100  200  300
> > [3,]  100  200  300
> >
> >
> > Now do the replacement:
> >
> > > ifelse(!is.na(DF), Mat, NA)
> >   column1 column2 column3
> > 1     100     200     300
> > 2      NA     200     300
> > 3     100      NA     300
> >
> >
> > In doing some testing, the above may be about 10 times faster than using
> > sapply() in my first solution, again depending upon the structure of
> > your DF.
> >
> > HTH,
> >
> > Marc
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
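
For comparison, the three column-wise suggestions quoted above can be checked against each other. This is only a sketch: DF is reconstructed from the original post, seq_along() is swapped in for seq(along = ...), and the names by_sapply, by_matrix and by_replace are made up for illustration.

```r
# Data frame reconstructed from the original post (its exact form is an assumption)
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))

# sapply() version: non-NA entries in column x become 100 * x
by_sapply <- sapply(seq_along(DF),
                    function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))

# Preallocated matrix of replacement values, then one vectorised ifelse()
Mat <- matrix(rep(seq_along(DF) * 100, each = nrow(DF)), ncol = ncol(DF))
by_matrix <- ifelse(!is.na(DF), Mat, NA)

# replace()/col() variation: build 100 * column index, then put the NAs back
isna <- is.na(DF)
by_replace <- DF
by_replace[] <- replace(100 * col(isna), isna, NA)

# Compare the results after dropping row and column names
all.equal(unname(by_sapply), unname(by_matrix))
all.equal(unname(by_sapply), unname(as.matrix(by_replace)))
```

The first two results are matrices, while the replace() variation keeps DF as a data frame, which is why it is converted with as.matrix() before comparing.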



Received on Fri Jan 26 01:48:31 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 25 Jan 2007 - 15:30:30 GMT.
