Re: [R] speeding up loop and dealing with memory problems

From: ONKELINX, Thierry <Thierry.ONKELINX_at_inbo.be>
Date: Mon, 28 Jul 2008 15:43:40 +0200

Dear Denise,

It looks like you want to replace every NA in the dataset with 0. The line below does that without any loops, and because the replacement is vectorised it should be rather fast.

dat[is.na(dat)] <- 0

> dat <- matrix(rbinom(40, 1, 0.75), ncol = 4, nrow = 10)
> dat[dat == 0] <- NA
> dat
      [,1] [,2] [,3] [,4]
 [1,]    1    1    1    1
 [2,]    1    1   NA    1
 [3,]   NA    1   NA   NA
 [4,]    1    1   NA    1
 [5,]    1    1    1   NA
 [6,]    1    1    1   NA
 [7,]    1    1    1    1
 [8,]    1    1    1   NA
 [9,]   NA    1    1    1
[10,]    1    1    1    1
> 
> dat[is.na(dat)] <- 0
> dat
      [,1] [,2] [,3] [,4]
 [1,]    1    1    1    1
 [2,]    1    1    0    1
 [3,]    0    1    0    0
 [4,]    1    1    0    1
 [5,]    1    1    1    0
 [6,]    1    1    1    0
 [7,]    1    1    1    1
 [8,]    1    1    1    0
 [9,]    0    1    1    1
[10,]    1    1    1    1
>
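
One caveat on memory: is.na(dat) builds a logical matrix as large as the whole dataset before anything is replaced, and with 835353 x 86 cells that may itself be a problem under a 1535Mb limit. A minimal sketch of a column-by-column variant follows, assuming dat is a data frame (the original post does not say whether it is a matrix or a data frame). The small data frame and the seed are made up purely for illustration.

```r
# Hypothetical stand-in for the real data: a small data frame with some NA.
set.seed(1)
dat <- as.data.frame(matrix(rbinom(40, 1, 0.75), ncol = 4, nrow = 10))
dat[dat == 0] <- NA

# Single column only: vectorised replacement of NA in column 4.
dat[[4]][is.na(dat[[4]])] <- 0

# All columns, one at a time, so that only one column-sized logical
# vector is created per step instead of a full-sized logical matrix.
for (j in seq_along(dat)) {
  dat[[j]][is.na(dat[[j]])] <- 0
}

# Check that no NA is left.
any(is.na(dat))
```

This keeps each step vectorised over the rows while looping only over the 86 columns, which is usually a reasonable trade-off between speed and peak memory. Whether it is enough to stay under the limit depends on the data and the R session.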

HTH, Thierry




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx_at_inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Denise Xifara
Sent: Monday 28 July 2008 15:15
To: r-help_at_r-project.org
Subject: [R] speeding up loop and dealing with memory problems

Dear All and Mark,

Given a dataset that I have called dat, I was hoping to speed up the following loop:

for (i in 1:835353) {
  for (j in 1:86) {
    if (is.na(dat[i, j]) == TRUE) { dat[i, j] <- 0 }
  }
}

Actually I am also having a memory problem. I get the following:

Error: cannot allocate vector of size 3.2 Mb
In addition: Warning messages:
1: In dat[i, j] <- 0 :
  Reached total allocation of 1535Mb: see help(memory.size)
2: In dat[i, j] <- 0 :
  Reached total allocation of 1535Mb: see help(memory.size)
3: In dat[i, j] <- 0 :
  Reached total allocation of 1535Mb: see help(memory.size)
4: In dat[i, j] <- 0 :
  Reached total allocation of 1535Mb: see help(memory.size)

If I try to apply the loop to just one column rather than the whole dataset, so that I don't run into the memory problem, i.e.

for (i in 1:835353) {
  if (is.na(dat[i, 4]) == TRUE) { dat[i, 4] <- 0 }
}

it takes ridiculously long to process, so I was hoping that there would be a
quicker way to do this.

Thank you all very much for the help,
Denise




R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Mon 28 Jul 2008 - 13:53:08 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 28 Jul 2008 - 14:32:42 GMT.
