Re: [R] Memory Problems with a Simple Bootstrap

From: Tom La Bone <booboo_at_gforcecable.com>
Date: Fri, 01 Aug 2008 10:36:59 -0700 (PDT)

Same problem. The Windows Task Manager indicated that Rgui.exe was using 1,249,722 K of memory when the error occurred. This is R 2.7.1 by the way.

> library(boot)
> setwd("C:/Documents and Settings/Tom/Desktop")
>
> data.in <- read.csv("inputdata.csv",header=T,as.is=T)
>
> per95 <- function( annual.data, b.index) {
+   sample.data <- annual.data[b.index,]
+   return(quantile(sample.data$Result,probs=c(0.95))) }
>
> m <- 10000
> for (i in 1:39) {
+   annual.data <- data.in[data.in$Year == (i+1949),]
+   B <- boot(data=annual.data,statistic=per95,R=m)
+   gc()
+   print(i)  
+   print(object.size(B))
+   print(memory.size())
+ }
[1] 1
[1] 90352
[1] 12.35335
[1] 2
[1] 111032
[1] 12.39024
[1] 3
[1] 155544
[1] 12.48451
[1] 4
[1] 159064
[1] 11.10526
[1] 5
[1] 243456
[1] 11.23505
[1] 6
[1] 280592
[1] 12.74642
[1] 7
[1] 302416
[1] 11.33087
[1] 8
[1] 319752
[1] 12.84377
[1] 9
[1] 351448
[1] 11.42264
Error: cannot allocate vector of size 284.4 Mb
>
>
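
A note on the numbers above, offered as an assumption and not as something confirmed in this thread: 284.4 Mb is what a 10000-row matrix of 4-byte integers would occupy if the year being processed had about 7455 rows, and the 568.8 Mb in the original post is exactly twice that, as it would be for 8-byte doubles. That would be consistent with the failed allocation being a full R-by-n array of resampling indices for a large year, not memory leaking between iterations. The sketch below uses only base R and a hypothetical stand-in for one year of data (the real inputdata.csv is not available here); it computes that size and then bootstraps the 95th percentile one replicate at a time, so only n indices exist at any moment.

```r
# Hypothetical stand-in for one large year; 7455 rows is an assumed size
# chosen because it reproduces the 284.4 Mb figure, not a value from the data.
set.seed(1)
annual.data <- data.frame(Result = rnorm(7455))

m <- 10000                 # number of bootstrap replicates, as in the post
n <- nrow(annual.data)

# Size in Mb of an m-by-n matrix of 4-byte integer indices
index.mb <- m * n * 4 / 1024^2
print(index.mb)

# Resample one replicate at a time instead of building all indices up front
per95.boot <- vapply(seq_len(m), function(r) {
  idx <- sample.int(n, n, replace = TRUE)
  quantile(annual.data$Result[idx], probs = 0.95, names = FALSE)
}, numeric(1))

# Bootstrap standard error of the 95th percentile
print(sd(per95.boot))
```

This trades speed for memory. Newer releases of the boot package than the one used here document a simple=TRUE argument to boot() intended for the same purpose; whether it is available depends on the installed version.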

jholtman wrote:
>
> Use gc() in the loop to possibly free up any fragmented memory. You
> might also print out the size of B (object.size(B)) since that appears
> to be the only variable in your loop that might be growing.
>
> On Fri, Aug 1, 2008 at 12:09 PM, Tom La Bone <booboo_at_gforcecable.com> wrote:
>>
>>
>> I have a data file called inputdata.csv that looks something like this:
>>
>>      ID Year Result Month   Date
>> 1  7174 1954     10     3 540301
>> 2  7174 1954      4     3 540322
>> 3 20924 1967      4     2 670223
>> 4 20924 1967     -7     5 670518
>> 5 20924 1967     -3     7 670706
>> ...
>> 67209 ...
>>
>> i.e., it goes on for 67209 rows (~2 Mb file). When I run the following
>> bootstrap session I get the indicated error:
>>
>>>
>>> library(boot)
>>> setwd("C:/Documents and Settings/Tom/Desktop")
>>>
>>> data.in <- read.csv("inputdata.csv",header=T,as.is=T)
>>>
>>> per95 <- function( annual.data, b.index) {
>> +   sample.data <- annual.data[b.index,]
>> +   return(quantile(sample.data$Result,probs=c(0.95))) }
>>>
>>> m <- 10000
>>> for (i in 1:39) {
>> +   annual.data <- data.in[data.in$Year == (i+1949),]
>> +   B <- boot(data=annual.data,statistic=per95,R=m)
>> +   print(i)
>> +   print(memory.size())
>> + }
>> [1] 1
>> [1] 20.26163
>> [1] 2
>> [1] 61.6352
>> [1] 3
>> [1] 134.4187
>> [1] 4
>> [1] 149.4704
>> [1] 5
>> [1] 290.3090
>> [1] 6
>> [1] 376.7017
>> [1] 7
>> [1] 435.7683
>> [1] 8
>> [1] 463.7404
>> [1] 9
>> [1] 497.7946
>> Error: cannot allocate vector of size 568.8 Mb
>>>
>>
>> I am running this on a Windows XP Pro machine with 4 Gb of memory. The same
>> problem occurs when the code is executed on the same box running Ubuntu
>> 8.04. Does anyone see any obvious reason why this should run out of memory?
>> I would be happy to email the data file to anyone who cares to try it on
>> their computer.
>>
>> Tom
>>
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Memory-Problems-with-a-Simple-Bootstrap-tp18777897p18777897.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>
-- 
View this message in context: http://www.nabble.com/Memory-Problems-with-a-Simple-Bootstrap-tp18777897p18779433.html
Sent from the R help mailing list archive at Nabble.com.

Received on Fri 01 Aug 2008 - 17:40:18 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 01 Aug 2008 - 19:33:03 GMT.