From: Marc Schwartz <MSchwartz_at_mn.rr.com>

Date: Tue 16 May 2006 - 23:28:59 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Tue May 16 23:33:24 2006

On Tue, 2006-05-16 at 09:45 +0200, Uwe Ligges wrote:

> Nameeta Lobo wrote:

>
> > Hello all
> >
> > thank you very much for all your suggestions. I actually need binary
> > representations. I tried all the methods that Marc,Jim and Charles have
> > suggested and they ran fine(thanks a lot). I tried doing it then with 26 and 13
> > and that's when the computer gave way. I just got a message with all three
> > methods that a vector of .....Kb cannot be allocated. guess I will have to
> > change the environment to allow for huge vector size allocation. How do I do that?
>
>
> You should have *at least* 512Mb in your machine for the solution given
> by Charles C. Berry with the numbers given above, better a machine with 1Gb.
>
> Uwe Ligges

In addition to Uwe's comment, there are some practical issues that will apply here shortly if Nameeta continues to increase the size of the source vector:

- R limits a vector to 2^31 - 1 elements. This is the same on both 32-bit and 64-bit platforms. Thus, if Nameeta plans to keep raising the upper limit of the range, that ceiling will be reached fairly quickly, and some form of partitioning approach would then be needed.
- The RAM required simply to apply Charles' solution will keep growing as the upper limit increases, so Uwe's figure covers the indicated example of 2^26 but will be insufficient beyond it.
- This still does not address Nameeta's now explicitly stated need for the binary character representations, which require additional memory beyond that needed for the initial step of identifying the numbers that meet the 'bit requirements'.

From my prior post over the weekend, storing the character matrix of
binary representations for 2^25 with 9 bits, which contained 2,042,975
values, required approximately 128 Mb for the final paste()'d
versions of the numbers.

That is AFTER the initial conversion using digitsBase(), which required 400 Mb to store the intermediate integer matrix. One could certainly do that step in a partitioned or loop-based approach to conserve memory, but it will still hit practical limits in short order.

Those figures too will expand dramatically as the upper limit increases.

For example, going from 2^24 with 12 bits to 2^26 with 13 bits takes the result from 2,704,156 values to 10,400,600 values. That is roughly a 3.8-fold increase in the size of the result vector. It does not take long to figure out how much memory these operations will require as the upper range increases.
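The counts quoted above agree with the binomial coefficient choose(n, k), the number of integers below 2^n with exactly k bits set, so the growth can be checked before allocating anything. The following is only a rough sketch: count_values() is a name made up for illustration, and the memory estimate simply scales the 128 Mb figure reported for the 2^25, 9 bit case linearly with the number of values, which is an assumption and not a measurement.

```r
# Hypothetical helper (not from the thread): how many integers below 2^n
# have exactly k bits set.
count_values <- function(n, k) choose(n, k)

n_small <- count_values(24, 12)   # 2704156
n_large <- count_values(26, 13)   # 10400600
growth  <- n_large / n_small      # a bit over 3.8

# ASSUMPTION: memory for the paste()'d strings grows linearly with the
# number of values, using the reported 128 Mb for the 2^25 / 9 bit case.
bytes_per_value <- 128 * 2^20 / count_values(25, 9)
est_mb <- n_large * bytes_per_value / 2^20

cat("values:", n_large, " growth:", round(growth, 2),
    " estimated Mb for strings:", round(est_mb), "\n")
```

The estimate ignores the intermediate digit matrix and any temporary copies, so actual peak usage would be expected to be higher.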

Depending upon what Nameeta is planning to do with the final resultant character vectors, one could consider a loop based print method/function that takes the values in the initial 'dec.index' vector and simply cat()'s them to some output. However, you would not be able to actually store them as a single matrix given the memory requirements.
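A minimal sketch of such a loop-based cat() approach follows. It is not the digitsBase() route used earlier in the thread: to_binary() and cat_binary() are hypothetical helpers written for illustration, and they build each string one bit position at a time so that no full digit matrix is ever held. Only one chunk of strings exists in memory at a time, and chunk_size is an arbitrary tuning value.

```r
# Hypothetical helper: binary string for each value in 'x', left-padded
# with zeros to 'width' digits (width must be at least 1).
to_binary <- function(x, width) {
  out <- character(length(x))
  for (b in (width - 1):0) {
    # Append the digit at bit position b (most significant first)
    out <- paste0(out, (x %/% 2^b) %% 2)
  }
  out
}

# Write the binary forms of 'dec.index' to 'con', one per line, in chunks.
cat_binary <- function(dec.index, width, con = stdout(), chunk_size = 10000L) {
  n <- length(dec.index)
  if (n == 0L) return(invisible(0L))
  for (start in seq(1L, n, by = chunk_size)) {
    end <- min(start + chunk_size - 1L, n)
    # cat() adds no trailing separator, so attach the newline to each string
    cat(paste0(to_binary(dec.index[start:end], width), "\n"),
        file = con, sep = "")
  }
  invisible(n)
}

# Small demonstration: the values below 2^4 with exactly 2 bits set.
dec.index <- c(3, 5, 6, 9, 10, 12)
cat_binary(dec.index, width = 4, chunk_size = 4)
```

Passing an open file connection as 'con' sends the output to disk instead of the console, which sidesteps holding the complete character result in R at all.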

Perhaps Nameeta can indicate what the primary problem is here, which might in turn allow someone to offer an alternative approach that is more resource sparing.

HTH,

Marc Schwartz
