[R] ALARM!!!! Re: regarding large csv file import

From: <gyadav_at_ccilindia.co.in>
Date: Sat 28 Oct 2006 - 03:36:00 GMT

hi Jim,

If I partition the file, then later operations such as merging the partitioned files and then analysing the whole data set would again require the same amount of memory. If I cannot do that because I do not have enough memory, then I feel the issue of memory handling deserves serious thought.
Hence I am also copying this to the r-devel list, and I would like to contribute and write code for the memory handling issue. I would like to address this request to the great coders of R: the software should be able to run in any amount of memory (above some minimum threshold...bingo). I invite all the great coders to please address this issue, and if I can be helpful in any way, I am right here.

thanks
with regards
-gaurav

"jim holtman" <jholtman@gmail.com>

27-10-06 09:09 PM

To
"gyadav@ccilindia.co.in" <gyadav@ccilindia.co.in>
cc

Subject
Re: [R] regarding large csv file import

Is the file only numeric, or does it also contain characters? You will get better performance by either using 'scan', or specifying the type of each column with 'colClasses' so that read.csv does not have to guess at the types.
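
As a minimal sketch of both suggestions: the small temporary file and the column names `a` and `b` below are made up for illustration and stand in for the real 272 MB file, which is assumed here to be purely numeric.

```r
# Hypothetical small file standing in for the real CSV
path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = c(1.5, 2.5, 3.5), b = c(10, 20, 30)),
          path, row.names = FALSE)

# Option 1: declare the column types so read.csv does not have to guess them
df <- read.csv(path, header = TRUE, colClasses = c("numeric", "numeric"))

# Option 2: for purely numeric data, scan() reads directly into a numeric
# vector; skip = 1 skips the header line
vals <- scan(path, what = numeric(), sep = ",", skip = 1, quiet = TRUE)

# Reshape the flat vector into rows and columns (2 columns in this toy file)
m <- matrix(vals, ncol = 2, byrow = TRUE)
```

With a real file, the length of `colClasses` and the `ncol` value would have to match the actual number of columns.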

You will probably need more memory depending on the type of data. If I assume that it is numeric and that it takes about 6 characters to specify a number, then you have approximately 45M numbers in the file, and at 8 bytes per number this will take up about 362MB for a single object. You should have at least 3X the size of the largest object to do any processing since copies will have to be made.
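
The estimate can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the 6 characters per number is the assumption stated above, the file size is taken as 272 million bytes, and the variable names are made up for illustration.

```r
file_bytes    <- 272e6  # file size from the original post, taken as decimal MB
chars_per_num <- 6      # assumed average characters per number in the file
bytes_per_num <- 8      # R stores a numeric value as an 8-byte double

n_numbers  <- file_bytes / chars_per_num       # about 45 million numbers
object_mb  <- n_numbers * bytes_per_num / 1e6  # a little over 360 MB
working_mb <- 3 * object_mb                    # roughly 1.1 GB of working room
```

Under these assumptions the single object alone takes most of the 512 MB of RAM, and the 3X working room is about twice what the machine has.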

I would suggest partitioning the file and processing in parts. You can also put it in a database and 'sample' the rows that you want to process.  
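
One way to process in parts without ever holding the whole file is to read a block of lines at a time from an open connection and keep only running totals. The sketch below is an illustration under assumptions: the toy file, the column names `x` and `y`, and the chunk size of 4 rows are all made up, and it only works for statistics that can be accumulated piece by piece (sums, counts, means), not for every analysis.

```r
# Hypothetical small file standing in for the large CSV
path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:10, y = (1:10) * 2), path,
          row.names = FALSE, quote = FALSE)

con <- file(path, open = "r")
col_names <- strsplit(readLines(con, n = 1), ",")[[1]]  # consume header line

chunk_rows <- 4  # rows per part; a real run would use a much larger value
total <- 0       # running sum of column y
n <- 0           # running row count

repeat {
  lines <- readLines(con, n = chunk_rows)
  if (length(lines) == 0) break  # end of file reached
  chunk <- read.csv(text = lines, header = FALSE,
                    col.names = col_names, colClasses = "numeric")
  total <- total + sum(chunk$y)
  n <- n + nrow(chunk)
}
close(con)

mean_y <- total / n  # mean of y computed without loading the whole file
```

Only one chunk is in memory at any time, so the memory needed depends on `chunk_rows` rather than on the size of the file.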

On 10/27/06, gyadav@ccilindia.co.in <gyadav@ccilindia.co.in> wrote:

hi All,

I have a .csv file of size 272 MB and 512 MB of RAM, and I am working on Windows XP. I am not able to import the csv file.
R hangs, meaning it stops responding; even SciViews hangs. I am using read.csv(FILENAME,sep=",",header=TRUE). Is there any way to import it?
I have already searched the archives but could not make much sense of what I found.

thanks in advance

  Sayonara With Smile & With Warm Regards :-)

G a u r a v Y a d a v
Assistant Manager,
Economic Research & Surveillance Department, Clearing Corporation Of India Limited.

Address: 5th, 6th, 7th Floor, Trade Wing 'C', Kamala City, S.B. Marg, Mumbai - 400 013
Telephone(Office): - +91 022 6663 9398 , Mobile(Personal) (0)9821286118 Email(Office) :- gyadav@ccilindia.co.in , Email(Personal) :- emailtogauravyadav@gmail.com


DISCLAIMER AND CONFIDENTIALITY CAUTION:\ \ This message and ...{{dropped}}



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve? 


Received on Sat Oct 28 13:41:27 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 28 Oct 2006 - 07:30:14 GMT.
