Re: [R] R usage for log analysis

From: Gabriel Diaz <gabidiaz_at_gmail.com>
Date: Tue 13 Jun 2006 - 18:17:40 EST

Hello

Thanks for the pointer. I probably saw a DBMS as necessary only because of my ignorance about R.

If I can process all the files the way you describe, I will not use a DBMS. I prefer to keep things simple and easy to manage and run, and the fewer software dependencies the better.
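
For reference, here is a rough sketch of the "process the pieces in a loop" step from the quoted message below. It is only an illustration: the quoted advice is to do the per-piece work in R, and this sketch swaps in grep for the counting step. The function name count_in_pieces, the pattern ERROR, and the piece names are made up for the example.

```shell
#!/bin/sh
# count_in_pieces PATTERN FILE...
# Sums the number of lines matching PATTERN over all the given pieces,
# handling one piece at a time so only a single piece is read at once.
# (Hypothetical helper, not part of the original thread.)
count_in_pieces() {
    pattern=$1
    shift
    total=0
    for piece in "$@"; do
        [ -f "$piece" ] || continue            # skip a glob that matched nothing
        n=$(grep -c -- "$pattern" "$piece")    # grep -c prints 0 when nothing matches
        total=$((total + n))
    done
    echo "$total"
}

# Hypothetical usage, with pieces named accessaa, accessab, ... by split:
# count_in_pieces 'ERROR' access??
```

The same loop shape should carry over to R: read one piece, reduce it to a small summary, discard the piece, and combine the summaries at the end.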

thanks

gabi

On 6/12/06, bogdan romocea <br44114@gmail.com> wrote:
> I wouldn't use a DBMS at all -- it is not necessary and I don't see
> what you would get in return. Instead I would split very large log
> files into a number of pieces so that each piece fits in memory (see
> below for an example), then process them in a loop. See the list and
> the documentation if you have questions about how to read text files,
> count strings etc.
>
> #---split big files in two---
> for F in *log                      # glob directly; parsing `ls` output breaks on odd names
> do
>   [ -f "$F" ] || continue          # nothing matched the glob
>   fn=${F%%.*}                      # name up to the first dot, used as the piece prefix
>   ln=$(wc -l < "$F")               # number of lines in the file
>   forsplit=$((ln / 2 + 50))        # no. of lines in each chunk, tweak as needed
>   echo "Splitting $F into pieces of $forsplit lines each..."
>   split -l "$forsplit" "$F" "$fn"  # pieces are named ${fn}aa, ${fn}ab, ...
> done
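
The script above always cuts each file in two. With the 2 to 4 GB dumps and the under-2 GB machines mentioned elsewhere in this thread, two pieces may still be too large, so a variant that takes the number of pieces as a parameter could look roughly like this. The function name lines_per_piece and the file name big.log are hypothetical, and the right piece count depends on how much memory R needs per line, which this sketch does not try to estimate.

```shell
#!/bin/sh
# lines_per_piece FILE N
# Prints how many lines each piece must hold so that FILE is covered by
# at most N pieces (ceiling division, so the last piece may be shorter).
# (Hypothetical helper, not part of the original thread.)
lines_per_piece() {
    file=$1
    pieces=$2
    total=$(wc -l < "$file")                   # line count only, no file name
    echo $(( (total + pieces - 1) / pieces ))
}

# Hypothetical usage: split big.log into at most 8 pieces named bigaa, bigab, ...
# split -l "$(lines_per_piece big.log 8)" big.log big
```

An empty file yields 0 here, which split is likely to reject as a line count, so a real script would want to skip empty files first.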
>
>
> > -----Original Message-----
> > From: r-help-bounces@stat.math.ethz.ch
> > [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Gabriel Diaz
> > Sent: Monday, June 12, 2006 9:52 AM
> > To: Jean-Luc Fontaine
> > Cc: r-help@stat.math.ethz.ch
> > Subject: Re: [R] R usage for log analysis
> >
> > Hello
> >
> > Thanks all for the answers.
> >
> > I have been looking through the project documentation, and it seems a
> > database is the way to go for log files in the GB range (normally
> > between 2 and 4 GB for each 15-day dump).
> >
> > This document, http://cran.r-project.org/doc/manuals/R-data.html,
> > says that R loads all the data into memory in order to process it when
> > using read.table and similar functions. Would the same happen when
> > using a database? Currently I have no machine with more than 2 GB of
> > memory.
> >
> > The moodss thing looks nice, thanks for the link. But what I have to
> > do now is an offline analysis of big log files :-). I will try to go
> > the mysql -> R way.
> >
> > gabi
> >
> >
> >
> > On 6/12/06, Jean-Luc Fontaine <jfontain@free.fr> wrote:
> > > -----BEGIN PGP SIGNED MESSAGE-----
> > > Hash: SHA1
> > >
> > > Allen S. Rout wrote:
> > > >
> > > >
> > > > Don't expect a warm welcome. This community is like all open-source
> > > > communities, sharply focused on its own concerns and expertise. And,
> > > > in an unusual experience for computer types, our core competencies
> > > > hold little or no sway here; they don't even give us much of a leg up.
> > > > Just wait 'til you want to do something nutso like produce a business
> > > > graphic. :)
> > > >
> > > > I'm working on understanding enough of R packaging and documentation
> > > > to begin a 'task view' focused on systems administration, for humble
> > > > submission. That might end up being mostly "log analysis"; the term
> > > > can describe much of what we do, if it's stretched a bit. I'm hoping
> > > > the task view will attract the teeming masses of sysadmins trapped in
> > > > the mire of Gnuplot and friends.
> > > Although not specifically solving the problem at hand, you might want
> > > to take a look at moodss and moomps (http://moodss.sourceforge.net/),
> > > modular monitoring applications, which use R
> > > (http://jfontain.free.fr/statistics.htm) and its log module
> > > (http://jfontain.free.fr/log/log.htm).
> > >
> > > - --
> > > Jean-Luc Fontaine http://jfontain.free.fr/
> > > -----BEGIN PGP SIGNATURE-----
> > > Version: GnuPG v1.4.3 (GNU/Linux)
> > > Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
> > >
> > > iD8DBQFEjT2ykG/MMvcT1qQRAuF6AJ9nf5phV/GMmCHPuc5bVyA+SoXqGACgnLuZ
> > > u1tZpFOTCHNKOfFLZOC9uXI=
> > > =V8yo
> > > -----END PGP SIGNATURE-----
> > >
> > > ______________________________________________
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> > >
> >
> >
>



Received on Tue Jun 13 18:26:37 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 13 Jun 2006 - 20:12:20 EST.
