Re: [R] Newbie: Using R to analyse Apache logs

From: Raj Mathur <raju_at_linux-delhi.org>
Date: Fri, 1 Feb 2008 08:45:24 +0530

Hi Kevin,

On Thursday 31 Jan 2008, Zembower, Kevin wrote:
> Raj,
>
> I've been experimenting with R to compute simple statistics from my web
> logs somewhat similar to what you're describing. For instance, I'm
> working on trying to classify a unique IP or domain name requestor as
> 'human' or 'robot' based on the number of seconds between requests for
> pages. I've found that the easiest method of work, given my (elementary)
> knowledge of R and my (professional) knowledge of perl, is to run my
> logs through a perl program to pre-process the data, before submitting
> it to R. The output of running my Apache web log through my perl program
> looks like this tab-delimited output:
> [snip]

Coincidentally, I was planning to write a Perl script before it struck me that R could probably do this job better. I'd be glad to have whatever work you've done so far; I'll see if I can tune it and try to get some help from my academic friends. If that doesn't work, *shrug* it's back to Perl :)
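
For anyone following the thread, the pre-processing step Kevin describes (seconds between requests per requestor, then a human/robot label) might look roughly like the sketch below. It is only an illustration in Python, not Kevin's Perl program and not R code: it assumes the log is in Apache Common or Combined Log Format, and the function names and the 2-second cutoff are made up for the example.

```python
import re
from collections import defaultdict
from datetime import datetime
from statistics import median

# Assumption: Common/Combined Log Format, i.e.
#   host ident user [31/Jan/2008:10:00:00 +0530] "GET /page HTTP/1.0" ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')

def gaps_by_host(lines):
    """Return {host: [seconds between consecutive requests]}."""
    times = defaultdict(list)
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not match the assumed format
        host, stamp = m.groups()
        times[host].append(datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"))
    gaps = {}
    for host, stamps in times.items():
        stamps.sort()
        gaps[host] = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    return gaps

def classify(gaps, threshold=2.0):
    """Label a host 'robot' if its median gap is under `threshold` seconds.

    The cutoff is a hypothetical value for illustration, not a tested rule.
    """
    labels = {}
    for host, g in gaps.items():
        if not g:
            labels[host] = "unknown"  # a single request gives no gap to measure
        else:
            labels[host] = "robot" if median(g) < threshold else "human"
    return labels
```

Calling classify(gaps_by_host(open("access.log"))) on a real log (the file name is a placeholder) would give one label per host; the per-host gap lists could equally be written out tab-delimited and handed to R for the actual statistics.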

Regards,


R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 01 Feb 2008 - 03:22:40 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 01 Feb 2008 - 03:30:09 GMT.

