Re: [Rd] CRAN Server download statistics (Was: R Usage Statistics)

From: hadley wickham <h.wickham_at_gmail.com>
Date: Mon, 23 Nov 2009 08:48:01 -0600

> Knowing what percentage of different OSes are being used is of
> interest to package developers and would be obscured by the proposal
> to massage the data.  I prefer to see the raw figure as is.

I agree. I was arguing that sorting by that value wasn't very useful.

> Also the number of IPs is important and should not be removed in my
> opinion since (1) it is a measure of clustering.  If a package is
> mainly used by the courses of a few universities where the students
> really have no choice then that seems a lot different than if it's used
> by a variety of people around the world.  Only the IPs would give any
> clue to that.  (2) it helps to diagnose intentional distortion of the
> figures by repeat downloads to the same machine.

There is no way to tease apart (1) and (2), plus many ADSL providers share an IP across multiple subscribers. The number of unique IPs may still be useful, but it needs to be used with caution.
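
To make the two counts concrete, here is a minimal sketch in Python. It assumes a hypothetical log of (package, IP) pairs; the data and the `summarise` helper are invented for illustration and do not reflect the actual CRAN server log format.

```python
from collections import defaultdict

# Hypothetical download log: (package, client IP) pairs. Not real CRAN data.
log = [
    ("pkgA", "10.0.0.1"), ("pkgA", "10.0.0.1"), ("pkgA", "10.0.0.1"),
    ("pkgA", "10.0.0.2"),
    ("pkgB", "10.0.0.3"), ("pkgB", "10.0.0.4"), ("pkgB", "10.0.0.5"),
]

def summarise(records):
    """Return {package: (downloads, unique_ips, downloads_per_ip)}."""
    downloads = defaultdict(int)   # total download count per package
    ips = defaultdict(set)         # distinct client IPs seen per package
    for package, ip in records:
        downloads[package] += 1
        ips[package].add(ip)
    return {
        pkg: (downloads[pkg], len(ips[pkg]), downloads[pkg] / len(ips[pkg]))
        for pkg in downloads
    }

summary = summarise(log)
for pkg, (n, n_ip, ratio) in sorted(summary.items()):
    print(f"{pkg}: {n} downloads, {n_ip} unique IPs, {ratio:.1f} per IP")
```

A high downloads-per-IP figure is equally consistent with a classroom behind one address, an ISP sharing one address across subscribers, or deliberate repeat downloads; the counts alone cannot say which.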

> The one problem with sparkline graphs is that it would take a lot
> longer for the page to load.  There already is a time series if you
> click on the package name.

Is it a time series? It looks like a bar chart of downloads per day of week to me.

Hadley

-- 
http://had.co.nz/

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Mon 23 Nov 2009 - 15:06:08 GMT
