Re: [Rd] CRAN Server download statistics (Was: R Usage Statistics)

From: Gabor Grothendieck <>
Date: Mon, 23 Nov 2009 09:51:11 -0500

On Mon, Nov 23, 2009 at 9:48 AM, hadley wickham <> wrote:
>> Knowing what percentage of different OSes are being used is of
>> interest to package developers and would be obscured by the proposal
>> to massage the data.  I prefer to see the raw figure as is.
> I agree.  I was arguing that sorting by that value wasn't very useful.
>> Also the number of IPs is important and should not be removed in my
>> opinion since (1) it is a measure of clustering.  If a package is
>> mainly used by the courses of a few universities where the students
>> really have no choice then that seems a lot different than if it's used
>> by a variety of people around the world.  Only the IPs would give any
>> clue to that.  (2) it helps to diagnose intentional distortion of the
>> figures by repeat downloads to the same machine.

> There is no way to tease apart (1) and (2), plus many ADSL providers
> share an IP across multiple subscribers.  Number of unique IPs may
> still be useful, but it needs to be used with caution.
>> The one problem with sparkline graphs is that it would take a lot
>> longer for the page to load.  There already is a time series if you
>> click on the package name.

> Is it a time series?  It looks like a bar chart of downloads per day
> of week to me.

A time series is a function of time regardless of representation.

Received on Mon 23 Nov 2009 - 14:58:16 GMT
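
For readers following the time series versus bar chart exchange, here is a minimal sketch of the two summaries being discussed. It is not taken from the CRAN statistics page: the download dates are made up, and the helper names `daily_series` and `by_weekday` are hypothetical. The first keeps one value per calendar date in time order; the second pools every date that falls on the same day of the week.

```python
from collections import Counter
from datetime import date

# Hypothetical download log: one date per download event (made-up data).
downloads = [
    date(2009, 11, 16), date(2009, 11, 16), date(2009, 11, 17),
    date(2009, 11, 18), date(2009, 11, 23), date(2009, 11, 23),
    date(2009, 11, 23), date(2009, 11, 24),
]


def daily_series(events):
    """Downloads per calendar date, in time order (one value per date)."""
    counts = Counter(events)
    return sorted(counts.items())


def by_weekday(events):
    """Downloads per day of week; dates sharing a weekday are pooled."""
    names = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
    counts = Counter(d.weekday() for d in events)  # Monday is 0
    return [(names[i], counts.get(i, 0)) for i in range(7)]


if __name__ == "__main__":
    for day, n in daily_series(downloads):
        print(day.isoformat(), n)
    for name, n in by_weekday(downloads):
        print(name, n)
```

In this made-up log the two Mondays (16 and 23 November) stay separate in the daily series but are added together in the day-of-week summary, so the second view always has at most seven values however long the log runs.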
