[R] Summary: # of users of R, and biological examples of the use of R


Subject: [R] Summary: # of users of R, and biological examples of the use of R
From: Ramon Diaz-Uriarte (rdiazuri@students.wisc.edu)
Date: Sat 24 Jun 2000 - 20:13:20 EST


Message-Id: <00062414102000.00795@ligarto>

I got a lot of answers to the questions I posted; thank you very much to all
who responded. Since the two questions are different, I include here answers to
both, starting with the second question. The first question then generated some
more messages (as just "# of users of R"); I include those too.

Ramón

##############################################################
################# Biological examples of the use of R ################
##############################################################

Question:
> 2) Have/are any of you using R in papers in the biological sciences
> (especially evolutionary biology, ecology, behavior)?

Answers:

I have used it for a paper predicting defoliation by gypsy moths in the
Eastern US.

 -thomas

Thomas Lumley
Assistant Professor, Biostatistics
University of Washington, Seattle

*******************************

I'm working with a group of microbiologists on using it (for viral
evolution).

Papers won't be published for a bit, though, since it's only in the
beginning...

best,
-tony

-- 
A.J. Rossini                    Research Assistant Professor of Biostatistics 
Biostatistics/Univ. of Washington  (Th)  Box 357232   206-543-1044 (3286=fax)
Center for AIDS Research/HMC/UW    (M/F)         Box 359931   206-731-3647 (3693=fax)
VTN/SCHARP/FHCRC                  (Tu/W)         Box 358080   206-667-7025 (4812=fax)
rossini@(biostat.washington.edu|u.washington.edu|scharp.org)
http://www.biostat.washington.edu/~rossini
******************************************
Not yet because I have just been switching over from Splus for the past six
months or so.  But see my contributed package "maptree" on CRAN and references
for the datasets for examples of published work using Splus versions of what I
am now doing in R.

> Thanks, > > Ramon >

Denis White, US EPA, 200 SW 35th St, Corvallis, Oregon, 97333 USA
voice: 541.754.4476, fax: 541.754.4716, email: white.denis@epa.gov
web: www.epa.gov/wed/pages/staff/white/

**********************************

Hang on... Well, I've started using R. There is also this paper:
Tufto et al. (2000) Bayesian meta-analysis of demographic parameters in three small, temperate passerines. Oikos 88: 273-88.

This uses BUGS, and they credit R at the end (presumably they use it for processing the output). I'm not aware of other people using it, but it's not the sort of thing one normally discusses at conferences.

Bob

--
Bob O'Hara
Metapopulation Research Group
Division of Population Biology
Department of Ecology and Systematics
PO Box 17 (Arkadiankatu 7)
FIN-00014 University of Helsinki
Finland

*********************

I am using R for ecological and management work on freshwater fish populations. My uses of R include model fitting, maximum likelihood estimation, Bayesian analysis and other "stuff".

I am a post-doctoral fellow at the Fisheries Centre, University of British Columbia.

cheers,
Andy

From: "Andrew J. Paul" <ajpaul@ucalgary.ca>
To: rdiazuri@students.wisc.edu

*****

See Natalie Roberts' talk. Very good presentation, but when I asked she said she could not share the data from her talk.

http://www.stat.Berkeley.EDU/users/terry/zarray/Html/Rintro.html

steve

From: Stephen Arthur <sarthur@protogene.com>

*****************

I have a paper accepted for print in Ecology with R macros to fit the models, and a few other manuscripts under processing with more casual application of R for statistics.

cheers,
jari oksanen

From: Jari Oksanen <jhoksane@ecology.helsinki.fi>
To: rdiazuri@students.wisc.edu

***********************

I am starting to use R for plant morphometrics (mostly multivariate analysis, plotting). Regards!

++++++++++++++++++++++++++++++++++
] Zdenek Skala
] e-mail:skala@incoma.cz
] fax:++420-2-67311401
] address:
] Nedvezska 2232
] CZ-10000 Praha 10
] Czech Republic

****************************************

Take a look at my web page www.luc.ac.be/~jlindsey/publications.html especially the recent paper in Applied Statistics on overdispersion which analyzes black grouse data.

Jim

From: Jim Lindsey <james.lindsey@luc.ac.be>
To: rdiazuri@students.wisc.edu

***********************

I have been using R to analyse the output of MLE and McMC algorithms. Most of the time I use coda and locfit but have also used many of the R functions that produce contour plots, etc. My field of research is population genetics and evolution and I know at least five or six colleagues here in the UK (U. of London, U. of Reading, etc.) who also use it. Also, I assume that most of the people working on coalescent theory and McMC methods are using it.

Cheers,

Oscar

Oscar E. Gaggiotti
University of Cambridge, Department of Zoology
Downing Street, Cambridge CB2 3EJ, United Kingdom
Tel. +44 (0)1223 762934
Fax +44 (0)1223 336676
E-mail: oeg20@cam.ac.uk

********************************

I have. I only use R now for stats and for graphs too (except for drawings, or more complex layouts). I guess you will be interested in a paper I've just submitted to Am Nat on using GEEs for comparative analysis: I can send you a copy of the MS if you wish. I have also a paper on density dependence in bird populations where I have done some GLMs and modelling of variance (thanks to R), it is ready to be sent to J Anim Ecol.

... and I forgot to mention that I am writing a review for TREE on "Advances in statistical modelling of variability and heterogeneity"; this will mention R of course.

Emmanuel Paradis <paradis@isem.univ-montp2.fr>

**********************

One psychologist using R in the analysis of behavioral data. I hope that you don't get too many emails like this...

Jim

From: Jim Lemon <bitwrit@ozemail.com.au>

************************

I've been using R for my recent work. Of the last three papers I've submitted this year, two contained statistical analyses, and one used R (a lot).

I expect to submit another two this year; both make heavy use of R. One of these papers is the reason I started using R. I could think of no other way to take thousands of datasets produced by Monte Carlo simulation, analyse each with an ANCOVA, and save all the p-values for further analysis.

I'm also designing the computer lab exercises for a new, required, undergraduate biostatistics course - it will all be in R.

Peter L. Hurd, Ph.D.
phurd@uts.cc.utexas.edu
http://www.zo.utexas.edu/research/phurd
fax 512.471-3878
Section of Integrative Biology, University of Texas, Austin TX 78712 USA

##############################################################
################# # of Users of R ################
##############################################################

Question:
> 1) Does anybody have any rough idea of how many people might be using R, or
> how many people have downloaded R, or similar (I am aware answering this
> question might require divinatory powers...)?

Answers:

******************

I don't even know how many people are using it in my department. You can probably find out how many people subscribe to the mailing lists, which gives a lower bound.

Thomas Lumley

****************

Without having such powers, I can report what the mailing lists look like:

  % cat r-announce r-help r-devel |sort|uniq|wc -l
  911

(w/o sort|uniq it's 1140); r-help alone has 635

which indicates that 911 different e-mail addresses are subscribed to the R mailing lists (very few of these are mailing lists; however, also quite a few will be pointing to the same person)
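
The counting step above can be sketched in Python as well. This is only an illustration of the same idea as the `cat ... |sort|uniq|wc -l` pipeline: the function name and the sample addresses are made up, and the real subscriber files are not reproduced here.

```python
def count_subscribers(*subscriber_lists):
    """Return (total_entries, unique_addresses) over several subscriber lists.

    Mirrors `cat lists | wc -l` and `cat lists | sort | uniq | wc -l`,
    except that blank entries are skipped (an assumption of this sketch).
    """
    entries = [
        address.strip()
        for subscriber_list in subscriber_lists
        for address in subscriber_list
        if address.strip()
    ]
    return len(entries), len(set(entries))


# Hypothetical sample data, not the real list contents.
r_announce = ["ann@example.org", "bob@example.org"]
r_help = ["bob@example.org", "carol@example.org", "dave@example.org"]
r_devel = ["carol@example.org"]

total, unique = count_subscribers(r_announce, r_help, r_devel)
print(total, unique)  # 6 entries, 4 distinct addresses
```

As with the shell pipeline, this counts distinct addresses, not distinct people: one person subscribed under two addresses is still counted twice.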

Now, the mailing lists probably contain almost no undergraduate students, and these *are* using R at least in many of the courses...

Further, graduate students and scientific staff in many organizations use R, but only the more "aficionados" among them are subscribed to an R list. [factor of 3 ?]

Other guesses?

[Then what would "uses R" mean at all?
   o >= 1 hour per week ?
   o (one of) your major tool(s) for statistical data analysis?
   o ??
]

---- {Now some musings, don't take me too seriously: I'm tired, it's hot, ....}

At one moment in time I had dreamed of putting a "feature" into R which `registered' a user automatically when using R for the first time (by sending an e-mail to some "R counter" on the internet); but even if for a good purpose, it feels too much like "Big Brother" and "Virus/Worm"like behavior.. (and looks painful for me as MS-ignorant to work on a home Win-PC which only occasionally connects to the Net).

Linux has (had) an optional Linux Users counter, via nice web interface; however I think it had never reached a state where it counted more than a tiny fraction of users....

Martin Maechler

********************

Martin Maechler <maechler@stat.math.ethz.ch> writes:

> which indicates that 911 different e-mail addresses are subscribed to
> the R mailing lists (very few of these are mailing lists;
> however, also quite a few will be pointing to the same
> person)
>
> Now, the mailing lists probably contain almost no undergraduate students,
> and these *are* using R at least in many of the courses...
>
> Further, graduate students and scientific staff in many organizations use R,
> but only the more "aficionados" among them are subscribed to an R list.
> [factor of 3 ?]
>
> Other guesses?
>

Robert once guesstimated 10000 users, which would mean that roughly one in 10 signs up for mailing lists. That could well be the case, the mailing list traffic seems comparable to early days of s-news.

> [Then what would "uses R" mean at all?
>   o >= 1 hour per week ?
>   o (one of) your major tool(s) for statistical data analysis?
>   o ??
> ]

<We could also count the "sold items" since we're on both SuSE and RedHat CDs (and Debian but do their sales get counted?). Of course one thing is buying a program another is using it, but hey!, has that ever stopped others?>

> Linux has (had) an optional Linux Users counter, via nice web interface;
> however I think it had never reached a state where it counted more than a
> tiny fraction of users....

Estimated 1% it seems. It's still there (and has me as #4115 out of 148315) at counter.li.org.

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907

********************

On 20 Jun 2000, Peter Dalgaard BSA wrote:

> Martin Maechler <maechler@stat.math.ethz.ch> writes:
>
> > which indicates that 911 different e-mail addresses are subscribed to
> > the R mailing lists (very few of these are mailing lists;
> > however, also quite a few will be pointing to the same
> > person)
> >
> > Now, the mailing lists probably contain almost no undergraduate students,
> > and these *are* using R at least in many of the courses...
> >
> > Further, graduate students and scientific staff in many organizations use R,
> > but only the more "aficionados" among them are subscribed to an R list.
> > [factor of 3 ?]
> >
> > Other guesses?
> >
>
> Robert once guesstimated 10000 users, which would mean that roughly
> one in 10 signs up for mailing lists. That could well be the case, the
> mailing list traffic seems comparable to early days of s-news.

Sounds plausible, and the MathSoft estimated ratio is more like 30 (but then their user base will have a more hierarchical support structure). As I don't know who here is signed up to R-help (but I do for S-news) I am guessing a bit, but I'd say we had a 20:1 ratio (at "uses R for several hours per year") in my dept for each of R and S.

--
Brian D. Ripley, ripley@stats.ox.ac.uk

**********************

On Tue, Jun 20, 2000 at 06:41:39PM +0200, Martin Maechler wrote:

> which indicates that 911 different e-mail addresses are subscribed to
> the R mailing lists (very few of these are mailing lists;
> however, also quite a few will be pointing to the same
> person)
>
> Now, the mailing lists probably contain almost no undergraduate students,
> and these *are* using R at least in many of the courses...
>
> Further, graduate students and scientific staff in many organizations use R,
> but only the more "aficionados" among them are subscribed to an R list.
> [factor of 3 ?]
>
> Other guesses?

My 0.01Euro.

I checked a moment ago. From my department, there are 2 subscriptions to r-announce and 1 (me) to r-help and r-devel. Thinking about the users, my estimates are:
 (a) staff and post-graduate students: about 20
 (b) under-graduate students: about 300 (many of these (50 to 100?) can be regarded as 'regular users', e.g., R installed on their home computer and used not only for the courses for which R is required).

Looking at the subscriptions from other Stats. departments in Italy, I suspect that our ratio between subscribers and users is one of the highest.

guido

From: Guido Masarotto <guido@sirio.stat.unipd.it>

**************************

OK - my 0.01UKP's worth ...

As statisticians we should above all be able to make a good estimate (with confidence intervals) of the R use. I see three methods for sizing:

1 Taking a sample from the list, which will itself be biased (probably self-selected and therefore biased further) and asking these people to estimate/count how many users there are,

2 In a future version, storing automatically in /usr/lib/R or wherever a list of the users the first time they use R (ie when .R is set up). The number of unique entries on this list can then be requested via an email to the installer's email address (requested at download time).

3 A snowball sample, starting with the present 911 list members, where people indicate (a) their applications area, (b) platform, (c...y) other information and (z) nominate other users by email address, returning this to a dedicated list address. The information is processed automatically, checked against already known addresses and any new addresses emailed with the questionnaire.

Method (1) seems to be under discussion at the moment, method (2) would only asymptote as people downloaded the new versions but is simplest and method (3) could provide useful further information re expert users etc since it would be from the actual users rather than possibly the sysadmin person.
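
Method (3) can be sketched as a simple queue-based procedure. What follows is only an illustration under stated assumptions: `snowball_sample` and `get_nominations` are hypothetical names, the addresses are invented, and the questionnaire itself (application area, platform and so on) is reduced to "return the addresses this person nominates".

```python
from collections import deque


def snowball_sample(seed_addresses, get_nominations):
    """Return every address reached by following nominations from the seeds.

    get_nominations(address) stands in for sending the questionnaire and
    reading back the list of users that this person nominates.
    """
    known = set(seed_addresses)
    to_contact = deque(sorted(known))
    while to_contact:
        address = to_contact.popleft()
        for nominee in get_nominations(address):
            # Check against already known addresses; only new ones
            # are queued to receive the questionnaire.
            if nominee not in known:
                known.add(nominee)
                to_contact.append(nominee)
    return known


# Hypothetical nominations, standing in for returned questionnaires.
nominations = {
    "a@example.org": ["b@example.org", "c@example.org"],
    "c@example.org": ["d@example.org", "a@example.org"],
}
seeds = ["a@example.org", "b@example.org"]
reached = snowball_sample(seeds, lambda addr: nominations.get(addr, []))
print(len(reached))  # 4 addresses: the 2 seeds plus c and d
```

The size of `reached` counts only users connected, through some chain of nominations, to the starting list, so it remains a lower bound on the number of users.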

This is not only of academic interest but could be used if necessary when looking for commercial sponsors, equipment, grants etc. The email addresses should of course be kept private - we wouldn't want them escaping into Outlook Express or a commercial list.

John

From: j.logsdon@lancaster.ac.uk

***********************

On Wed, 21 Jun 2000 j.logsdon@lancaster.ac.uk wrote:

> Useful info from Dirk but this measures number of R Linux installations,
> not users. If 100K is an approximate number of R Linux installations,
> then the likely number of Linux R users could be somewhat more. In my
> case, I have one installation and one user since only I use it but I am
> sure that there are many cases with 2, 10, 100 etc users per installation.
[...]

I guess that there also are some cases with several installations per user; I have seven installations; one sparc-linux, three i386-linux, and three Windows! On only five machines, though.

Göran

From: gb <gb@stat.umu.se>

******************

OK, we're statisticians so let's use some real data and not only guestimates ... in

http://www.ci.tuwien.ac.at/~leisch/cran-http.report/

you find some usage statistics about the CRAN *master* site (with all traffic inside our domain removed). Beware that not every hit is a potential user as search engines (``crawlers'') heavily bias the log files. It's also only data on our server, no cran.(ch|dk|uk|us|...) or statlib (with its own mirrors). Also, obviously, all people using the version from their Linux distribution are missing.

But it's something to play with :-)

Have fun,
Fritz

From: Friedrich Leisch <Friedrich.Leisch@ci.tuwien.ac.at>

***********************

On Wed, 21 Jun 2000 j.logsdon@lancaster.ac.uk wrote:

> Useful info from Dirk but this measures number of R Linux installations,
> not users. If 100K is an approximate number of R Linux installations,
> then the likely number of Linux R users could be somewhat more. In my
> case, I have one installation and one user since only I use it but I am
> sure that there are many cases with 2, 10, 100 etc users per installation.

Also cases with fewer than 1 user, ie "Hmm. R looks like an interesting package. Maybe I should install it in case I need to do some statistics some time." or "We're a math department, we want all of the mathematical packages" or even people who install everything.

Obviously that last group will be overrepresented in the Debian popularity contest.

-thomas lumley

***********************

Just another random datapoint: Debian has a little known package called popularity-contest. When this optional package is present, and if the survey participation is enabled, a list of installed packages is emailed out. [1] The aggregated results are on http://www.debian.org/~apenwarr/popcon/

As of last night, it reported 735 participating hosts. Of these, 35 used R. (Select the 'math' section to find the r-base package.)

With a few brave assumptions, [2] we get a guestimate of 107,000 R users on Linux alone. [3]

Dirk

[1] This is as anonymous as it can be given the constraints. A random md5sum hash is used to distinguish between the participating hosts and mail-headers are dropped as soon as possible.

[2] Let's assume that the ratio estimate is not biased and that Debian has 15% of the Linux installations, which themselves now stand at 15 million users.

[3] Mind the ~/.signature taken from fortune(1). I did say brave assumptions.

From: Dirk Eddelbuettel <edd@debian.org>
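
The arithmetic behind Dirk's guestimate can be written out as a short Python sketch. The inputs are the figures quoted in his message (35 of 735 hosts, a 15% Debian share, 15 million Linux users); the function and parameter names are made up for illustration, and the result inherits every one of his "brave assumptions".

```python
def estimate_r_users_on_linux(hosts_with_r, participating_hosts,
                              debian_share, linux_users):
    """Scale the popularity-contest ratio up to all Linux users.

    Assumes the survey ratio is unbiased and that the estimated Debian
    user count is representative of Linux as a whole (footnote [2]).
    """
    r_fraction = hosts_with_r / participating_hosts   # 35 / 735
    debian_users = debian_share * linux_users         # 15% of 15 million
    return r_fraction * debian_users


estimate = estimate_r_users_on_linux(35, 735, 0.15, 15_000_000)
print(round(estimate))  # about 107,000
```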

*************************

Re: [R] # of users of R
Date: Thu, 22 Jun 2000 15:02:23 +0200
From: Emmanuel Paradis <paradis@isem.univ-montp2.fr>
To: r-help@stat.math.ethz.ch
Cc: cassaing@isem.isem.univ-montp2.fr, bentaleb@isem.isem.univ-montp2.fr,
    omoine@isem.isem.univ-montp2.fr, orth@isem.isem.univ-montp2.fr,
    vdebat@isem.isem.univ-montp2.fr, claude@isem.isem.univ-montp2.fr

At 17:51 21/06/00 +0200, Friedrich Leisch wrote:
>
>OK, we're statisticians so let's use some real data and not only
>guestimates ... in
>
>    http://www.ci.tuwien.ac.at/~leisch/cran-http.report/
>
>you find some usage statistics about the CRAN *master* site (with all
>traffic inside our domain removed). Beware that not every hit is a
>potential user as search engines (``crawlers'') heavily bias the log
>files. It's also only data on our server, no cran.(ch|dk|uk|us|...) or
>statlib (with its own mirrors). Also obviously all people using the
>version from their linux distribution are missing.
>
>But it's something to play with :-)
>
>Have fun,
>Fritz

Looking at the data for March 2000 (coinciding with the release of R 1.0.0), there were 1800 hits on /bin/windows, and 732 hits on /bin/linux. I assume that ALL users of R on Linux visit /bin/linux (at least out of curiosity to check what binaries are available) even if they use the sources (note that the same assumption must hold for Windows users).

So, 1800/732 estimates the ratio of the number of R users under Windows to the number of R users under Linux. Taking Dirk's estimate of 107,000 R users under Linux, I get 263,000 R users under Windows.

This still leaves open the issue of the number of users who just compile from the sources under other OSs... perhaps a few tens of thousands??? This would lead to ca. 400,000 users of R.

Emmanuel Paradis

***************************
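
The extrapolation in the message above can be reproduced with a few lines of Python. The hit counts and the 107,000 Linux figure are taken from the messages quoted here; the variable names are illustrative only, and the number of users on other OSs is deliberately left out because the message gives no figure beyond "a few tens of thousands".

```python
# Figures quoted in the thread (March 2000 CRAN master site hits).
windows_hits = 1800
linux_hits = 732
linux_users = 107_000  # Dirk's estimate, itself resting on brave assumptions

# Assume the hit ratio equals the ratio of Windows users to Linux users.
windows_users = linux_users * windows_hits / linux_hits
linux_plus_windows = linux_users + windows_users

print(round(windows_users, -3))       # about 263,000
print(round(linux_plus_windows, -3))  # about 370,000
```

The gap between roughly 370,000 and the quoted ca. 400,000 is the allowance for users who compile from sources on other OSs.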

--
Ramón Díaz-Uriarte
Dept. Zoology and Statistics
University of Wisconsin-Madison
Madison, WI 53706-1381

email: rdiazuri@students.wisc.edu
(NOTE: starting 15-July-2000 new email: ramon-diaz@teleline.es)
phone: 608-238-8041
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



This archive was generated by hypermail 2b25 : Mon 17 Jul 2000 - 12:33:24 EST