[R] R Citation rates

From: John Maindonald <john.maindonald_at_anu.edu.au>
Date: Tue, 12 Aug 2008 15:48:50 +1000


Following some discussion with Simon Blomberg, I've done a few Web of Science citation searches. A topic search for R LANG* STAT* seems to turn up most of the references to "R: A Language and Environment for Statistical Computing". "R Development Core Team" gets transformed into an astonishing variety of forms. Searching for citations of the 1996 Ihaka and Gentleman paper (most references up to and including 2004) turns up many fewer quirks.

What other forms of reference should be investigated?

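Anyway, here are the numbers by year (there may be some duplication). I&G is the Ihaka and Gentleman paper; RSTAT is the R LANG* STAT* topic search.

Year   RSTAT    I&G   Total
1998       -      4       4
1999       -     15      15
2000       -     17      17
2001       -     39      39
2002       -    119     119
2003       -    276     276
2004      68    455     523
2005     433    512     945
2006    1049    426    1475
2007    1605    410    2015
2008    1389    255    1644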

cit <- c("1998" = 4, "1999" = 15, "2000" = 17, "2001" = 39, "2002" = 119,
         "2003" = 276, "2004" = 523, "2005" = 945, "2006" = 1475,
         "2007" = 2015, "2008" = 1644)

(~4550 references to R LANG* STAT*; ~2530 to I&G)
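
The rounded totals can be checked by keeping the two series separate. This is only a sketch: the names `ig`, `rstat` and `total` are mine, and the zeros for the topic search before 2004 are an assumption, since the figures above list no RSTAT counts for those years.

```r
# Citations of the Ihaka & Gentleman (1996) paper, by year
ig <- c("1998" = 4, "1999" = 15, "2000" = 17, "2001" = 39, "2002" = 119,
        "2003" = 276, "2004" = 455, "2005" = 512, "2006" = 426,
        "2007" = 410, "2008" = 255)

# Hits for the topic search R LANG* STAT*.
# ASSUMPTION: zero before 2004, because no counts are listed for those years.
rstat <- c(rep(0, 6), 68, 433, 1049, 1605, 1389)
names(rstat) <- names(ig)

total <- ig + rstat   # reproduces the values stored in cit
sum(rstat)            # 4544, quoted as ~4550
sum(ig)               # 2528, quoted as ~2530
```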

On a rate per year basis, the 2008 figure scales up to 2691. This does not however allow for growth over the course of the year.
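
The scaling amounts to dividing the count to date by the fraction of the year covered. The fraction actually used is not stated, so the sketch below backs it out from the two quoted figures rather than guessing it; `annualize` is a hypothetical helper, not an existing function.

```r
# Hypothetical helper: scale a part-year count up to a full-year rate
annualize <- function(count, frac_elapsed) count / frac_elapsed

partial_2008 <- 1644   # 2008 count to date, from the figures above
scaled_2008  <- 2691   # full-year figure quoted in the text

# Fraction of the year implied by the two quoted figures
implied_frac <- partial_2008 / scaled_2008
round(implied_frac, 3)        # about 0.611, i.e. roughly 7.3 months of data

# Applying the helper with that fraction recovers the quoted figure
round(annualize(partial_2008, implied_frac))
```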

The number of references grew by 37% from 2006 to 2007. On current trends, the 2007-2008 increase seems likely to be much larger than that.
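
The year-on-year growth can be computed directly from the totals. A minimal sketch, restricted to complete years (the 2008 count is partial and is left out); the name `growth` is mine.

```r
# Total citations per complete year (2008 omitted because it is a part-year count)
cit_full <- c("1998" = 4, "1999" = 15, "2000" = 17, "2001" = 39, "2002" = 119,
              "2003" = 276, "2004" = 523, "2005" = 945, "2006" = 1475,
              "2007" = 2015)

# Percentage change from each year to the next; names are those of the later year
growth <- 100 * (cit_full[-1] / cit_full[-length(cit_full)] - 1)
round(growth)

round(growth[["2007"]])   # 2006 to 2007: 37
```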

The figures probably underestimate the contribution from Bioconductor-related work. A direct search for Bioconductor-related papers did not, however, turn up enough papers to make much difference to the numbers.

Here are some other summary figures, for graphing using whatever form of presentation appeals most (in each sum, the second number is for the I&G paper):

country <- c(usa=1540+903, germany=539+304, england=507+328,
             france=468+337, canada=345+147, australia=329+169,
             switzerland=279+121)

subj <- c(ecology=924+349, statsANDprob=488+270, geneticsANDheredity=488+279,
          envScience=298+119, CSapplications=269+108, zoology=267+111,
          plantSciences=250+108, biochemANDmolbio=229+200,
          mathANDcompBIO=224+143, biotechANDappliedmicrobiology=223+159,
          evolutionaryBIO=210+117)
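
One possible presentation, offered only as a sketch: keep the two components as columns rather than summing them, then draw a dot chart of the totals. The matrix layout and the column names `rstat` and `ig` are my own choices; the same approach applies to the subject-area figures.

```r
# Country counts as two columns: topic search (rstat) and I&G paper (ig)
country2 <- rbind(usa = c(1540, 903), germany = c(539, 304),
                  england = c(507, 328), france = c(468, 337),
                  canada = c(345, 147), australia = c(329, 169),
                  switzerland = c(279, 121))
colnames(country2) <- c("rstat", "ig")

total <- rowSums(country2)

# Cleveland dot chart; sorting ascending puts the largest total at the top
dotchart(sort(total), xlab = "Citations (R LANG* STAT* plus I&G)")

# Percentage of each country's total that comes from the I&G paper
round(100 * country2[, "ig"] / total)
```

Keeping the components separate makes it possible to compare the mix of the two citation forms across countries as well as the overall counts.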

There's a great deal more summary information that might be extracted. What is a good way, with readily available data, to standardize the country data?

Environmental Science no doubt comes up tops because it is a coarser grouping than many other areas.

John Maindonald
email: john.maindonald_at_anu.edu.au
phone: +61 2 (6125)3473
fax: +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Tue 12 Aug 2008 - 06:06:54 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 12 Aug 2008 - 06:33:27 GMT.