Re: [Rd] Citation of R packages

From: John Maindonald
Date: Fri 10 Feb 2006 - 23:32:17 GMT

Even if a CITATION file is included, there remains the question of what to put in it.
Authorship of a book or paper is not always as simple a matter as it might appear, and with an R package it can be far from simple. We are, surely, trying to adapt a tool that was designed for different purposes.

  1. I'd like to see the definition of a new BibTeX entry type that has fields for additional author details and version number. There is surely some mechanism for getting agreement on a new entry type.
  2. In any case, the message for package maintainers is to include a CITATION file that reflects what they want to appear in any citation, perhaps with the output of citation("lattice") as a suitable model.
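
To make point 2 concrete, here is a minimal sketch of what such a CITATION file might look like. It assumes a current version of R in which bibentry() and person() from the utils package are available. The package name "mypkg", the people, the title, the year and the version are all invented placeholders, not taken from any real package. The note field is used to record who did what, which is one possible answer to the question raised for leaps further down the thread.

```r
# Hypothetical inst/CITATION for an invented package "mypkg".
# Every name, the title, the year and the version are placeholders.
citHeader("To cite package 'mypkg' in publications use:")

bibentry(
  bibtype = "Manual",
  title   = "mypkg: An Illustrative Package",
  # Both people are listed as authors ...
  author  = c(person("Jane", "Doe"), person("John", "Roe")),
  year    = "2006",
  # ... and the note says which part each contributed.
  note    = "R package version 0.1-0. Fortran code contributed by John Roe."
)
```

With a file like this installed as inst/CITATION, citation("mypkg") should report this entry rather than one generated from the DESCRIPTION file.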


John Maindonald
phone: +61 2 (6125)3473
fax: +61 2 (6125)5549
Mathematical Sciences Institute, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 11 Feb 2006, at 5:36 AM, wrote:

>>>>>> On Fri, 10 Feb 2006 21:01:44 +1100,
>>>>>> John Maindonald (JM) wrote:
> [...]
>> Where there is a published paper or a book (such as MASS), or a
>> manual for which a url can be given, my decision was to include
>> that in the main list of references, but not to include references
>> there that were references to the package itself, which as you
>> suggest below can be a reference to the concatenated help pages.
> The CITATION file of a package may contain as many entries as the
> author wants, including both a reference to the help pages and to the
> book (or whatever).
>> It seemed anyway useful to have a separate list of packages. For
>> consistency, these were always references to the package, with a
>> cross-reference to any relevant document in the references to papers.
>>>> (2) Maybe the author field should be more nuanced, or
>>>> maybe ...
>>> author fields of bibtex entries have a strict format (names separated
>>> by "and"), what do you mean by "more nuanced"?
>> Those named in the list of authors may be any combination of: the
>> authors of an R package, the authors of an original S version, the
>> person or persons responsible for an R port, the authors of the
>> Fortran code, compiler(s), and contributors of ideas.
>> For John Fox's car, citation() gives the following:
>> author = {John Fox. I am grateful to Douglas Bates and David
>> Firth and Michael Friendly and Gregor Gorjanc and Georges Monette and
>> Henric Nilsson and Brian Ripley and Sanford Weisberg and and Achim
>> Zeleis for various suggestions and contributions.},
>> For Rcmdr:
>> author = {John Fox and with contributions from Michael Ash and
>> Philippe Grosjean and Martin Maechler and Dan Putler and and Peter
>> Wolf.},
>> For car, maybe John Fox should be identified as author. For Rcmdr,
>> maybe the other persons that are named should be added?
>> For leaps:
>> author = {Thomas Lumley using Fortran code by Alan Miller},
>> It seems reasonable to cite Lumley and Miller as authors. Should
>> there be a note that identifies Miller as the contributor of the
>> Fortran code?
>> Should the name(s) of porters (usually from S) be included as
>> author(s)? Or should their contribution be acknowledged in the note
>> field? Or ...
>> Possibilities are to cite all those individuals as author, or to cite
>> John Fox only, with any combination of no additional information in
>> the note field, or using the note field to explain who did what. The
>> citation() function leaves it unclear who are to be acknowledged as
>> authors, and in fact
> Umm, the problem there is not the citation() function, but that the
> authors of all those packages obviously have not included a CITATION
> file in their package which overrides the default (extracted from the
> DESCRIPTION file).
> E.g., package flexclust has DESCRIPTION
> Package: flexclust
> Version: 0.8-1
> Date: 2006-01-11
> Author: Friedrich Leisch, parts based on code by Evgenia Dimitriadou
> but
> ****
> R> citation("flexclust")
> To cite package flexclust in publications use:
> Friedrich Leisch. A Toolbox for K-Centroids Cluster Analysis.
> Computational Statistics and Data Analysis, 2006. Accepted for
> publication.
> A BibTeX entry for LaTeX users is
> @Article{,
> author = {Friedrich Leisch},
> title = {A Toolbox for K-Centroids Cluster Analysis},
> journal = {Computational Statistics and Data Analysis},
> year = {2006},
> note = {Accepted for publication},
> }
> ****
> because the CITATION file overrides the DESCRIPTION file. Writing a
> CITATION file is of course also intended for those cases where a
> proper reference cannot be auto-generated from the DESCRIPTION file.
>>>> (3) In compiling a list of packages, name order seems
>>>> preferable, and one wants the title first (achieved by
>>>> relocating the format.title field in the manual FUNCTION
>>>> in the .bst file).
>>>> (4) manual seems not an ideal name for the class, if
>>>> there is no manual.
>>> A package always has a "reference manual": the concatenated help
>>> pages certainly qualify as such and can be downloaded in PDF format
>>> from CRAN. The ISBN rules even allow an ISBN to be assigned to the
>>> online help of a software package, which can also serve as the ISBN
>>> of the *software itself* (which we did for base R).
>> I'd prefer some consistency in the way that R packages are
>> referenced. Thus, if reference for one package is to the concatenated
>> help pages, do it that way for all of them.
> But we recommend that package authors should (try to) get their work
> into reviewed journals like JSS, JCGS, or CSDA, and then package
> authors usually prefer that the article gets cited. Unfortunately, many
> academic institutions value paper publications more highly than software.
> Citing the help pages is mainly intended as a substitute if no journal
> article is available.
> Best,
> Fritz

Received on Sat Feb 11 10:34:58 2006
