Re: [Rd] R license for a derived data-only package

From: Simon Urbanek
Date: Fri, 16 Sep 2011 10:50:17 -0400

On Sep 16, 2011, at 10:32 AM, Michael Friendly wrote:

> I'm looking for guidance or advice about the R license to use in preparing a package containing the
> Baseball Database from
> My main purpose is to make it available to students in a course, and to develop it with others.
> I'd like to put it on R-Forge, and then perhaps make it public on CRAN.
> However, the page above bears a very restrictive copyright notice and limited license:
> This database is copyright 1996-2010 by Sean Lahman. A license is granted
> for individual use for research purposes. It may not be re-distributed
> without permission. Any commercial use, or other dissemination of the
> database in part or in whole is prohibited. Use of this database
> constitutes acceptance of these terms.
> I've written several times to the author asking permission for my intended wider use, but have
> received no reply.
> What makes this perplexing is that I am apparently free to "distribute" this by sending links
> in an email or posting them on a web page, so that others actually download them for
> personal use. The R package, however, would be considered a "derived work", I think,
> since it contains .RData files I created and .Rd documentation. Does the original
> limited license apply to this?

The way people have dealt with this in the past is to create a package that displays the license and downloads the data. The way I read it (I am not a lawyer, and the wording is very ambiguous), you cannot redistribute it in any form, not even in its original form, so the only way to obtain it is to download it from the site. This also implies that the conversion to .RData has to be done at (or after) install time from the download and cannot be done at build time.
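
A minimal sketch of such a package function in R follows. The name fetch_licensed_data and its arguments are hypothetical, the download URL and the license text are left to the caller, and the sketch assumes the download is a zip archive of CSV files. Nothing is bundled with the package: the data are fetched and converted on the user's machine only after the license has been shown and accepted.

```r
# Hypothetical helper (not from any existing package): shows the license,
# asks for acceptance, then downloads and converts the data locally.
fetch_licensed_data <- function(url, license_text, dest_dir = tempdir(),
                                accept = NULL) {
  message(license_text)
  if (is.null(accept)) {
    if (!interactive()) {
      stop("Run interactively or pass accept = TRUE to accept the license.")
    }
    answer <- readline("Do you accept these terms? [y/N] ")
    accept <- tolower(trimws(answer)) %in% c("y", "yes")
  }
  if (!isTRUE(accept)) {
    stop("License not accepted; nothing was downloaded.")
  }

  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  archive <- file.path(dest_dir, basename(url))
  utils::download.file(url, archive, mode = "wb")

  # Assumption: the download is a zip archive of CSV files.
  csv_dir <- file.path(dest_dir, "csv")
  utils::unzip(archive, exdir = csv_dir)
  csv_files <- list.files(csv_dir, pattern = "\\.csv$",
                          full.names = TRUE, recursive = TRUE)

  # Convert each table to .RData locally, i.e. after install, not at build time.
  env <- new.env()
  for (f in csv_files) {
    name <- tools::file_path_sans_ext(basename(f))
    assign(name, utils::read.csv(f, stringsAsFactors = FALSE), envir = env)
    save(list = name, envir = env,
         file = file.path(dest_dir, paste0(name, ".RData")))
  }
  invisible(csv_files)
}
```

A user would call this once after installing the package, passing the real download URL and the license text as it appears on the site. Whether this satisfies the license terms is still a matter of interpretation.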

This does not constitute legal advice; it is just my personal opinion.


> AFAICS, none of the R licenses described at:
> seem to cover this situation, although they seem to apply to the R package, not the
> data on which it is based.
> The TeX archive CTAN defines a wider range of licenses, including a bunch of non-free ones,
> but I don't know if any of these are acceptable in R packages (e.g., will pass R CMD check).
> I'd rather not have to consult a lawyer, so any guidance is welcome.
> --
> Michael Friendly Email: friendly AT yorku DOT ca
> Professor, Psychology Dept.
> York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
> 4700 Keele Street Web:
> Toronto, ONT M3J 1P3 CANADA

Received on Fri 16 Sep 2011 - 14:53:52 GMT


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 16 Sep 2011 - 15:50:31 GMT.
