[Rd] R datasets ownership(copyright) and license

From: Yaroslav Halchenko <yarikoptic_at_gmail.com>
Date: Mon, 02 Apr 2012 18:06:14 -0400

Dear R Developers,

The recently filed (and dismissed ;) ) lawsuit by Astrolabe against the tz database developers drew a lot of media attention and discussion, and set some kind of precedent in the USA [3]. But IMHO it also showed that similar attacks might happen in the future, possibly against datasets that are not so obviously "factual" and thus might after all fall under copyright or IP protection, if not in the States then in some other jurisdiction.

Since the 'data copyright/license' question comes up over and over again, I wanted to ask which policies or advisories guided the selection of the datasets shipped with R. From a very brief look at the datasets, many of them appear to be factual data, and thus, at least at the moment, are probably not copyrightable in the States -- but is there a guarantee that they are not protected by copyright elsewhere if they originate abroad? Some, however, seem to come from published works (still) under copyright with "All rights reserved", e.g. the datasets Harman23.cor and Harman74.cor [4].

Although questions similar to mine have been raised before [e.g. 1,2], I have not found a straight answer, e.g. one from the list below or a mix of them:

  1. we simply did not look into it and adopted them with the idea that if someone complains, we remove the corresponding pieces
  2. we considered all datasets to be factual data and thus not copyrightable (in the USA? around the globe?)
  3. for each dataset (or some, or the majority) we collected information on the possible copyright+license/IP holder and, where permission was unclear, contacted them about reuse in a project under the GPL license

Thank you in advance for the clarification!

P.S. Please do not get me wrong -- I am not trying to pick on anyone. I just wanted to get a better sense of the procedures/assumptions R developers use when adopting data for the R package, so that it could be of help to other projects.

[1] https://stat.ethz.ch/pipermail/r-help/2007-April/130422.html
[2] http://www.mail-archive.com/r-help@r-project.org/msg62486.html
[3] http://en.wikipedia.org/wiki/Tz_database
[4] Interestingly, the actual data comes from an "unpublished PhD thesis", but once again from the U of Chicago, which holds the copyright for the book itself.

Yaroslav O. Halchenko
Postdoctoral Fellow,   Department of Psychological and Brain Sciences
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik

R-devel_at_r-project.org mailing list
Received on Tue 03 Apr 2012 - 11:22:50 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 05 Apr 2012 - 14:10:39 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.
