Re: [R] Fwd: Documenting data sets with many variables

From: Gavin Simpson <gavin.simpson_at_ucl.ac.uk>
Date: Wed 17 Aug 2005 - 01:26:35 EST

On Tue, 2005-08-16 at 17:11 +0200, Arne Henningsen wrote:
> On Tuesday 16 August 2005 14:49, Roger D. Peng wrote:
> > Have you tried using 'promptData()' on the data frame and then
> > just using the resulting documentation file?
>
> Thank you, Roger, for bringing 'promptData()' to my mind. This is really a
> useful tool. However, in my special case my aim is to reduce the extent and
> increase the comprehensibility of the documentation rather than to reduce my
> effort to write the documentation.
>
> Any further hints are welcome!
>
> Thanks,
> Arne

Would it not be expedient, then, to omit the \format{} section and simply describe the variables in, say, the \description{} section?

The example below is taken from the vegan package and documents two data frames with 44 and 14 columns. Admittedly, none of the variables in the species data set is individually described, but I think that is sufficient in this case.

\name{varespec}
\alias{varechem}
\alias{varespec}
\docType{data}
\title{Vegetation and environment in lichen pastures}
\usage{
  data(varechem)
  data(varespec)
}
\description{
  The \code{varespec} data frame has 24 rows and 44 columns. Columns
  are estimated cover values of 44 species. The variable names are
  formed from the scientific names, and are self-explanatory for anybody
  familiar with the vegetation type.
  The \code{varechem} data frame has 24 rows and 14 columns, giving the
  soil characteristics of the very same sites as in the \code{varespec}
  data frame. The chemical measurements have obvious names.
  \code{Baresoil} gives the estimated cover of bare soil,
  \code{Humpdepth} the thickness of the humus layer.

}
....
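
Applied to the Blanciforti86 case from the original message, the same idea might look roughly like the sketch below. The \title text is a hypothetical placeholder and not taken from micEcon; the variable descriptions are the ones given in the original message. As far as I know, the data codoc check only compares column names against \item entries listed in a \describe block inside \format{}, so describing the variable families in running text should not trigger the mismatch, but this is an assumption worth verifying with "R CMD check".

```
% Sketch only: the title below is a hypothetical placeholder.
\name{Blanciforti86}
\alias{Blanciforti86}
\docType{data}
\title{Placeholder title for the Blanciforti86 data set}
\usage{
  data(Blanciforti86)
}
\description{
  For each of the eleven aggregate commodity groups (X = 1, ..., 11)
  the data frame contains three columns:
  \code{xAggX}, the expenditure on the aggregate commodity group X
  (in Millions of US-Dollars);
  \code{pAggX}, the price index for the aggregate commodity group X
  (1972 = 100); and
  \code{wAggX}, the expenditure share of the aggregate commodity group X.
}
% Deliberately no format section with per-variable items here.
```

The trade-off is that the per-variable list disappears entirely, so any remaining variables would also have to be described in prose rather than as individual items.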

HTH,

G

>
> > -roger
> >
> > Arne Henningsen wrote:
> > > Hi,
> > >
> > > since nobody answered my first message, I will try to explain my problem
> > > more clearly and more generally this time:
> > >
> > > I have a data set in my R package "micEcon", which has many variables
> > > (82). Therefore, I would like to avoid describing all variables in the
> > > "\format" section of the documentation (.Rd file). However, doing this
> > > makes "R CMD check" complain about "data codoc mismatches" (for details,
> > > see below). Is there a way to avoid describing all variables without
> > > getting a complaint from "R CMD check"?
> > >
> > > Thanks,
> > > Arne
> > >
> > >
> > > ---------- Forwarded Message ----------
> > >
> > > Subject: Documenting data sets with many variables
> > > Date: Friday 05 August 2005 14:03
> > > From: Arne Henningsen <ahenningsen@email.uni-kiel.de>
> > > To: R-help@stat.math.ethz.ch
> > >
> > > Hi,
> > >
> > > I extended the data set "Blanciforti86" that is included in my R package
> > > "micEcon". For instance, I added consumer prices, annual consumption
> > > expenditures and expenditure shares of eleven aggregate commodity groups.
> > > The corresponding variables in the data frame are called "pAgg1",
> > > "pAgg2", ..., "pAgg11", "xAgg1", "xAgg2", ..., "xAgg11", "wAgg1",
> > > "wAgg2", ..., "wAgg11". To avoid describing all 33 items in the
> > > "\format" section of the documentation (.Rd file) I wrote something like
> > >
> > > \format{
> > > This data frame contains the following columns:
> > > \describe{
> > > [ . . . ]
> > > \item{xAggX}{Expenditure on the aggregate commodity group X
> > > (in Millions of US-Dollars).}
> > > \item{pAggX}{Price index for the aggregate commodity group X
> > > (1972 = 100).}
> > > \item{wAggX}{Expenditure share of the aggregate commodity group X.}
> > > [ . . . ]
> > > }
> > > }
> > >
> > > and explained the 11 aggregate commodity groups only once in a different
> > > section (1=food, 2=clothing, ... ). However, "R CMD check" now complains
> > > about "data codoc mismatches", e.g.
> > > Code: [...] pAgg1 pAgg2 pAgg3 [...]
> > > Docs: [...] pAggX [...]
> > >
> > > Is there a way to avoid the description of all 33 items without getting a
> > > complaint from "R CMD check"?
> > >
> > > Thanks,
> > > Arne
> > >
> > > -------------------------------------------------------
>

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                     [T] +44 (0)20 7679 5522
ENSIS Research Fellow             [F] +44 (0)20 7679 7565
ENSIS Ltd. & ECRC                 [E] gavin.simpsonATNOSPAMucl.ac.uk
UCL Department of Geography       [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way                    [W] http://www.ucl.ac.uk/~ucfagls/
London.  WC1H 0AP.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

Received on Wed Aug 17 01:31:19 2005
