# Re: [R] Fwd: Documenting data sets with many variables

From: Arne Henningsen <ahenningsen_at_email.uni-kiel.de>
Date: Thu 18 Aug 2005 - 00:48:47 EST

On Tuesday 16 August 2005 17:26, Gavin Simpson wrote:
> On Tue, 2005-08-16 at 17:11 +0200, Arne Henningsen wrote:
> > On Tuesday 16 August 2005 14:49, Roger D. Peng wrote:
> > > Have you tried using 'promptData()' on the data frame and then
> > > just using the resulting documentation file?
> >
> > Thank you, Roger, for reminding me of 'promptData()'. It is a really
> > useful tool. However, in my particular case the aim is to shorten the
> > documentation and make it easier to understand, rather than to reduce
> > the effort of writing it.
> >
> > Any further hints are welcome!
> >
> > Thanks,
> > Arne
>
> Would it not be expedient then to ignore the \format{} section and just
> provide the information on the variables say in the \description{},
> e.g.:

That's a great idea - and so simple!
This perfectly solves my problem.
Thanks,
Arne
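
For the Blanciforti86 case, the suggestion might translate into something
like the sketch below. The variable descriptions are the ones from the
\format entries in my original message; moving them into \description and
leaving out \format altogether is an assumption about what "R CMD check"
accepts, not the actual file in micEcon.

```rd
\description{
  The \code{Blanciforti86} data frame contains, among others, variables
  for eleven aggregate commodity groups (1 = food, 2 = clothing, ...).
  For each group X = 1, ..., 11:
  \code{xAggX} is the expenditure on the aggregate commodity group X
  (in millions of US dollars),
  \code{pAggX} is the price index for the aggregate commodity group X
  (1972 = 100), and
  \code{wAggX} is the expenditure share of the aggregate commodity group X.
}
```

Since no \item entries remain, there should be nothing for the codoc check
to compare against the column names.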

> This example is taken from the vegan package and describes two data
> frames with 44 and 14 columns. Admittedly, none of the variables in the
> species dataset are explicitly and individually described in this
> example, but I think it is sufficient in this case.
>
> \name{varespec}
> \alias{varechem}
> \alias{varespec}
> \docType{data}
> \title{Vegetation and environment in lichen pastures}
> \usage{
> data(varechem)
> data(varespec)
> }
> \description{
> The \code{varespec} data frame has 24 rows and 44 columns. Columns
> are estimated cover values of 44 species. The variable names are
> formed from the scientific names, and are self explanatory for anybody
> familiar with the vegetation type.
> The \code{varechem} data frame has 24 rows and 14 columns, giving the
> soil characteristics of the very same sites as in the \code{varespec}
> data frame. The chemical measurements have obvious names.
> \code{Baresoil} gives the estimated cover of bare soil, \code{Humpdepth}
> the thickness of the humus layer.
>
> }
> ....
>
> HTH
>
> G
>
> > > -roger
> > >
> > > Arne Henningsen wrote:
> > > > Hi,
> > > >
> > > > since nobody answered my first message, I will try to explain my
> > > > problem more clearly and more generally this time:
> > > >
> > > > I have a data set in my R package "micEcon" that has many variables
> > > > (82). Therefore, I would like to avoid describing all variables in
> > > > the "\format" section of the documentation (.Rd file). However, doing
> > > > so makes "R CMD check" complain about "data codoc mismatches"
> > > > (see below for details). Is there a way to avoid describing all
> > > > variables without getting a complaint from "R CMD check"?
> > > >
> > > > Thanks,
> > > > Arne
> > > >
> > > >
> > > > ---------- Forwarded Message ----------
> > > >
> > > > Subject: Documenting data sets with many variables
> > > > Date: Friday 05 August 2005 14:03
> > > > From: Arne Henningsen <ahenningsen@email.uni-kiel.de>
> > > > To: R-help@stat.math.ethz.ch
> > > >
> > > > Hi,
> > > >
> > > > I extended the data set "Blanciforti86" that is included in my R
> > > > package "micEcon". For instance, I added consumer prices, annual
> > > > consumption expenditures and expenditure shares of eleven aggregate
> > > > commodity groups. The corresponding variables in the data frame are
> > > > called "pAgg1", "pAgg2", ..., "pAgg11", "xAgg1", "xAgg2", ...,
> > > > "xAgg11", "wAgg1", "wAgg2", ..., "wAgg11". To avoid describing all
> > > > 33 items in the "\format" section of the documentation (.Rd file), I
> > > > wrote something like
> > > >
> > > > \format{
> > > > This data frame contains the following columns:
> > > > \describe{
> > > > [ . . . ]
> > > > \item{xAggX}{Expenditure on the aggregate commodity group X
> > > > (in Millions of US-Dollars).}
> > > > \item{pAggX}{Price index for the aggregate commodity group X
> > > > (1972 = 100).}
> > > > \item{wAggX}{Expenditure share of the aggregate commodity group
> > > > X.}
> > > > [ . . . ]
> > > > }
> > > > }
> > > >
> > > > and explained the 11 aggregate commodity groups only once in a
> > > > different section (1=food, 2=clothing, ... ). However, "R CMD check"
> > > > now complains about "data codoc mismatches", e.g.
> > > > Code: [...] pAgg1 pAgg2 pAgg3 [...]
> > > > Docs: [...] pAggX [...]
> > > >
> > > > Is there a way to avoid the description of all 33 items without
> > > > getting a complaint from "R CMD check"?
> > > >
> > > > Thanks,
> > > > Arne
> > > >
> > > > -------------------------------------------------------
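
If the \format section is kept after all, the 33 \item entries do not have
to be typed by hand. The following is only a rough sketch: the description
texts are copied from the message above, while the helper make_items() and
the objects templates and items are made-up names for illustration.

```r
# Eleven aggregate commodity groups, as in the original message.
groups <- 1:11

# One description template per variable prefix; %d is the group number.
templates <- c(
  xAgg = "Expenditure on the aggregate commodity group %d (in Millions of US-Dollars).",
  pAgg = "Price index for the aggregate commodity group %d (1972 = 100).",
  wAgg = "Expenditure share of the aggregate commodity group %d."
)

# Hypothetical helper: builds the \item{name}{description} lines for one prefix.
make_items <- function(prefix, template, groups) {
  sprintf("\\item{%s%d}{%s}", prefix, groups, sprintf(template, groups))
}

# All 33 lines, ordered xAgg1..xAgg11, pAgg1..pAgg11, wAgg1..wAgg11.
items <- unlist(
  Map(make_items, names(templates), templates,
      MoreArgs = list(groups = groups)),
  use.names = FALSE
)

# Print the lines so they can be pasted into the \describe{} block.
writeLines(items)
```

This makes the names in the documentation match the column names, at the
price of a longer help page, so it does not help if the goal is a compact
description.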

--
Arne Henningsen
Department of Agricultural Economics
University of Kiel
Olshausenstr. 40
D-24098 Kiel (Germany)
Tel: +49-431-880 4445
Fax: +49-431-880 1397
ahenningsen@agric-econ.uni-kiel.de
http://www.uni-kiel.de/agrarpol/ahenningsen/
