Re: [R] Comparing multiple distributions

From: Ravi Varadhan
Date: Thu, 31 May 2007 12:09:33 -0400

Your data is "compositional data". The R package "compositions" might be useful. You might also want to consult the book by J. Aitchison, "The Statistical Analysis of Compositional Data".
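
As a rough illustration of what treating the data as compositional can mean in practice, here is a minimal sketch of the centered log-ratio (clr) transform described in Aitchison's book, written in plain Python with only the standard library. The function names and the example counts are made up for illustration, and the sketch assumes every part is strictly positive (zeros would need separate handling).

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(composition):
    """Centered log-ratio transform: log of each part minus the mean log.

    Assumes strictly positive parts; a zero part would make log() fail.
    """
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical observation: counts in 5 depth bins, closed to proportions.
obs = closure([12, 30, 40, 15, 3])
print(obs)
print(clr(obs))
```

The transformed values sum to zero and do not depend on the overall scale of the input, which is why ordinary multivariate methods are often applied to them instead of to the raw proportions.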


Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625



-----Original Message-----
On Behalf Of jiho
Sent: Thursday, May 31, 2007 11:37 AM
To: R-help
Subject: Re: [R] Comparing multiple distributions

Nobody answered my first request. I am sorry if I did not explain my problem clearly. English is not my native language, and statistical English is even more difficult. I'll try to summarize my issue in more appropriate statistical terms:

Each of my observations is not a single number but a vector of 5 proportions (which add up to 1 for each observation). I want to compare the "shape" of those vectors between two treatments (i.e. how the quantities are distributed between the 5 values in treatment A with respect to treatment B).

I was pointed to Hotelling's T-squared test. Does it seem appropriate? Are there other possibilities (I read many discussions about Hotelling vs. MANOVA, but I could not see how any of them related to my particular case)?
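
One possible alternative, sketched below under stated assumptions, is a permutation test rather than Hotelling's T-squared: each vector of proportions is clr-transformed, the statistic is the Euclidean distance between the two group means, and the treatment labels are reshuffled to get a p-value. This is plain Python with the standard library only; the function names, the number of permutations, and the two small data sets are all invented for illustration, and the parts are assumed to be strictly positive.

```python
import math
import random

def clr(composition):
    """Centered log-ratio transform (parts must be strictly positive)."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def distance_between_means(group_a, group_b):
    """Euclidean distance between the mean vectors of two groups."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ma, mb)))

def permutation_test(comps_a, comps_b, n_perm=2000, seed=0):
    """Return (observed distance, permutation p-value)."""
    rng = random.Random(seed)
    a = [clr(c) for c in comps_a]
    b = [clr(c) for c in comps_b]
    observed = distance_between_means(a, b)
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign treatment labels at random
        stat = distance_between_means(pooled[:len(a)], pooled[len(a):])
        if stat >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Made-up proportions over 5 depth bins (each row sums to 1).
treatment_a = [
    [0.50, 0.30, 0.10, 0.05, 0.05],
    [0.45, 0.30, 0.15, 0.05, 0.05],
    [0.55, 0.25, 0.10, 0.06, 0.04],
    [0.48, 0.32, 0.10, 0.06, 0.04],
    [0.52, 0.28, 0.12, 0.04, 0.04],
]
treatment_b = [list(reversed(row)) for row in treatment_a]

print(permutation_test(treatment_a, treatment_b))
```

The permutation approach avoids distributional assumptions at the cost of some computation; whether the distance between clr means is the right statistic for a given study is a judgment call, not something this sketch settles.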

Thank you very much in advance for your insights. See below for my earlier, more detailed, e-mail.

On 2007-May-21, at 19:26, jiho wrote:
> I am studying the vertical distribution of plankton and want to
> study its variations relative to several factors (time of day,
> species, water column structure etc.). So my data is special in
> that, at each sampling site (each observation), I don't have *one*
> number, I have *several* numbers (abundance of organisms in each
> depth bin, I sample 5 depth bins) which describe a vertical
> distribution.
> Then, let's say I want to compare speciesA with speciesB: I would end
> up trying to compare a group of several distributions with another
> group of several distributions (where a "distribution" is a vector
> of 5 numbers: an abundance for each depth bin). Does anyone know
> how I could do this (with R obviously ;) )?
> Currently I work around the problem in one of two ways:
> - compute the mean abundance per depth bin within each group and
> compare the two mean distributions with ks.test, but this
> obviously diminishes the power of the test (I only compare 5*2
> "observations")
> - restrict the information at each sampling site to the mean depth
> weighted by the abundance of the species of interest. This way I
> have one observation per station, but I reduce the information to
> the mean depth, while the actual distribution is also important.
> I know this is probably not directly R related but I have already
> searched around for solutions and solicited my local statistics
> expert... to no avail. So I hope that the stats experts on this
> list will help me.
> Thank you very much in advance.
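
The second workaround in the quoted message, the abundance-weighted mean depth, can be sketched as follows in plain Python. The depth-bin midpoints and the abundances are hypothetical values chosen only to show how two very different vertical profiles can share the same mean depth, which is the loss of information the message describes.

```python
def to_proportions(abundances):
    """Convert raw abundances per depth bin into proportions summing to 1."""
    total = sum(abundances)
    return [a / total for a in abundances]

def weighted_mean_depth(abundances, bin_midpoints):
    """Mean depth weighted by the abundance found in each depth bin."""
    total = sum(abundances)
    return sum(a * d for a, d in zip(abundances, bin_midpoints)) / total

# Hypothetical midpoints (in metres) of the 5 depth bins.
bin_midpoints = [10, 30, 50, 70, 90]

# Two made-up stations: one bimodal profile, one concentrated mid-column.
station_bimodal = [5, 45, 0, 45, 5]
station_peaked = [0, 0, 100, 0, 0]

print(to_proportions(station_bimodal))
print(weighted_mean_depth(station_bimodal, bin_midpoints))
print(weighted_mean_depth(station_peaked, bin_midpoints))
```

Both stations give the same weighted mean depth even though their profiles differ completely, so a comparison based on mean depth alone cannot tell them apart.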



Received on Thu 31 May 2007 - 16:22:35 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 31 May 2007 - 16:31:31 GMT.