Re: [R] quantile regression problem

From: Ted Harding <Ted.Harding_at_nessie.mcc.ac.uk>
Date: Sun 11 Dec 2005 - 10:10:18 EST


On 10-Dec-05 zuzmun@natur.cuni.cz wrote:
> Dear List members,
>
> I would like to ask for advice on quantile regression in R.
>
> I am trying to perform an analysis of a relationship between
> species abundance and its habitat requirements -
> the habitat requirements are, however, codes - 0,1,2,3... where 0<1<2<3
> and the scale is linear - so I would be happy to treat them as
> continuous

As well as Roger Koenker's comments, you may also wish to consider the following.

(By the way, despite what you say above, you have "codes" at values 0, 0.5, 1, 1.5, 2, 3 -- is there anything special about the 0.5 and 1.5, or are they on the same footing as 0, 1, 2, 3? Also, I am curious as to why "habitat requirement" is named "absdeviation" in the data file. What does "habitat requirement" mean?)

> The analysis of the data somehow does not work, I am trying to
> perform linear quantile regression using rq function and I cannot
> figure out whether there is a way to analyse the data using quantile
> regression (I would really like to do this since the shape is an
> envelope) or whether it is not possible.

As Roger noted, the distribution of data is very variable over the values of "absdeviation":

absdeviation:       0      0.5    1      1.5    2      3 
Number of data:   673     15    493      3     19     20 
Total data: 1223

Therefore you chiefly have information about the cases "0" and "1".
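As a quick check of how lopsided the table above is, the counts can be entered by hand and turned into proportions. This is only a sketch: the vector name `counts` is made up for illustration, and the numbers are copied from the table rather than read from the data file.

```r
# Counts per value of "absdeviation", copied from the table above
counts <- c("0" = 673, "0.5" = 15, "1" = 493, "1.5" = 3, "2" = 19, "3" = 20)

total <- sum(counts)                 # should agree with "Total data: 1223"
prop  <- counts / total              # share of the data at each value

# Share of the data carried by the two dominant groups H=0 and H=1
share01 <- sum(counts[c("0", "1")]) / total

print(round(prop, 3))
print(share01)
```

The two groups H=0 and H=1 together hold about 95% of the observations, which is why the remaining four values contribute so little.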

I have looked at the data the opposite way round from you: for each value of "absdeviation" ("H" for "habitat" in the following), consider the values of "abundance" (A).

For H=0 and H=1, the values of A are quite well approximated by a negative exponential distribution, though the fit is better for H=1 than for H=0. In a more careful examination, I would try to emulate, for the continuous variable A, a distribution inspired by the logarithmic distribution p(n) = -(t^n)/(n*log(1-t)), n=1,2,3,... (with 0 < t < 1), which is a classic distribution for the probability that a species will be represented by n individuals in a sample of a large number of species whose different abundances are variable (Fisher, Corbet and Williams, and much later work).
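For reference, the logarithmic (log-series) distribution mentioned here can be written so that the probabilities are positive and sum to one, namely p(n) = -(t^n)/(n*log(1-t)) for n = 1, 2, 3, ... with 0 < t < 1. The sketch below only illustrates that discrete distribution; the function name `dlogseries` and the value t = 0.9 are arbitrary choices for the example, not anything fitted to the abundance data.

```r
# Log-series probability mass function, n = 1, 2, 3, ... and 0 < t < 1.
# (Hypothetical helper name; base R has no built-in for this distribution.)
dlogseries <- function(n, t) {
  -t^n / (n * log(1 - t))
}

t <- 0.9                      # arbitrary illustrative value
n <- 1:10000                  # long enough that the neglected tail is negligible

p <- dlogseries(n, t)
total_prob <- sum(p)          # should be very close to 1
mean_n     <- sum(n * p)      # theoretical mean is -t / ((1 - t) * log(1 - t))

print(total_prob)
print(mean_n)
```

Most of the probability sits at n = 1 and then decays slowly, which is the same qualitative shape (many small values, a long right tail) as the abundance values described above.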

The mean A for H=0 is m0 = 0.09389265 (n0=673), and the mean A for H=1 is m1 = 0.08407791 (n1=493),

with respective standard deviations

  s0 = 0.1262238
  s1 = 0.08952975

on the basis of which

  (m0-m1)/(sqrt((s0^2)/n0 + (s1^2)/n1)) = 1.553156
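The statistic above can be reproduced from the quoted summary figures alone. The sketch below does that, and additionally converts it to a two-sided tail probability under a normal approximation; that last step is an assumption added here for illustration and is not part of the calculation above.

```r
# Summary figures as quoted above for the groups H=0 and H=1
m0 <- 0.09389265; s0 <- 0.1262238;  n0 <- 673
m1 <- 0.08407791; s1 <- 0.08952975; n1 <- 493

# Standard error of the difference in means (unequal variances)
se <- sqrt(s0^2 / n0 + s1^2 / n1)

# Standardised difference between the two group means
z <- (m0 - m1) / se

# Two-sided tail probability, assuming z is approximately standard normal
p_two_sided <- 2 * pnorm(-abs(z))

print(z)
print(p_two_sided)
```

The resulting z of about 1.55 corresponds to a two-sided probability of roughly 0.12, consistent with the remark that the difference is not particularly large.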

which is not particularly large. While the histograms

  hist(A[H==0],breaks=0.02*(0:50),freq=FALSE)

and

  hist(A[H==1],breaks=0.02*(0:50),freq=FALSE)

do somewhat indicate a tendency for higher values of A to occur when H=0 than when H=1, there are only a few of these.

So on a first look, I am induced to conclude that there is little evidence in the two dominant data groups (H=0 and H=1) to indicate that these two groups differ. I doubt that the information for H=0.5, H=1.5, H=2 and H=3 would have more than a slight effect on this (though I have not looked in detail).

The corresponding means, however, are

  m0.5 = 0.1273273    (n = 15)
  m1.5 = 0.03003003   (n =  3)
  m2   = 0.02908183   (n = 19)
  m3   = 0.03830066   (n = 20)

which at first sight do suggest that, while m0.5 is similar to m0 and m1 above, m1.5, m2 and m3 are distinctly smaller. However, for m1.5 this is based on a very small sample, and in any case the distribution of the raw values of A is so skew that the larger values of A occurring for H=0 and H=1 are unlikely to occur in such small samples.

Therefore, preliminary conclusion: I cannot see strong evidence of a relationship between "absdeviation" and abundance.

Hoping this is useful,
Best wishes,
Ted.

> I tested that if I replace the categories with continuous
> data of the same range it works perfectly. In the form I have
> them (and I cannot change it) I am getting errors - mainly
> about non-positive fis.
>
> Could somebody please let me know whether there was a way to
> analyse the data?
> The data are enclosed and the question is
> Is there a relationship between abundance and absdeviation?
> I am interested in the upperlimit so I wanted to analyze the upper 5%.
>
> Thanks a lot for your help
>
> All the best
>
> Zuzana Munzbergova
>
> www.natur.cuni.cz/~zuzmun



E-Mail: (Ted Harding) <Ted.Harding@nessie.mcc.ac.uk> Fax-to-email: +44 (0)870 094 0861
Date: 10-Dec-05                                       Time: 23:10:15
------------------------------ XFMail ------------------------------
