From: Peter Dunn <dunn_at_usq.edu.au>

Date: Thu 06 Oct 2005 - 12:06:12 EST

Hi all

I'm doing some things with a colleague comparing different sorts of models. My colleague has fitted a number of glms in Genstat (which I have never used), while the glm I have been using is only available for R.

He has a spreadsheet of fitted means from each of his models obtained from using the Genstat "predict" function. For example, suppose we fit the model of the type

glm.out <- glm( y ~ factor(F1) + factor(F2) + X1 + poly(X2,2) +
                poly(X3,2), family=...)

Then he produces a table like this (made up, but similar):

F1(level1)  12.2
F1(level2)  14.2
F1(level3)  15.3

F2(level1)  10.3
F2(level2)   9.1

X1=0        10.2
X1=0.5      10.4
X1=1        10.4
X1=1.5      10.5
X1=2        10.9
X1=2.5      11.9
X1=3        11.8

X2=0        12.0
X2=0.5      12.2
X2=1        12.5
X2=1.5      12.9
X2=2        13.0
X2=2.5      13.1
X2=3        13.5

Each of the numbers is a predicted mean. So when X1=0, on average we predict an outcome of 10.2.

To obtain these figures, he uses the Genstat "predict" function. When I asked how it was done (i.e., what values of the other covariates were used to make the "predictions"), I was told:

> So, for a one-dimensional table of fitted means for any factor (or
> variate), all other variates are set to their average values; and the
> factor constants (including the first, at zero) are given a weighted
> average depending on their respective numbers of observations.

So for quantitative variables (such as pH), one uses the mean pH in the data set when making the predictions. Reasonable and easy.

But for categorical variables (like Month), he implies we use a weighted average of the fitted coefficients for all the months, depending on the proportion of times those factor levels appear in the data.

(I hope I explained that OK...)
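
The rule described above could be sketched in R roughly as follows. This is only an illustration under stated assumptions, not a confirmed reimplementation of Genstat's "predict": marginal.means is a made-up helper name, the simulated data and the Poisson family are purely illustrative, the factors are assumed to be stored as factors in the data frame, and the averaging is assumed to happen on the linear-predictor scale before back-transforming. Averaging the linear predictor over the observed rows weights each factor constant by its observed frequency (for a model without interactions), and predict() re-evaluates the poly() terms at the new values.

```r
# Hypothetical helper (not part of R or Genstat): one-dimensional table of
# fitted means for the variable `focal`, evaluated at each element of `values`.
marginal.means <- function(fit, data, focal, values) {
  # Predictors other than the focal one (response removed)
  others <- setdiff(all.vars(delete.response(terms(fit))), focal)
  sapply(values, function(v) {
    nd <- data
    # Assumption: every other numeric variate is held at its sample mean
    for (nm in others) {
      if (is.numeric(nd[[nm]])) nd[[nm]] <- rep(mean(data[[nm]]), nrow(nd))
    }
    # Set the focal variable to the requested value in every row
    nd[[focal]] <- if (is.factor(data[[focal]])) {
      rep(factor(v, levels = levels(data[[focal]])), nrow(nd))
    } else {
      rep(v, nrow(nd))
    }
    # Other factors keep their observed levels, so the row average of the
    # linear predictor is a frequency-weighted average of their constants
    eta <- predict(fit, newdata = nd, type = "link")
    # Assumption: back-transform the averaged linear predictor
    family(fit)$linkinv(mean(eta))
  })
}

# Illustrative (simulated) data; none of this comes from the real study
set.seed(1)
n <- 200
d <- data.frame(F1 = factor(sample(1:3, n, replace = TRUE)),
                F2 = factor(sample(1:2, n, replace = TRUE)),
                X1 = runif(n, 0, 3),
                X2 = runif(n, 0, 3),
                X3 = runif(n, 0, 3))
d$y <- rpois(n, exp(2 + 0.1 * d$X1 + 0.05 * d$X2))

fit <- glm(y ~ F1 + F2 + X1 + poly(X2, 2) + poly(X3, 2),
           family = poisson, data = d)

m.X1 <- marginal.means(fit, d, "X1", seq(0, 3, by = 0.5))
m.F1 <- marginal.means(fit, d, "F1", levels(d$F1))
```

Looping over sites and species would then just be a matter of calling the helper once per fitted model. If Genstat in fact averages on the response scale instead, the last two lines of the inner function would change to mean(family(fit)$linkinv(eta)).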

Is there an equivalent way of doing this in R or S-Plus? I have to do it for a number of sites and species, so an automated way would be useful. I have tried searching to no avail (but may not be searching on the correct terms), and have tried hard-coding something myself, as yet unsuccessfully: the poly terms and the weighted averaging over the factor levels are proving a bit too much for my limited skills.

Any assistance appreciated. (Any clarification of what I mean can be provided if I have not been clear.)

Thanks, as always.

P.

> version
         _
platform i386-pc-linux-gnu
arch     i386
os       linux-gnu
system   i386, linux-gnu
status
major    2
minor    1.0
year     2005
month    04
day      18
language R
>

--
Dr Peter Dunn | Senior Lecturer in Statistics
Faculty of Sciences, University of Southern Queensland
Web: http://www.sci.usq.edu.au/staff/dunn
Email: dunn <at> usq.edu.au
CRICOS: QLD 00244B | NSW 02225M | VIC 02387D | WA 02521C

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Thu Oct 06 12:11:14 2005
