Re: [R] mixtures as outcome variables

From: Kjetil Brinchmann Halvorsen <kjetil_at_acelerate.com>
Date: Thu 24 Mar 2005 - 03:36:41 EST

Jason W. Martinez wrote:

>Dear R-users,
>
>I have an outcome variable and I'm unsure about how to treat it. Any
>advice?
>
>I have spending data for each county in the state of California (N=58).
>Each county has been allocated money to spend on any one of the
>following four categories: A, B, C, and D.
>
>Each county may spend the money in any way they see fit. This also means
>that the county need not spend all the money that was allocated to them.
>The data structure looks something like the one below:

>
>COUNTY A B C D Total
>----------------------------------------------------
>alameda 2534221 1555592 2835475 3063249 9988537
>alpine 3174 8500 0 45558 55232
>amador 0 0 0 0 0
>....
>
>
>The goal is to explain variation in spending patterns, which are
>presumably the result of characteristics for each county.

>
>I may treat the problem like a simple linear regression problem for each
>category, but by definition, money spent in one category will take away
>the amount of money that can be spent in any other category---and each
>county is not allocated the same amount of money to spend.
>
>I have constructed proportions of amount spent on each category and have
>conducted quasibinomial regression, on each dependent outcome but that
>does not seem very convincing to me.
>
>Would anyone have any advice about how to treat an outcome variable of
>this sort?
>
>Thanks for any hints!
>
>Jason
>
If you concentrate only on the relative proportions, these are called compositional data. If your data are in mydata (n x 4), you obtain compositions by

    sweep(mydata, 1, apply(mydata, 1, sum), "/")
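
As a small illustration, here is a sketch using the three rows shown in the question. The object name spend is made up for this example, and setting the all-zero county to NA is an assumption rather than something settled in this thread: a county that spent nothing has no defined composition.

```r
# Spending rows copied from the question (columns A to D)
spend <- rbind(alameda = c(2534221, 1555592, 2835475, 3063249),
               alpine  = c(3174, 8500, 0, 45558),
               amador  = c(0, 0, 0, 0))
colnames(spend) <- c("A", "B", "C", "D")

total <- rowSums(spend)                 # same as apply(spend, 1, sum)
comp  <- sweep(spend, 1, total, "/")    # divide each row by its total

# 0/0 gives NaN for amador; mark that row as missing (an assumption)
comp[total == 0, ] <- NA
round(comp, 3)
```

Note that alpine has a zero in category C. Its composition is well defined, but log(0) is -Inf, so zeros like this need some treatment before any log-ratio transform is applied.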

As far as I know there are no functions or packages for R specifically for compositional data, but you
can try googling. Aitchison has a monograph (Chapman & Hall) and a paper in JRSS B.

One way to start might be lm or anova on the symmetric logratio transform of the
compositions. The R function lm can take a multivariate response, but some extra programming will be needed
for interpretation. With simulated data:

 > slr
function(y) { # y should sum to 1

          v <- log(y)
          return( v - mean(v) ) }
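
A quick check of what this transform does, as a sketch with a made-up composition y: the transformed values sum to zero, and rescaling the input leaves the result unchanged, so the function gives the same answer whether or not y has been normalised to sum to 1.

```r
# Same definition as slr above, repeated so the snippet runs on its own
slr <- function(y) {
  v <- log(y)
  v - mean(v)
}

y <- c(0.1, 0.2, 0.3, 0.4)        # made-up composition for illustration
z <- slr(y)

sum(z)                            # zero up to rounding error
all.equal(slr(y), slr(10 * y))    # rescaling y does not change the result
```

The zero-sum property is why the row sums of cov(comp) come out as numerical zeros: the transformed components are linearly dependent.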

 > testdata <- matrix( rgamma(120, 2,3), 30, 4)
 > str(testdata)
 num [1:30, 1:4] 0.200 0.414 0.311 2.145 0.233 ...
 > comp <- sweep(testdata, 1, apply(testdata,1,sum), "/")
 > # To get the symmetric logratio transform:
 > comp <- t(apply(comp, 1, slr))
 > # Observe:
 > apply(cov(comp), 1, sum)
[1] -5.551115e-17 2.775558e-17 5.551115e-17 -2.775558e-17
 > lm( comp ~ 1)

Call:
lm(formula = comp ~ 1)

Coefficients:

                [,1]    [,2]     [,3]     [,4]
(Intercept) 0.17606 0.06165 -0.03783 -0.19988

 > summary(lm( comp ~ 1))
Response Y1 :

Call:
lm(formula = Y1 ~ 1)

Residuals:

     Min       1Q   Median       3Q      Max
-1.29004 -0.46725 -0.07657  0.55834  1.20551

Coefficients:

     Estimate Std. Error t value Pr(>|t|)
[1,]   0.1761     0.1265   1.391    0.175

Residual standard error: 0.6931 on 29 degrees of freedom

Response Y2 :

Call:
lm(formula = Y2 ~ 1)

Residuals:

    Min 1Q Median 3Q Max
-1.2982 -0.5711 -0.1355 0.5424 1.6598

Coefficients:

     Estimate Std. Error t value Pr(>|t|)
[1,]  0.06165    0.15049    0.41    0.685

Residual standard error: 0.8242 on 29 degrees of freedom

Response Y3 :

Call:
lm(formula = Y3 ~ 1)

Residuals:

     Min       1Q   Median       3Q      Max
-1.97529 -0.41115  0.03666  0.42785  0.88567

Coefficients:

     Estimate Std. Error t value Pr(>|t|)
[1,] -0.03783    0.11623  -0.325    0.747

Residual standard error: 0.6366 on 29 degrees of freedom

Response Y4 :

Call:
lm(formula = Y4 ~ 1)

Residuals:

    Min 1Q Median 3Q Max
-2.8513 -0.3955 0.2815 0.5939 1.2475

Coefficients:

     Estimate Std. Error t value Pr(>|t|)
[1,]  -0.1999     0.1620  -1.234    0.227

Residual standard error: 0.8872 on 29 degrees of freedom
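
One possible piece of the extra programming for interpretation, given only as a sketch: regress the transformed compositions on a covariate and map the predictions back to proportions. The covariate x, the helper inv_slr and the seed are all made up for this example. Here x is pure noise, so the fit itself means nothing; only the mechanics are of interest.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility
slr <- function(y) { v <- log(y); v - mean(v) }

testdata <- matrix(rgamma(120, 2, 3), 30, 4)
comp <- sweep(testdata, 1, rowSums(testdata), "/")
z    <- t(apply(comp, 1, slr))                # 30 x 4 matrix of log-ratios

x   <- rnorm(30)                              # hypothetical covariate (noise)
fit <- lm(z ~ x)                              # one regression per column of z

# Inverse transform: exponentiate, then renormalise to sum to 1
inv_slr <- function(z) exp(z) / sum(exp(z))

newx <- data.frame(x = c(-1, 0, 1))
pred <- predict(fit, newdata = newx)          # 3 x 4, on the log-ratio scale
pred_comp <- t(apply(pred, 1, inv_slr))       # predicted compositions
rowSums(pred_comp)                            # each row sums to 1
```

Reading the predicted compositions at a few covariate values is often easier than reading the four sets of coefficients directly, since the coefficients are on the log-ratio scale and are constrained to sum to zero across the responses.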

Sorry for not being of more help!

Kjetil

-- 

Kjetil Halvorsen.

Peace is the most effective weapon of mass construction.
               --  Mahdi Elmandjra






______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Thu Mar 24 04:14:36 2005
