Re: [R] sem: negative parameter variances

From: John Fox <jfox_at_mcmaster.ca>
Date: Tue 18 Jul 2006 - 05:59:27 EST


Dear Denis,

I'm not in a position to comment on the spatial and time-series aspects of the data, though I'd be concerned about treating the observations as independent. (I understand that you have some way of accounting for spatial and temporal dependence.)

Except for the allowance that you've made for cross-equation correlated errors, this model could be fit by equation-by-equation OLS regression, and in its current form could be fit by equation-by-equation 2SLS. That FIML has run into numerical problems is a signal, however, that the combination of model and data are ill-conditioned in some way.

The output is a bit hard to follow because, for some reason, the table of parameter estimates, standard errors, etc., is ravelled in your email. Some of the parameter estimates are apparently very small numbers (I'm not sure why you're not seeing more significant digits in some values), and you might be able to make the computations more stable by changing the units of some of the variables. Eliminating the intercepts and working with the covariance matrix might also make the problem better conditioned.

"NaN" stands for "not a number," and is produced here when the summary method for sem objects tries to take the square-root of a negative variance; consequently, the standard error, z-statistic, and p-value are all undefined.

BTW, I count 15, not 13, exogenous variables (including the intercept).

I hope this helps,
 John



John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: r-help-bounces@stat.math.ethz.ch
> [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Denis Fomchenko
> Sent: Monday, July 17, 2006 1:40 PM
> To: John Fox; 'Spencer Graves'
> Cc: r-help@stat.math.ethz.ch
> Subject: [R] sem: negative parameter variances
>
> Dear Spencer and Prof. Fox,
>
>
>
> Thank you for your replies. I'll very appreciate, if you have
> any ideas concerning the problem described below.
>
>
>
> First, I'd like to describe the model in brief.
>
> In general I consider a model with three equations.
>
> First one is for annual GRP growth - in general it looks like:
>
>
>
> 1) GRP growth per capita = G(investment, migration, initial
> GRP per capita, spatial lag on GRP growth + some additional
> explanatory variables to control for regional disparities),
>
>
>
> where spatial lag on GRP growth in year t is simply spatial
> weights matrix W(n*n) (weights - are inverse square distances
> between regional centers; with zeros on the main diagonal)
> times GRP growth in year t, (n*1). I compute it manually for
> every year.
>
>
>
> Two others are for migration (labor supply) and investment
> (capital supply):
>
>
>
> 2) Migration = M(factors, explaining migration)
>
> 3) Investment = I(factors, explaining investment)
>
>
>
> I consider GRP growth, migration and investment to be
> endogenous and all others vars to be exogenous to the model.
> The data are over 77 regions (n=77) and 8 years (t =
> 1997..2004), so that I have total 616 observations.
>
>
>
> Actually, the final goal of the study is to estimate a model
> 1) simultaneously by FIML 2) with Mundlak (1978, 1981)
> specification of panels to capture for fixed and between
> effects 3) with spatial lag on GRP growth. As for spatial lag
> - the model needs to be estimated by maximum likelihood (as
> it is shown in Anselin 1988, for example)
>
> The main problems with Mundlak are increasing number of
> variables: as well as dimension of a model (additional
> between equations). So to begin with, I try to estimate pool,
> which is obviously easier, and proceed as follows:
>
>
>
> Consider the following specification:
>
>
>
> model.3eq.33 <- specify.model( )
>
> ln.inv.p0.pc ->
> ln.grp.phvi.pc , beta13 , NA
>
> ln.migr.new ->
> ln.grp.phvi.pc , beta12 , NA
>
> ln.grp.corr.pc97 ->
> ln.grp.phvi.pc , gamma11 , NA
>
> ln.sh.ind.rawwide ->
> ln.grp.phvi.pc , gamma12 , NA
>
> port ->
> ln.grp.phvi.pc , gamma13 , NA
>
> t12 ->
> ln.grp.phvi.pc , gamma14 , NA
>
> wd.ln.grp.phvi.pc ->
> ln.grp.phvi.pc , rho , NA
>
> (Intercept) ->
> ln.grp.phvi.pc , alpha1 , NA
>
> ln.income.pc.fcb97 -> ln.migr.new
> , gamma21 , NA
>
> ln.unempl.level97 -> ln.migr.new
> , gamma26 , NA
>
> avertemp.jan.cs -> ln.migr.new
> , gamma22 , NA
>
> ln.pop.gr.89.26mbe -> ln.migr.new
> , gamma23 , NA
>
> ln.pass.railway.percap -> ln.migr.new
> , gamma24 , NA
>
> ln.city ->
> ln.migr.new , gamma25 , NA
>
> (Intercept) ->
> ln.migr.new , alpha2 , NA
>
> ln.grp.corr.pc97 -> ln.inv.p0.pc
> , gamma31 , NA
>
> avertemp.jan.cs -> ln.inv.p0.pc
> , gamma32 , NA
>
> permafrost ->
> ln.inv.p0.pc , gamma33 , NA
>
> ln.indoutput.fuel.p0.pc -> ln.inv.p0.pc
> , gamma34 , NA
>
> ln.phone1995 -> ln.inv.p0.pc
> , gamma35 , NA
>
> (Intercept) ->
> ln.inv.p0.pc , alpha3 , NA
>
> ln.grp.phvi.pc <->
> ln.grp.phvi.pc , sigma11 , NA
>
> ln.grp.phvi.pc <-> ln.migr.new
> , sigma12 , NA
>
> ln.grp.phvi.pc <->
> ln.inv.p0.pc , sigma13 , NA
>
> ln.migr.new <-> ln.migr.new
> , sigma22 , NA
>
> ln.migr.new <->
> ln.inv.p0.pc , sigma23 , NA
>
> ln.inv.p0.pc <->
> ln.inv.p0.pc , sigma33 , NA
>
>
>
> So I consider a model with 3 equations, 3 endogenous and 13
> exogenous variables and 6 double-arrow relations for error covariance.
>
>
>
> Then, I use raw moments to capture (Intercept) in each equation:
>
>
>
> raw.3eq.33 <- raw.moments(~ ln.grp.phvi.pc + ln.inv.p0.pc +
> ln.migr.new + ln.grp.corr.pc97 + ln.sh.ind.rawwide + port +
> t12 + wd.ln.grp.phvi.pc + ln.income.pc.fcb97 +
> ln.unempl.level97 + avertemp.jan.cs + ln.pop.gr.89.26mbe +
> ln.pass.railway.percap + ln.city + permafrost +
> ln.indoutput.fuel.p0.pc + ln.phone1995, data=pool.spat)
>
>
>
> Next, I estimate model by sem-function (here, I declare fixed
> exogenous variables, including spatial lag
> 'wd.ln.grp.phvi.pc' ... though I am not sure about that)
>
>
>
> model.3eq.33.estim <- sem(model.3eq.33, raw.3eq.33, 616,
> fixed.x=c('ln.grp.corr.pc97', 'ln.sh.ind.rawwide', 'port',
> 't12', 'wd.ln.grp.phvi.pc', 'ln.income.pc.fcb97',
> 'ln.unempl.level97', 'avertemp.jan.cs', 'ln.pop.gr.89.26mbe',
> 'ln.pass.railway.percap', 'ln.city', 'permafrost',
> 'ln.indoutput.fuel.p0.pc', 'ln.phone1995', '(Intercept)'), raw=TRUE)
>
>
>
> Then, R returns:
>
>
>
> Warning message:
> Negative parameter variances.
> Model is probably underidentified.
> in: sem.default(ram = ram, S = S, N = N, param.names = pars,
> var.names = vars, :
>
>
>
> Obtained summary:
>
>
>
> summary(model.3eq.33.estim)
>
>
>
> Model fit to raw moment matrix.
>
> Model Chisquare = 398.81 Df = 24 Pr(>Chisq) = 0
>
> Goodness-of-fit index = 0.944
>
> Adjusted goodness-of-fit index = 0.60099
>
> RMSEA index = 0.15923 90 % CI: (0.14569, 0.17316)
>
> BIC = 175.29
>
> Normalized Residuals
>
> Min. 1st Qu. Median Mean 3rd Qu. Max.
>
> -0.9120 0.0000 0.0000 0.0268 0.0000 1.5600
>
>
>
> Parameter
> Estimate
> Std Error
> z value
> Pr(>|z|)
>
>
>
>
> beta13
> 0.0151
> 0.0031
> 4.9157
> 0.0000
> ln.grp.phvi.pc
> <---
> ln.inv.p0.pc
>
> beta12
> 0.9983
> 0.4831
> 2.0663
> 0.0388
> ln.grp.phvi.pc
> <---
> ln.migr.new
>
> gamma11
> -0.0268
> 0.0074
> -3.6420
> 0.0003
> ln.grp.phvi.pc
> <---
> ln.grp.corr.pc97
>
> gamma12
> 0.0120
> 0.0174
> 0.6867
> 0.4923
> ln.grp.phvi.pc
> <---
> ln.sh.ind.rawwide
>
> gamma13
> 0.0116
> 0.0062
> 1.8803
> 0.0601
> ln.grp.phvi.pc
> <---
> port
>
> gamma14
> -0.0063
> 0.0058
> -1.0996
> 0.2715
> ln.grp.phvi.pc
> <---
> t12
>
> rho
> 0.7542
> 0.0479
> 15.7449
> 0.0000
> ln.grp.phvi.pc
> <---
> wd.ln.grp.phvi.pc
>
> alpha1
> 0.1488
> 0.0667
> 2.2323
> 0.0256
> ln.grp.phvi.pc
> <---
> (Intercept)
>
> gamma25
> 0.0027
> 0.0013
> 2.1277
> 0.0334
> ln.migr.new
> <---
> ln.income.pc.fcb97
>
> gamma26
> 0.0038
> 0.0012
> 3.1503
> 0.0016
> ln.migr.new
> <---
> ln.unempl.level97
>
> gamma27
> 0.0003
> 0.0000
> 9.7510
> 0.0000
> ln.migr.new
> <---
> avertemp.jan.cs
>
> gamma28
> -0.0028
> 0.0003
> -9.6679
> 0.0000
> ln.migr.new
> <---
> ln.pop.gr.89.26mbe
>
> gamma29
> 0.0020
> 0.0009
> 2.2828
> 0.0224
> ln.migr.new
> <---
> ln.pass.railway.percap
>
> gamma210
> 0.0030
> 0.0004
> 8.2058
> 0.0000
> ln.migr.new
> <---
> ln.city
>
> alpha2
> -0.0244
> 0.0041
> -5.9237
> 0.0000
> ln.migr.new
> <---
> (Intercept)
>
> gamma31
> 0.7727
> 0.1070
> 7.2237
> 0.0000
> ln.inv.p0.pc
> <---
> ln.grp.corr.pc97
>
> gamma37
> 0.0123
> 0.0063
> 1.9596
> 0.0500
> ln.inv.p0.pc
> <---
> avertemp.jan.cs
>
> gamma311
> 0.1208
> 0.1118
> 1.0799
> 0.2802
> ln.inv.p0.pc
> <---
> permafrost
>
> gamma312
> 0.1805
> 0.0246
> 7.3497
> 0.0000
> ln.inv.p0.pc
> <---
> ln.indoutput.fuel.p0.pc
>
> gamma313
> 0.3888
> 0.1360
> 2.8591
> 0.0042
> ln.inv.p0.pc
> <---
> ln.phone1995
>
> alpha3
> -1.1867
> 0.8601
> -1.3798
> 0.1677
> ln.inv.p0.pc
> <---
> (Intercept)
>
> sigma11
> 0.0025
> 0.0001
> 17.3196
> 0.0000
> ln.grp.phvi.pc
> <-->
> ln.grp.phvi.pc
>
> sigma12
> 0.0000
> 0.0000
> -1.3206
> 0.1867
> ln.migr.new
> <-->
> ln.grp.phvi.pc
>
> sigma13
> 0.0001
> NaN
> NaN
> NaN
> ln.inv.p0.pc
> <-->
> ln.grp.phvi.pc
>
> sigma22
> 0.0000
> 0.0000
> 17.5513
> 0.0000
> ln.migr.new
> <-->
> ln.migr.new
>
> sigma23
> -0.0001
> 0.0001
> -0.5696
> 0.5690
> ln.inv.p0.pc
> <-->
> ln.migr.new
>
> sigma33
> 0.5859
> 0.0334
> 17.5470
> 0.0000
> ln.inv.p0.pc
> <-->
> ln.inv.p0.pc
>
>
>
>
> Iterations = 205
>
> Aliased parameters: sigma13
>
> Warning message:
>
> NaN in: sqrt(diag(object$cov))
>
>
>
> So R estimates the model with NaN for sigma13: I'm not sure
> what it means... As you can see, maximum is occurred at a
> negative value for sigma13. At the same time one can check
> for rank condition - if I'm not mistaken, all three equations
> are exact.
>
>
>
> After that, I estimate another specification, for example,
> including 'ln.postgrad.students.pc.be' (post-graduate
> students) instead of 't12' (dummy for poor and depressed
> regions), so that these two specifications differ only in one
> exogenous variable. I proceed in the same manner as before
> (including 'ln.postgrad.students.pc.be' instead of 't12' in
> specifying the model, computing raw moments and declaring new
> var to be fixed exogenous). So, I do not provide sem's code
> for the sake of space saving.
>
> Now R estimates the system "correctly", without any warning messages:
>
>
>
> Model fit to raw moment matrix.
>
> Model Chisquare = 405.73 Df = 24 Pr(>Chisq) = 0
>
> Goodness-of-fit index = 0.94308
>
> Adjusted goodness-of-fit index = 0.59441
>
> RMSEA index = 0.16069 90 % CI: (0.14715, 0.17462)
>
> BIC = 182.21
>
> Normalized Residuals
>
> Min. 1st Qu. Median Mean 3rd Qu. Max.
>
> -0.9210 0.0000 0.0000 0.0286 0.0000 1.6000
>
>
>
> Parameter
> Estimate
> Std Error
> z value
> Pr(>|z|)
>
>
>
>
> beta13
> 0.0157
> 0.0036
> 4.3162
> 0.0000
> ln.grp.phvi.pc
> <---
> ln.inv.p0.pc
>
> beta12
> 0.9143
> 0.5337
> 1.7132
> 0.0867
> ln.grp.phvi.pc
> <---
> ln.migr.new
>
> gamma11
> -0.0268
> 0.0068
> -3.9642
> 0.0001
> ln.grp.phvi.pc
> <---
> ln.grp.corr.pc97
>
> gamma12
> 0.0221
> 0.0171
> 1.2919
> 0.1964
> ln.grp.phvi.pc
> <---
> ln.sh.ind.rawwide
>
> gamma13
> 0.0141
> 0.0069
> 2.0499
> 0.0404
> ln.grp.phvi.pc
> <---
> port
>
> gamma14
> 0.0061
> 0.0032
> 1.8769
> 0.0605
> ln.grp.phvi.pc
> <---
> ln.postgrad.students.pc.be
>
> rho
> 0.7483
> 0.0480
> 15.5823
> 0.0000
> ln.grp.phvi.pc
> <---
> wd.ln.grp.phvi.pc
>
> alpha1
> 0.1433
> 0.0591
> 2.4256
> 0.0153
> ln.grp.phvi.pc
> <---
> (Intercept)
>
> gamma25
> 0.0026
> 0.0014
> 1.8682
> 0.0617
> ln.migr.new
> <---
> ln.income.pc.fcb97
>
> gamma26
> 0.0038
> 0.0013
> 2.8600
> 0.0042
> ln.migr.new
> <---
> ln.unempl.level97
>
> gamma27
> 0.0003
> 0.0000
> 9.3817
> 0.0000
> ln.migr.new
> <---
> avertemp.jan.cs
>
> gamma28
> -0.0028
> 0.0003
> -9.5709
> 0.0000
> ln.migr.new
> <---
> ln.pop.gr.89.26mbe
>
> gamma29
> 0.0021
> 0.0010
> 2.1621
> 0.0306
> ln.migr.new
> <---
> ln.pass.railway.percap
>
> gamma210
> 0.0030
> 0.0004
> 7.9666
> 0.0000
> ln.migr.new
> <---
> ln.city
>
> alpha2
> -0.0242
> 0.0042
> -5.7363
> 0.0000
> ln.migr.new
> <---
> (Intercept)
>
> gamma31
> 0.7733
> 0.1096
> 7.0574
> 0.0000
> ln.inv.p0.pc
> <---
> ln.grp.corr.pc97
>
> gamma37
> 0.0123
> 0.0067
> 1.8285
> 0.0675
> ln.inv.p0.pc
> <---
> avertemp.jan.cs
>
> gamma311
> 0.1209
> 0.1201
> 1.0065
> 0.3142
> ln.inv.p0.pc
> <---
> permafrost
>
> gamma312
> 0.1806
> 0.0250
> 7.2323
> 0.0000
> ln.inv.p0.pc
> <---
> ln.indoutput.fuel.p0.pc
>
> gamma313
> 0.3879
> 0.1379
> 2.8129
> 0.0049
> ln.inv.p0.pc
> <---
> ln.phone1995
>
> alpha3
> -1.1881
> 0.8754
> -1.3572
> 0.1747
> ln.inv.p0.pc
> <---
> (Intercept)
>
> sigma11
> 0.0025
> 0.0001
> 17.2039
> 0.0000
> ln.grp.phvi.pc
> <-->
> ln.grp.phvi.pc
>
> sigma12
> 0.0000
> 0.0000
> -1.1470
> 0.2514
> ln.migr.new
> <-->
> ln.grp.phvi.pc
>
> sigma13
> -0.0001
> 0.0010
> -0.0820
> 0.9346
> ln.inv.p0.pc
> <-->
> ln.grp.phvi.pc
>
> sigma22
> 0.0000
> 0.0000
> 17.5416
> 0.0000
> ln.migr.new
> <-->
> ln.migr.new
>
> sigma23
> -0.0001
> 0.0002
> -0.4366
> 0.6624
> ln.inv.p0.pc
> <-->
> ln.migr.new
>
> sigma33
> 0.5859
> 0.0334
> 17.5370
> 0.0000
> ln.inv.p0.pc
> <-->
> ln.inv.p0.pc
>
>
> Iterations = 137
>
>
>
> In fact, I've estimated about 30 specifications trying to
> experiment with different additional explanatory variables in
> the first equation, and about two thirds of 30 are turned out
> to be estimated with that warning message (Negative parameter
> variances with NaN for sigma's variances). I've not been
> clarified yet with the final specification... though I am not
> sure that's the case. Perhaps, I do something wrong
> concerning specification of error covariance structure.
>
>
>
> Do you have any idea? I'll be pleased by any suggestion.
>
>
>
> Sorry to trouble you.
>
> Thanks in advance,
>
>
>
> Denis Fomchenko
>
> Research fellow
>
> Department for Economic Development Problems
>
> Institute for the Economy in Transition
>
> 5, Gazetny lane, Moscow 125993, Russia
>
> e-mail: fomchenko@iet.ru
>
> http://www.iet.ru
>
>
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Tue Jul 18 06:04:28 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 18 Jul 2006 - 08:21:37 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.