Re: [R] Panel data - replicating Stata's xtpcse in R

From: Achim Zeileis <>
Date: Thu, 07 Apr 2011 21:56:26 +0200 (CEST)

On Thu, 7 Apr 2011, Florian Markowetz wrote:

> Dear list,
> I am trying to replicate an econometrics study that was originally done in Stata (Blanton and Blanton. 2009. A Sectoral Analysis of Human Rights and FDI: Does Industry Type Matter? International Studies Quarterly 53 (2): 469-493). The model I am trying to replicate is given in Stata as
> xtpcse total_FDI lag_total ciri human_cap worker_rts polity_4 market
> income econ_growth log_trade fix_dollar fixed_xr xr_fluct lab_growth
> english, pairwise corr(ar1)
> According to the paper, this is an OLS regression with panel corrected
> standard errors including a lagged dependent variable (lag_total is
> total_FDI t-1) and controlling first order correlations within each
> panel (corr(ar1)).

I'm not sure about the Stata command (because I haven't got Stata installed myself) and how it translates to R. Other people might know better.

From the verbal description "OLS plus panel-corrected standard errors" I would have expected that the coefficients could be estimated by lm(), but that does not seem to be the case. So it does not seem to be _O_LS then. Did you check that the Stata command produces the same output as indicated in the paper? (Maybe some data preprocessing is necessary...?)
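One possible reason for the gap, stated as an assumption rather than something checked against Stata: as far as I understand the Stata manual, xtpcse estimates the coefficients by OLS or, when an autocorrelation structure is requested, by Prais-Winsten regression. With corr(ar1) the coefficients would then come from a Prais-Winsten-type transformation with a common rho and not from plain lm(). A rough base-R sketch on simulated data follows; all names are hypothetical and the rho estimator is just one simple choice, not necessarily Stata's default.

```r
## Simulated balanced panel with AR(1) errors (hypothetical data, not the FDI data)
set.seed(1)
n_units <- 20; n_time <- 30; rho_true <- 0.6
panel <- expand.grid(time = seq_len(n_time), unit = seq_len(n_units))  # sorted by unit, then time
panel$x <- rnorm(nrow(panel))
panel$e <- unlist(lapply(seq_len(n_units), function(i)
  as.numeric(arima.sim(list(ar = rho_true), n = n_time))))
panel$y <- 1 + 2 * panel$x + panel$e

## Lag a vector within each unit (first observation of each unit becomes NA)
lag_within <- function(v, g) ave(v, g, FUN = function(z) c(NA, z[-length(z)]))

## Step 1: pooled OLS and its residuals
ols <- lm(y ~ x, data = panel)
res <- resid(ols)

## Step 2: common rho from regressing residuals on their within-unit lag
res_lag <- lag_within(res, panel$unit)
rho_hat <- sum(res * res_lag, na.rm = TRUE) / sum(res_lag^2, na.rm = TRUE)

## Step 3: Prais-Winsten transformation, keeping the first observation of each unit
pw_transform <- function(v, g, rho) {
  first <- !duplicated(g)
  out <- v - rho * lag_within(v, g)
  out[first] <- sqrt(1 - rho^2) * v[first]
  out
}
X_star <- apply(model.matrix(ols), 2, pw_transform, g = panel$unit, rho = rho_hat)
y_star <- pw_transform(panel$y, panel$unit, rho_hat)

## Step 4: OLS on the transformed data
pw <- lm.fit(X_star, y_star)
rbind(ols = coef(ols), prais_winsten = pw$coefficients)
```

This is a single pass. An iterated version would re-estimate rho from the new residuals until it settles, and panel-corrected standard errors would still have to be computed on top of it.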

In any case, I've had success with replicating such results with the "plm" package (see also the accompanying JSS paper), typically using model = "pooling" (i.e., OLS) and then computing the standard errors via vcovBK(). The latter stands for "Beck & Katz", which is what the "pcse" package also implements.
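A minimal sketch of that workflow, using the Grunfeld data that ships with "plm" as a stand-in (not the FDI data from the paper). The choice cluster = "time" is my assumption about the closest match to the Beck & Katz setup, and coeftest() comes from the "lmtest" package.

```r
library(plm)
library(lmtest)

## Stand-in panel data shipped with plm: 10 firms, 20 years
data("Grunfeld", package = "plm")

## Pooled OLS via plm; the coefficients should coincide with lm()
fit_pool <- plm(inv ~ value + capital, data = Grunfeld,
                index = c("firm", "year"), model = "pooling")
fit_lm <- lm(inv ~ value + capital, data = Grunfeld)
all.equal(unname(coef(fit_pool)), unname(coef(fit_lm)))

## Beck & Katz covariance matrix and the resulting coefficient table
vc_bk <- vcovBK(fit_pool, cluster = "time")
coeftest(fit_pool, vcov. = vc_bk)
```

Only the standard errors change here. The point estimates stay those of pooled OLS.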

In a few other cases, I replicated the so-called panel-corrected standard errors via geeglm() from "geepack", using the default corstr = "independence" (which again corresponds to OLS). Other corstr values could be employed.
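A small sketch of the geeglm() route, using the dietox data from "geepack" as a stand-in and a deliberately simple model. With a Gaussian family and corstr = "independence" the point estimates should agree with lm(); the standard errors reported are geepack's robust ones.

```r
library(geepack)

## Stand-in longitudinal data shipped with geepack (pigs measured over time)
data("dietox", package = "geepack")

## GEE with independence working correlation; observations are grouped by Pig
fit_gee <- geeglm(Weight ~ Time, id = Pig, data = dietox,
                  family = gaussian, corstr = "independence")
fit_ols <- lm(Weight ~ Time, data = dietox)

## Same coefficients, different standard errors
cbind(gee = coef(fit_gee), ols = coef(fit_ols))
summary(fit_gee)$coefficients
```

Note that geeglm() expects the rows belonging to one id to be contiguous, so the data should be sorted by the grouping variable first.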

Just as additional information: Many econometricians don't know much about the type of models that "nlme" estimates. Usually, least-squares technology is preferred in econometrics over likelihood-based ideas. Also, other multi-level models are rarely used. If specified in the same way, both approaches often yield similar results. There is a paragraph in the above-mentioned JSS paper on "plm" that discusses (dis)similarities with "nlme".

Finally, a JSS paper on the "pcse" package is also waiting for publication in a special volume...hopefully online next month.

Good luck with the replication!

> The BIG QUESTION is how to replicate this line in R.
> Econometrics is a new field to me, but a bit of searching showed that packages like plm, nlme, pcse should be able to handle this kind of problem. In particular, function gls() uses auto-correlation structure and pcse() corrects the standard errors of the fitted model. Below is some code to show what I have done, and some problems I ran into.
> ## setup and load data from web
> library(foreign)
> library(nlme)
> library(pcse)
> D <- read.dta("")
> D[544,"year"] <- 2005 ## fixing an unexpected NA in the year column
> ## Model formula
> form <- total_FDI ~ lag_total + ciri + human_cap + worker_rts + polity_4 + market_size + income + econ_growth + log_trade + fixed_xr + fix_dollar + xr_fluct + english + lab_growth
> ## Model 1: no auto-correlation
> res1 <- gls(model=form, data=D,correlation=NULL,na.action=na.omit)
> coefficients(res1)
> ## Model 2: with auto-correlation
> corr <- corAR1(.1,~1|c_name)
> res2 <- gls(model=form, data=D,correlation=corr,na.action=na.omit)
> coefficients(res2)
> Now, I know from the paper what the Stata coefficients looked like. For
> example, for lag_total it should be .852 and for market_size .21 (these
> were the two significant ones). The result of Model 1 is closer to this
> than the result of Model 2, but there is still quite a gap.
> The goal is to do OLS on panel data with AR(1) and PCSE - am I on the
> right track here? More specifically:
> Question 1: Auto-correlation - how to specify the parameter 'value' in
> corAR1 (the .1 above is completely arbitrary) - Any other ideas how to
> translate Stata's corr(AR1) into R? (I'm not even completely sure what
> Stata does there and didn't find any details in the online manuals)
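On the 'value' argument: in "nlme", the value passed to corAR1() is only a starting value for the lag-1 correlation, which gls() then estimates, unless fixed = TRUE is set. A small sketch with the Ovary data that ships with "nlme" (a stand-in, not the FDI data); the helper phi() is a hypothetical convenience function.

```r
library(nlme)

## Example model from the nlme documentation, fitted to the Ovary data
fm <- follicles ~ sin(2 * pi * Time) + cos(2 * pi * Time)

## Two different starting values for the AR(1) parameter
fit_a <- gls(fm, data = Ovary, correlation = corAR1(0.1, form = ~ 1 | Mare))
fit_b <- gls(fm, data = Ovary, correlation = corAR1(0.5, form = ~ 1 | Mare))

## Same value, but held fixed instead of estimated
fit_f <- gls(fm, data = Ovary,
             correlation = corAR1(0.1, form = ~ 1 | Mare, fixed = TRUE))

## Extract the AR(1) parameter on its natural scale
phi <- function(fit) unname(coef(fit$modelStruct$corStruct, unconstrained = FALSE))
c(start_0.1 = phi(fit_a), start_0.5 = phi(fit_b), fixed_0.1 = phi(fit_f))
```

The two estimated fits should agree up to optimizer tolerance, while the fixed fit keeps 0.1, so an arbitrary starting value is usually harmless as long as the optimization converges.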
> Question 2: PCSE - the pcse function seems to work on objects of class
> 'lm' only. Any way to use it for gls-objects?
> Any help is greatly appreciated!
> Florian
> --
> Florian Markowetz
> Cancer Research UK
> Cambridge Research Institute
> Li Ka Shing Centre
> Robinson Way, Cambridge, CB2 0RE, UK
> phone: +44 (0) 1223 40 4315
> email:
> web :
> skype: florian.markowetz
Received on Thu 07 Apr 2011 - 20:26:37 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 07 Apr 2011 - 20:50:27 GMT.
