Dear Dan,
> Here is the problem that has come up repeatedly here:
>
> Users want an insane number of variables in panel estimation. It isn't
> enough to have a separate intercept for each county or hospital in the
> US, or each person in the PSID or NLSY. Stata can handle those directly
> or with some variation of taking out means. Now users want to have that
> plus to interact the dummies with other variables, which multiplies the
> number of variables in the regression, and which I don't know how to do
> without putting all the nuisance variables into the regression
> equation.
> Any suggestions would be appreciated.
I found my notes on this yesterday, and there is a fairly simple
solution which you can use immediately. Say you want to estimate the
model
Y = X*B + Z*G_i
where X are the variables with constant coefficients, and Z are the
variables with different coefficients for each individual i. You can
get the estimate of B by regressing transformed Y on transformed X,
where the transformed variables are the residuals from regressing Y
and each X on Z with individual coefficients (the usual
partialling-out argument). This can be done easily in TSP with
PANEL(BYID), and perhaps it can be done in Stata as well, if people
prefer to use Stata. I programmed a PROC in TSP to automate it; see
the example below. (I'll send grunfeld.txt in a separate email.)
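For anyone who wants to check the idea outside TSP, here is a minimal Python sketch of the same two-step calculation. It is only an illustration under simplifying assumptions: a single X variable, Z consisting of a constant and one variable k, and made-up noise-free data, so the true coefficients should be recovered up to rounding. The names fit_line and coefi_sketch and the data are hypothetical and are not part of the TSP program below.

```python
# Illustrative sketch only: one X variable, Z = (constant, k).

def fit_line(k, v):
    """OLS of v on a constant and k for one individual.
    Returns (intercept, slope, residuals)."""
    n = len(k)
    k_bar = sum(k) / n
    v_bar = sum(v) / n
    sxx = sum((ki - k_bar) ** 2 for ki in k)
    sxy = sum((ki - k_bar) * (vi - v_bar) for ki, vi in zip(k, v))
    slope = sxy / sxx
    intercept = v_bar - slope * k_bar
    resid = [vi - intercept - slope * ki for ki, vi in zip(k, v)]
    return intercept, slope, resid


def coefi_sketch(groups):
    """groups holds one (y, x, k) triple of equal-length lists per individual.
    Returns (b, gammas) with gammas[i] = (intercept_i, coefficient_i on k)."""
    y_t, x_t, coef_y, coef_x = [], [], [], []
    for y, x, k in groups:
        a_y, g_y, r_y = fit_line(k, y)   # transform y: residuals on Z
        a_x, g_x, r_x = fit_line(k, x)   # transform x: residuals on Z
        y_t.extend(r_y)
        x_t.extend(r_x)
        coef_y.append((a_y, g_y))
        coef_x.append((a_x, g_x))
    # Pooled OLS of transformed y on transformed x (no constant).
    b = sum(xi * yi for xi, yi in zip(x_t, y_t)) / sum(xi * xi for xi in x_t)
    # Nuisance coefficients: G_i = g_y,i - g_x,i * b
    gammas = [(ay - ax * b, gy - gx * b)
              for (ay, gy), (ax, gx) in zip(coef_y, coef_x)]
    return b, gammas


# Made-up, noise-free data for 3 individuals and 6 periods (assumption).
TRUE_B = 0.5
TRUE_G = [(1.0, 2.0), (-3.0, 0.1), (4.0, -1.5)]
groups = []
for i, (a_i, g_i) in enumerate(TRUE_G):
    k = [t + 0.5 * i for t in range(6)]
    x = [float((t * t + 3 * i) % 7) for t in range(6)]
    y = [TRUE_B * xv + a_i + g_i * kv for xv, kv in zip(x, k)]
    groups.append((y, x, k))

b_hat, g_hat = coefi_sketch(groups)
print(b_hat)   # estimate of B
print(g_hat)   # estimates of (a_i, g_i) for each individual
```

With real data there is an error term, so the estimates will not match any "true" values exactly, and the standard errors from the second-step regression need a degrees-of-freedom correction for the nuisance coefficients.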
Sincerely,
Clint
--------------
options double crt;
name gcoefi 'coefficients varying by i';
? by Clint Cummins 4/06
? Data Source: Grunfeld (1958)
? Description: Panel Data, 10 U.S. firms over 20 years, 1935-1954.
? Variables:
doc FN 'Firm Number';
doc YR 'Year 1935-1954';
doc I 'Annual real gross investment';
doc F 'Real value of the firm (shares outstanding)';
doc K 'Real value of the capital stock';
list vars FN YR I F K;
const n,10 t,20 ystart,1935;
set nt=n*t;
smpl 1,nt;
read(file='grunfeld.txt') vars;
freq(panel,n=n,t=t,id=FN,time=YR,start=ystart) a;
? example with model:
? I = F*b + a_i + K*g_i
title 'direct results with OLSQ';
dummy fn;
dot 1-10;
k. = k*fn.;
enddot;
olsq i f fn1-fn10 k1-k10;
title 'results with transformed variables using panel(byid)';
? Method:
? Regress dependent variable and all RHS variables with nonvarying
? coefficients on the other variables, using panel(byid).
? The transformed variables are the residuals from these byid
? regressions.
? In the example below, I is the dependent variable, and F is the
? only RHS variable with a constant coefficient.
? Then regress using the transformed variables, and make a d.f.
? correction to the SEs to reflect the extra estimated coefficients.
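? Worked arithmetic for this example (10 firms, 2 byid coefficients
? per firm, 200 observations, 1 coefficient in the final OLSQ):
?   uncorrected d.f. = 200 - 1        = 199
?   corrected d.f.   = 200 - 1 - 2*10 = 179
? so the variances are scaled by 199/179 (about 1.112), and the SEs
? by its square root (about 1.054).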
panel(noall,byid,silent) i c k;
i_t = @resi;
rename @coefi g_i;
panel(noall,byid,silent) f c k;
f_t = @resi;
rename @coefi g_f;
olsq i_t f_t;
title 'd.f. correction';
mat v = @vcov*(@nob-1)/(@nob-1-2*10);
tstats(names=@rnms) @coef v;
title 'nuisance coefficients';
mat gamma = g_i - g_f*@coef;
print gamma;
? Automated version, with PROC COEFI
list X f;
list Z c k;
Coefi i X Z B G RESI SSRI LOGLI 1;
print B G;
print SSRI LOGLI;
Proc Coefi Y X Z B G RESI SSRI LOGLI IFPRINT;
? Estimate the model Y = X*B + Z*G_i
? X and Z are lists of variables
? (Assumes FREQ(PANEL) is in effect)
? Also assumes no missing data.
? This version assumes that B are the primary coefficients of
? interest, so SEs for the G_i are not computed, although they
? could be.
? by Clint Cummins 4/06
? Create transformed variables and save coefs
local y_t g_y X_t;
panel(noall,byid,silent) y Z;
y_t = @resi;
rename @coefi g_y;
dot X;
local ._t g_.;
panel(noall,byid,silent) . Z;
._t = @resi;
rename @coefi g_.;
enddot;
local nz_ni;
? @AICI = -@LOGLI + NZ*NI, so the number of nuisance coefficients is:
set nz_ni = @AICI + @LOGLI;
list(suffix=_t) X_t X;
? Regression with transformed variables
olsq(silent) y_t X_t;
mat B = @COEF;
RESI = @RES;
set SSRI = @SSR;
set LOGLI = @LOGL;
? d.f. correction
if (IFPRINT); then; do;
local v;
mat v = @vcov*(@nob-@ncoef)/(@nob-@ncoef-nz_ni);
tstats(names=X) @coef v;
enddo;
? nuisance coefficients
mat G = g_y;
dot(index=j) X;
mat G = G - g_.*B(j);
enddot;
Endproc;