{smcl}
{* 3 August 2012}
{hline}
help for {it:{hi:tsls}}
{hline}
{title: Fast and Small 2SLS with FE, IV and Clustered SE}
{title:Description}
{pstd}
{cmdab:tsls} {depvar} {indepvars}
[{cmd:(}{it:varlist2} {cmd:=} {it:varlist_iv}{cmd:)}]
{cmd:,} {cmdab:fe(panelid)}
[
{cmdab:a:reg}
{cmdab:c:luster(clusterid)}
{cmdab:d:emean}
{cmdab:r:eplace}]
{pstd}This procedure performs two-stage least squares with fixed effects,
instrumental variables and clustered standard errors. While it does not cover
all the capabilities of {cmd:xtivreg2} or {cmd:ivregress}, it is memory
efficient and many times faster. Coefficients and standard errors are
unaffected. It is intended for datasets with hundreds of millions of
observations and hundreds of variables, and for users with time for a bit of
care and preparation. {p_end}
{title:Options}
{pstd}{opt areg} Use {cmd:areg} instead of {cmd:regress} for the second-stage
regression, absorbing {it:panelid} with means calculated on the fly.
{opt fe(panelid)} must also be specified. This option is incompatible with
{opt demean} and {opt replace}, and makes them unnecessary; see the notes
below. Standard errors are corrected to match
{cmd:xtivreg}. {p_end}
{pstd}{opt cluster(clusterid)} Cluster standard errors by {it:clusterid},
which may be different from {it:panelid}.
{p_end}
{pstd}{opt demean} Demean the variables by {opt fe(panelid)} before running
the regression. This option is incompatible with {opt areg}, which makes it
unnecessary. If {opt replace} is also specified, the demeaning is done in
place and the original data are overwritten, which reduces the memory load.
If you have multiple regressions with overlapping variables, it is efficient
to include all your variables in an initial regression with {opt demean} and
{opt replace}, and then run subsequent regressions with only
{opt fe(panelid)}. The first regression drops rows with missing data, so
subsequent regressions use the same subsample. Note that if you add a
variable that has not been demeaned to one of the subsequent regressions,
there will be no error message but the result will be wrong. {p_end}
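{pstd}A minimal sketch of that workflow follows. The variable names
({cmd:y1}, {cmd:y2}, {cmd:x1}, {cmd:x2}, {cmd:z1}, {cmd:z2}, {cmd:panelid},
{cmd:clusterid}) are hypothetical. The first call lists every variable needed
later so that all of them are demeaned in place; the later calls reuse the
demeaned data and pass only {opt fe(panelid)}. {p_end}
{phang2}{cmd:. preserve}{p_end}
{phang2}{cmd:. tsls y1 x1 x2 (y2 = z1 z2), fe(panelid) demean replace}{p_end}
{phang2}{cmd:. tsls y1 x1 (y2 = z1), fe(panelid) cluster(clusterid)}{p_end}
{phang2}{cmd:. tsls y1 x1 x2 y2, fe(panelid)}{p_end}
{phang2}{cmd:. restore}{p_end}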
{pstd}{opt fe(panelid)} Required. Specifies the variable identifying panel
units. If neither {opt demean} nor {opt areg} is specified, this only affects
the degrees of freedom. {p_end}
{pstd}{opt replace} Used with {opt demean}. Replaces the variables listed in
the regression with their deviations from panel-unit means. {p_end}
{title:Examples}
{pstd}Fixed effects absorbed with {opt areg} when memory is tight, with
clustered standard errors. This does not modify the data. {p_end}
{phang2} {cmd:. tsls y1 y2,areg fe(panelid) cluster(clusterid) } {p_end}
{pstd} Fixed effects with an instrumental variable. The original data are overwritten, so {cmd:preserve} them first. {p_end}
{phang2}{cmd:. preserve} {p_end}
{phang2}{cmd:. tsls y1 (y2 = z1), demean fe(panelid) replace} {p_end}
{pstd} Add clustered standard errors but use the previously demeaned data {p_end}
{phang2} {cmd:. tsls y1 (y2=z1), fe(panelid) cluster(clusterid) } {p_end}
{pstd} Drop the IV procedure, still using demeaned data {p_end}
{phang2} {cmd:. tsls y1 y2,fe(panelid) }{p_end}
{pstd} Check the IV result against {cmd:xtivreg2} {p_end}
{phang2}{cmd:. restore} {p_end}
{phang2} {cmd:. xtivreg2 y1 (y2 = z1), fe ivar(panelid) cluster(clusterid)} {p_end}
{title:Notes}
{pstd} If a regression that expects demeaned data refers to any variable
that has not been demeaned, the result will be incorrect. Hence
the order of commands in the example.{p_end}
{pstd} Variables listed in {it:varlist2} and {it:varlist_iv} must not
overlap with any variables listed among {it:indepvars}. {p_end}
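{pstd} If a further variable is needed after the data have been demeaned in
place, one possible workaround is to demean it by hand first. This is only a
sketch and not a feature of {cmd:tsls}: {cmd:x3} and {cmd:x3_mean} are
hypothetical names, and it assumes {cmd:x3} has no missing values in the rows
that remain, so that its panel means are taken over the same subsample as the
other variables. {p_end}
{phang2}{cmd:. recast double x3}{p_end}
{phang2}{cmd:. bysort panelid: egen double x3_mean = mean(x3)}{p_end}
{phang2}{cmd:. replace x3 = x3 - x3_mean}{p_end}
{phang2}{cmd:. drop x3_mean}{p_end}
{phang2}{cmd:. tsls y1 x3 (y2 = z1), fe(panelid)}{p_end}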
{pstd}