-------------------------------------------------------------------------------
help for taxpuf27
-------------------------------------------------------------------------------
NBER TAXSIM model for federal and state income taxes - Full
-----------------------------------------------------------
Description
-----------
taxpuf27[,full output secondary|interest|long temp]
calculates federal and state income tax liability from a transformed
version of the SOI public use file. Where the Stata procedure taxsim32
uses a few input variables likely to be available in a survey, taxpuf27
uses all the data available in the public use files, about 200 values per
taxpayer.
The TAXSIM version of the public use file is documented elsewhere, but
includes variables named data1 through data210 for various income,
deduction, and demographic characteristics. data100 is the taxpayer id
variable, data11 is wages, etc. A complete list is at
http://www.nber.org/taxsim-ndx.txt.
One of the TAXSIM PUF files must be in the workspace before calling
taxpuf27. The program creates a file taxsim_out.dta with values for the
various liabilities and marginal tax rates, then restores your Stata
workspace. The two files can be merged for further analysis.
Here is an example of a complete job to calculate year 2000 tax
liabilities with taxsim and compare them to taxpayer reported
liabilities:
. use /home/data/soi/taxsim/dta/s2000
. taxpuf27
. generate fiitax = max(0,c1)
. reg fiitax data16 [pw=data1]
This loads the year 2000 2% subset, calculates taxes, merges taxsim
output with the original data, truncates tax liability at zero (to match
SOI conventions), and regresses the TAXSIM calculated value of federal
liability on the taxpayer reported value using probability weights. You
should see an R-squared value above .99.
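The listing above does not show the merge step that the description
mentions. A minimal sketch of the whole job as a do-file follows. It
assumes that taxsim_out.dta identifies each taxpayer by a variable named
data100 and holds one row per taxpayer. That key name is an assumption,
not something this help file states, so check it with "describe using
taxsim_out" before relying on it.

```stata
* Sketch only: runs on the NBER Unix cluster, where taxpuf27 is installed.
use /home/data/soi/taxsim/dta/s2000, clear
taxpuf27

* ASSUMPTION: the output file carries the taxpayer id as data100,
* one row per taxpayer. Verify with: describe using taxsim_out
merge 1:1 data100 using taxsim_out
assert _merge == 3      // every taxpayer should match exactly once
drop _merge

* Truncate calculated liability at zero to match SOI conventions.
generate fiitax = max(0, c1)

* Calculated liability on reported liability, probability weighted.
regress fiitax data16 [pw=data1]
```

If the assert fails, the key variable or the row structure of the output
file differs from what is assumed here.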
The law used is for the year specified in data103. It can be set to any
value from 1960 to 2018; however, state tax will be zero for any year
outside 1977-2018. When calculating tax liabilities for alternate years,
some variables may be missing; these are silently set to zero.
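As a hedged illustration of an alternate-year calculation, the sketch
below applies 2010 law to the year 2000 sample by overwriting data103
before the call. The choice of 2010 is arbitrary. Note that this changes
only the law year: dollar amounts in the file are left as they are.

```stata
* Sketch only: runs on the NBER Unix cluster.
use /home/data/soi/taxsim/dta/s2000, clear

* data103 selects the law year; any value from 1960 to 2018 is accepted,
* but state tax is returned as zero outside 1977-2018.
replace data103 = 2010

taxpuf27
```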
The tax calculator itself is the same FORTRAN program that the NBER has
been updating annually since 1974. This interface converts the data to
ASCII and executes the tax calculator against it, then reads the output of
the tax calculator and converts it to a Stata dataset. The full tax
calculator is available only while logged onto the NBER Unix cluster, not
via the Internet.
Data
----
Input files are available for all years from 1960 through 2018, except
for 1961, 1963, and 1965. Each year is available as a 2% subset (about
2,000 taxpayers) or a full version. For example, x1999.dta is the full
dataset and s1999.dta is the subset. All files are kept in
/home/data/soi/taxsim/dta.
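The naming scheme lends itself to building the path from a year and a
size prefix. The sketch below is one way to do that. The local macro
names are illustrative, and the existence check is there because 1961,
1963, and 1965 have no file.

```stata
* Sketch only: the directory exists on the NBER Unix cluster.
local year 1999
local size s            // "s" for the 2% subset, "x" for the full file
local puf  /home/data/soi/taxsim/dta/`size'`year'.dta

* Stop with a clear message if no file exists for the requested year.
capture confirm file "`puf'"
if _rc {
    display as error "no TAXSIM PUF file found: `puf'"
    exit 601
}

use "`puf'", clear
```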
Options
-------
output: Specify the name of the output dataset. The default is
taxsim_out.dta in the current directory.
secondary: Calculate marginal tax rates with respect to the secondary
wage earner. The default is a weighted average of the primary and
secondary wage earners.
interest: Calculate marginal tax rates with respect to interest income.
long: Calculate marginal tax rates with respect to long term gains.
temp: Save temporary files to disk.
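A sketch of how the marginal rate options might be used in sequence
follows. It assumes that the options are given as bare keywords, as the
syntax line suggests, and that each run writes to the default
taxsim_out.dta, replacing any earlier copy. The file names
mtr_secondary.dta and mtr_interest.dta are hypothetical. The output
option is left out because its exact argument form is not shown in this
help file.

```stata
* Sketch only: runs on the NBER Unix cluster.
use /home/data/soi/taxsim/dta/s2000, clear

* Marginal rates with respect to the secondary wage earner.
taxpuf27, secondary
copy taxsim_out.dta mtr_secondary.dta, replace

* Marginal rates with respect to interest income.
taxpuf27, interest
copy taxsim_out.dta mtr_interest.dta, replace
```

Copying the default output between runs keeps both sets of rates
available for a later merge.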
Notes
-----
Please read all of the material below.
Dollar amounts are rounded to the nearest dollar before transmission to
the calculator, and calculated amounts are similarly treated.
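To see what whole-dollar rounding does to an input value, the
self-contained sketch below applies Stata's round() to two made-up
amounts. It illustrates the effect only and is not the interface's own
code. The variable names are hypothetical.

```stata
* Self-contained illustration with made-up values.
clear
input double wages
1234.49
1234.51
end

* Round to the nearest dollar, as is done before transmission.
generate double wages_sent = round(wages)
list
```

Differences of under a dollar between reported and calculated amounts
can therefore come from rounding alone.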
A general description of Taxsim is given in
http://www.nber.org/taxsim/feenberg-coutts.pdf.
More information about the data collection at NBER is given in
http://www.nber.org/taxsim-notes.html.
A variable list is given in http://www.nber.org/taxsim-ndx.txt.
Daniel Feenberg
feenberg@nber.org
617-863-0343
Online: help for taxpuf