You may be interested in the CPS statistical imputation of capital gains and
itemized deductions, but there are also more mundane issues when using the CPS.
Here is the word from Census on the calculation of taxes in the March CPS. The
numbered questions are mine; the response is below each question.
Daniel Feenberg
From amy.b.ohara@census.gov Thu Feb 14 14:33:45 2008
Date: Thu, 14 Feb 2008 15:33:05 -0500
From: amy.b.ohara
To: feenberg, david.s.johnson
Cc: charles.t.nelson
Subject: Tax model
1) What is FILESTAT for the spouse of a taxpayer?
Married taxpayers will have FILESTAT <= 3 but only one of the spouses will
have tax values.
2) What is DEP-STAT for the spouse of a taxpayer?
In the March 2007 file (tax year 2006), DEP_STAT was blank for the spouse,
but in March 2006 (tax year 2005), it pointed to the spouse with the tax
values. I would prefer to give you the count of exemptions from the unit
formation logic rather than use DEP_STAT.
3) Are the dollar amounts present on all person records, or only the
taxpayer record?
Tax model amounts (e.g., AGI, taxable income) are only on the taxpayer
record.
4) Is the taxpayer record always before the spouse and dependent records?
Not necessarily. Person identifiers (A_LINENO or PPPOS) are already on the
file when the units are created.
5) Where does CAP-GAIN come from?
CAP_GAIN and CAP_LOSS are imputed from IRS SOI public use data.
6) Are the tax amounts calculated from the top-coded incomes?
The internal file is used for the tax calculator and some of the resulting
tax values are topcoded.
7) Are there imputations for deductions?
Yes, itemized deductions are imputed from IRS SOI public use data.
8) It may be possible to create a tax-unit within family value equal to
DEP-STAT for dependents, PPPOS for the taxpayer and A-SPOUSE for the
spouse. For a given tax unit, all three of those values should be the
same. But can I tell which is the taxpayer and which is the spouse from
the person record itself (and not needing to refer to other nearby
records)? The information in the data dictionary doesn't seem to say.
The key is to be able to create the tax-unit variable without reference to
other records, since referring to them would make the work much more
complicated. Once one has the tax-unit id, packages such as SAS or Stata
provide easy procedures for summation over the members of the tax unit.
FILESTAT is a recode of an internal variable called FILEST which can take
values of 1=single, 2=married, 4=head for taxpayers. For records with
FILEST, total exemptions and income are summed. I just looked over the
list of fields that the internet interface of Taxsim uses and I believe the
tax unit setup program generates all the required variables. An internal
extract of all cases where FILEST ne 0 should run. Is it necessary to have
wages, interest, dividends, etc. entered separately? I have a rollup called
TOTINC that covers up to line 22 of the 1040. If we substituted that for
wages and zeroed out the other income sources, would that work?
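
As an illustration of the exchange in item 8, here is a minimal Python sketch of
the two steps discussed: building a tax-unit id from each person record alone,
then summing over the unit, and preparing one row per taxpayer record with TOTINC
standing in for wages. It is only a sketch under stated assumptions. The names
DEP_STAT, PPPOS, A_SPOUSE, FILEST and TOTINC come from the text above; family_id,
has_tax_values, wages and the output keys are hypothetical and are not actual CPS
or Taxsim field names. It also assumes that DEP_STAT and A_SPOUSE are 0 when they
do not apply, and that some flag identifies the record carrying the tax values,
which the correspondence does not confirm.

```python
from collections import defaultdict

# DEP_STAT, PPPOS, A_SPOUSE, FILEST and TOTINC follow the text above.
# 'family_id', 'has_tax_values', 'wages' and the output keys are hypothetical.


def tax_unit_id(person):
    """Tax-unit id within a family, computed from this person's record only."""
    if person["DEP_STAT"] > 0:      # dependent: pointer to the taxpayer
        return person["DEP_STAT"]
    if person["has_tax_values"]:    # taxpayer: own position (assumed flag)
        return person["PPPOS"]
    if person["A_SPOUSE"] > 0:      # spouse without tax values: pointer to taxpayer
        return person["A_SPOUSE"]
    return person["PPPOS"]          # anyone else forms a unit alone


def sum_by_tax_unit(persons, field):
    """Sum one numeric field over the members of each (family, tax unit)."""
    totals = defaultdict(float)
    for p in persons:
        totals[(p["family_id"], tax_unit_id(p))] += p[field]
    return dict(totals)


def calculator_rows(persons):
    """One row per record with FILEST != 0, using TOTINC in place of wages."""
    return [
        {
            "unit": (p["family_id"], tax_unit_id(p)),
            "filing_status": p["FILEST"],
            "wages": p["TOTINC"],   # rollup substituted for wages
            "interest": 0.0,        # other income sources zeroed out
            "dividends": 0.0,
        }
        for p in persons
        if p["FILEST"] != 0
    ]


# Made-up family: a married couple, their child, and an unrelated single adult.
persons = [
    {"family_id": 1, "PPPOS": 1, "A_SPOUSE": 2, "DEP_STAT": 0,
     "has_tax_values": True, "FILEST": 2, "wages": 40000.0, "TOTINC": 61000.0},
    {"family_id": 1, "PPPOS": 2, "A_SPOUSE": 1, "DEP_STAT": 0,
     "has_tax_values": False, "FILEST": 0, "wages": 20000.0, "TOTINC": 0.0},
    {"family_id": 1, "PPPOS": 3, "A_SPOUSE": 0, "DEP_STAT": 1,
     "has_tax_values": False, "FILEST": 0, "wages": 0.0, "TOTINC": 0.0},
    {"family_id": 1, "PPPOS": 4, "A_SPOUSE": 0, "DEP_STAT": 0,
     "has_tax_values": True, "FILEST": 1, "wages": 15000.0, "TOTINC": 15500.0},
]

wage_totals = sum_by_tax_unit(persons, "wages")
rows = calculator_rows(persons)
```

With this made-up family, the couple and their child share one unit id and the
single adult forms a second unit. Whether collapsing all income into the wage
field is adequate depends on how the tax calculator treats the separate income
types, which is the open question in the reply above.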