Preliminary Estimates of the Number of U.S. Workers Using a New Methodology for Assigning Geographic and Demographic Information in Administrative Data Records
Research and Statistics Note No. 2024-02 (released December 2024)
Michael Compson is with the Office of Research, Evaluation, and Statistics, Office of Retirement and Disability Policy, Social Security Administration.
Acknowledgments: I would like to thank Pat Purcell, Richard Chard, Glenn Springstead, Gayle Reznik, Safaa Amer, Bill Piet, and Sven Sinclair for comments on the note; Ben Pitkin and Jessie Dalrymple for editorial assistance; and Lewis Gaul for programming assistance. I dedicate this note to Attiat Ott, Jyh-Horng Lin, Gladstone Hutchinson, Bonnie Orcutt, Janice Yee, and Ute Schumacher from Clark University; Ron Durst, Steve Koenig, and Pat Sullivan from the Economic Research Service at the Department of Agriculture; John Kitchen, Mike Springer, Rachel Cononi, and John Hamber from the Department of the Treasury; Greg Diez, John Hennessey, Cherice Jefferies, Angela Harper, Theresa Wolf, John Qatsha, Paul Davies, Bill Piet, Russ Hudson, Hansa Patel, and Sam Foster from the Social Security Administration; and all of the many others who have helped me throughout my career.
Contents of this publication are not copyrighted; any items may be reprinted, but citation is requested. The findings and conclusions presented in this note are those of the author and do not necessarily represent the views of the Social Security Administration.
Introduction
Abbreviation | Term |
---|---|
CWHS | Continuous Work History Sample |
IRS | Internal Revenue Service |
MGD | Master Geographic and Demographic |
OCACT | Office of the Chief Actuary |
ORES | Office of Research, Evaluation, and Statistics |
SCC | state and county code |
SSA | Social Security Administration |
SSN | Social Security number |
The Office of Research, Evaluation, and Statistics (ORES) in the Social Security Administration (SSA) is developing a new methodology to generate estimates of U.S. employment and earnings for two of its annual statistical publications: the Annual Statistical Supplement to the Social Security Bulletin (hereafter, the Annual Statistical Supplement) and Earnings and Employment Data for Workers Covered Under Social Security and Medicare, by State and County (hereafter, Earnings and Employment).1 The new methodology will enable ORES to use a vastly larger sample of workers than is allowed by the methodology currently used to generate those estimates. This research and statistics note follows two Social Security Bulletin articles that detail the complex multistep process of developing the new methodology. Compson (2022) identifies the limitations of the existing methodology and describes the new methodology for assigning state of residence codes and identifying demographic information for the population of workers with tax records for earnings in 2017. The new methodology enables ORES to compile a Master Geographic and Demographic (MGD) data file, which contains information on state of residence, year of birth, and sex for a far greater number of workers—nearly all of those for whom earnings records were filed with the Internal Revenue Service (IRS)—in a given year.2
Compson (2024) describes how ORES applied the new methodology to extend the 2017 MGD file for a full 7-year span (2014–2020) and uses two distinct analyses to assess the new process. The first, a procedural analysis, uses internal SSA audit reports to assess the completeness and accuracy of the new methodology in processing tax records. It focuses on the procedure for assigning a single state and county code (SCC) and identifying demographic information for each worker. For example, it examines the number of records processed, the number of workers represented in those records, and the earnings data sources (IRS Forms W-2, W-2c, and 1040 Schedule SE) for workers in a year in the 2014–2020 period. The second analysis directly compares the results of the MGD process for assigning state of residence codes and identifying worker demographic information with those of the current methodology of the statistical publications. Specifically, the second analysis processes the underlying worker-level microdata using both the existing methodology and the MGD process, allowing a direct comparison of estimates resulting from the two methodologies.
To date, SSA has been unable to use microdata to provide the geographic and demographic characteristics of U.S. workers because the agency's 1-percent Continuous Work History Sample (CWHS) has been the only available data source. By enabling earnings and employment estimates based on administrative microdata for nearly all workers, the MGD files vastly expand labor market research possibilities. This note presents preliminary estimates generated using the MGD process so that federal statistical agencies that collect, analyze, or release U.S. labor market data, and other interested researchers and policy analysts, can assess the results and provide feedback.3
The worker population estimates generated by the MGD process and presented in this note are shown variously by sex, age, state, and type of earnings (wage and salary, self-employment) for the years 2014 through 2021. The MGD estimates are compared with two different benchmarks: estimates generated using the current methodology and published in SSA annual statistical publications, and unpublished estimates prepared independently by SSA's Office of the Chief Actuary (OCACT).
The note also highlights several novel approaches that emerged as ORES developed the MGD process. For example, the MGD file enabled ORES to identify three mutually exclusive earnings-type categories (wage and salary only, self-employment only, and both types in combination) to provide new insights on the U.S. workforce. Additionally, because the new methodology allows ORES to use microdata for the entire population of workers instead of a 1 percent sample, SSA publications will be able to include estimates for small jurisdictions that are currently suppressed to comply with data disclosure restrictions for small sample sizes. For example, the publications currently combine estimates for American Samoa, Guam, the Northern Mariana Islands, and the U.S. Virgin Islands into a single “other outlying areas” category. The MGD process will allow SSA to generate and publish estimates for each territory individually.
This note also shows how maps can be used to present worker counts by state and county and describes the process with which worker counts by state can be revised to reflect newly arriving data. Currently, the state- and county-level worker count estimates in the statistical publications are based on the tax records processed in the calendar year that immediately follows the tax year (that is, the year when the reported income was earned). Once published, those estimates are not updated to reflect any records for that tax year that were not processed until later years. Although those records represent a very small proportion of the tax year records, their exclusion might affect the precision of the estimates. The new methodology will allow ORES to update its estimates based on more complete data.
Background
In 2023, ORES produced the MGD file that contains geographic and demographic data for the population of workers with earnings records for tax year 2021, which were processed in calendar year 2022. ORES has now assembled MGD files for each tax year from 2014 through 2021. This note presents preliminary estimates from those files. I refer to the estimates as preliminary for two reasons. First, the development of the new estimation method is ongoing, and the MGD component focuses solely on geographic and demographic information. The MGD files do not contain any information regarding the type or amount of earnings because the address information is the only data extracted from a worker's tax records to assign state and county of residence. As a result, the MGD files alone cannot be used to determine whether the earnings reported on the individual's tax records are covered or not covered under the Social Security or Medicare programs, nor can they provide the corresponding earnings amounts. Thus, for this note, the estimates are limited to worker counts, which are assumed to approximate the number of all workers subject to the Medicare Hospital Insurance payroll tax, because the Medicare tax is nearly universal for the U.S. workforce.4
Second, the MGD files contain data only for the primary tax year that were processed in the following calendar year. Each year, SSA processes the hundreds of millions of IRS Forms W-2 and W-2c (filed by employers) and the millions of Form 1040 Schedule SEs (filed by the self-employed) it receives from the IRS.5 The primary tax year for the records SSA processed in 2021 was 2020. However, during a calendar year, SSA also receives and processes some records for tax years other than the primary tax year. These nonprimary tax year data are not included in the MGD files.
Table 1 shows the number of primary and nonprimary tax year records processed via the MGD methodology each year from 2015 through 2022. It highlights the predominance of the primary tax year's records in each processing year, even as SSA continues to receive additional records for a given tax year in subsequent processing years. For example, in 2015, SSA processed 259,791,044 records for the primary tax year 2014. SSA also processed 564 records for tax year 2015 that year, showing that employers and self-employed individuals occasionally file tax forms before the end of the earnings year. Earnings records for tax year 2014 continued to arrive at SSA for processing each year thereafter, although the flow rapidly dwindled. The pattern recurs in subsequent tax years.
Tax year and measure | Total records processed | Processed in 2015 | Processed in 2016 | Processed in 2017 | Processed in 2018 | Processed in 2019 | Processed in 2020 | Processed in 2021 | Processed in 2022 | Total nonprimary tax year records processed a |
---|---|---|---|---|---|---|---|---|---|---|
2014 | ||||||||||
Number | 263,933,101 | 259,791,044 | 2,201,919 | 1,511,457 | 258,316 | 98,717 | 26,303 | 27,469 | 17,876 | 4,142,057 |
Percent | 100.00 | 98.43 | 0.83 | 0.57 | 0.10 | 0.04 | 0.01 | 0.01 | 0.01 | 1.57 |
2015 | ||||||||||
Number | 273,432,854 | 564 | 269,436,834 | 2,968,364 | 556,611 | 314,294 | 49,169 | 68,105 | 38,913 | 3,996,020 |
Percent | 100.00 | (L) | 98.54 | 1.09 | 0.20 | 0.11 | 0.02 | 0.02 | 0.01 | 1.46 |
2016 | ||||||||||
Number | 281,673,837 | . . . | 319,278 | 278,488,758 | 1,924,877 | 590,378 | 161,381 | 110,847 | 78,318 | 3,185,079 |
Percent | 100.00 | . . . | 0.11 | 98.87 | 0.68 | 0.21 | 0.06 | 0.04 | 0.03 | 1.13 |
2017 | ||||||||||
Number | 282,182,333 | . . . | . . . | 233,222 | 279,435,723 | 1,737,114 | 404,899 | 243,365 | 128,010 | 2,746,610 |
Percent | 100.00 | . . . | . . . | 0.08 | 99.03 | 0.62 | 0.14 | 0.09 | 0.05 | 0.97 |
2018 | ||||||||||
Number | 287,745,177 | . . . | . . . | . . . | 309,506 | 284,888,320 | 1,712,352 | 550,783 | 284,216 | 2,856,857 |
Percent | 100.00 | . . . | . . . | . . . | 0.11 | 99.01 | 0.60 | 0.19 | 0.10 | 0.99 |
2019 | ||||||||||
Number | 290,015,773 | . . . | . . . | . . . | . . . | 258,985 | 286,651,734 | 2,471,215 | 633,839 | 3,364,039 |
Percent | 100.00 | . . . | . . . | . . . | . . . | 0.09 | 98.84 | 0.85 | 0.22 | 1.16 |
2020 | ||||||||||
Number | 280,036,115 | . . . | . . . | . . . | . . . | . . . | 241,124 | 276,478,907 | 3,316,084 | 3,557,208 |
Percent | 100.00 | . . . | . . . | . . . | . . . | . . . | 0.09 | 98.73 | 1.18 | 1.27 |
2021 | ||||||||||
Number | 291,162,336 | . . . | . . . | . . . | . . . | . . . | . . . | 240,812 | 290,921,524 | 240,812 |
Percent | 100.00 | . . . | . . . | . . . | . . . | . . . | . . . | 0.08 | 99.92 | 0.08 |
2022 | ||||||||||
Number | 253,805 | . . . | . . . | . . . | . . . | . . . | . . . | . . . | 253,805 | 0 |
Percent | 100.00 | . . . | . . . | . . . | . . . | . . . | . . . | . . . | 100.00 | 0.00 |
SOURCE: Author's calculations based on SSA data processing audit reports. |
NOTES: The primary tax year is the year immediately preceding the processing year. Rounded components of percentage distributions do not necessarily sum to 100.00. (L) = less than 0.005; . . . = not applicable. |
a. Subject to change because the number and share of nonprimary tax year records increase from year to year. |
The MGD files containing the primary tax year data currently account for more than 98 percent of all the earnings records that SSA has processed for tax years 2014 to 2022. However, Table 1 also shows that about 24 million nonprimary tax year records are excluded from the MGD files. Depending on the mix of the nonprimary tax year records and the number of “new” workers whose data are not included in the MGD file, the effect of these additional records is likely to be small, compared with the vastly higher number of worker records processed for a given primary tax year. Nevertheless, ORES is currently assessing methodologies to incorporate the nonprimary tax year data into the MGD files.
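The shares reported in Table 1 follow from simple arithmetic. As a minimal sketch (the tax year 2014 counts are copied from Table 1; the function and variable names are illustrative and are not part of the MGD process), the primary tax year share is the count processed in the calendar year after the tax year, divided by all records processed to date for that tax year:

```python
# Records for tax year 2014, by processing year (counts copied from Table 1).
TAX_YEAR = 2014
records_by_processing_year = {
    2015: 259_791_044,
    2016: 2_201_919,
    2017: 1_511_457,
    2018: 258_316,
    2019: 98_717,
    2020: 26_303,
    2021: 27_469,
    2022: 17_876,
}


def primary_and_nonprimary(tax_year, by_year):
    """Split a tax year's records into primary and nonprimary counts.

    The primary processing year is the calendar year right after the tax year.
    """
    primary = by_year.get(tax_year + 1, 0)
    total = sum(by_year.values())
    return primary, total - primary, total


primary, nonprimary, total = primary_and_nonprimary(
    TAX_YEAR, records_by_processing_year
)
primary_share = round(100 * primary / total, 2)
nonprimary_share = round(100 * nonprimary / total, 2)
print(total, nonprimary, primary_share, nonprimary_share)
```

Because records for a tax year keep arriving, rerunning the same calculation after each processing year would raise the nonprimary count and share slightly, which is why the last column of Table 1 is flagged as subject to change.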
Although the figures in this note are preliminary, they highlight the progress to date in developing a new methodology for generating annual employment and earnings estimates. For counts of all workers and of wage and salary workers, the preliminary estimates derived from the MGD files are good proxies for the counts of workers covered under Medicare, as estimated and published in statistical publications both by ORES and, separately, by OCACT. However, the numbers of self-employed individuals estimated using the MGD files differ more substantially from those in the published tables. These greater differences may indicate a problem with the MGD files as currently structured. The greater differences between MGD estimates and published tables in the counts of self-employed individuals also arise when comparing results for individuals with self-employment income only, and those with both wage and salary and self-employment income, in a single year.
The preliminary results also raise other concerns with the MGD files. First, there are reasons to question the accuracy of the MGD file for tax year 2014, the earliest year of data currently available. For that year, an SCC based on the address information in the tax forms could not be assigned for an anomalously high number of job-level records.6 Second, a few records in the tax year 2021 MGD file identify a valid state of residence but an unknown county, despite having an SCC that should identify the county.
Identifying Types of Earnings
SSA statistical publications include tables showing estimates of earnings and employment by type of earnings (wage and salary, self-employment income). The tax records underlying the MGD process (Forms W-2 and W-2c and Schedule SE) enable ORES to assign workers to those categories.7 For a given tax year, a worker's earnings will be reported on one or more tax forms, in one of seven mutually exclusive categories:
- W-2 only
- W-2c only
- Schedule SE only
- W-2 and W-2c
- W-2 and Schedule SE
- W-2c and Schedule SE
- W-2, W-2c, and Schedule SE
Given these data source categories, ORES determines whether a worker had wage and salary earnings or self-employment income. Yet these categories also enable a more detailed earnings-type subcategorization not shown in the ORES published tables: workers with wage and salary earnings only, individuals with self-employment income only, and those with both types of earnings during the year (so-called combination workers). This additional detail offers insights into the U.S. labor market that are not available in the tables SSA currently publishes.
The preliminary estimates derived from the MGD files are based on the number of unique Social Security numbers (SSNs) associated with tax forms that reported earnings for a given year. For this analysis, the estimates are assembled and formatted to replicate selected tables from the Annual Statistical Supplement and Earnings and Employment.
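The mapping from the seven data source categories to the three earnings-type categories, and the counting of unique SSNs, can be sketched as follows. This is an illustration only: the record layout, the placeholder identifiers, and the function names are hypothetical, and the sketch assumes that a Form W-2c on its own indicates wage and salary earnings.

```python
from collections import defaultdict

WAGE_FORMS = {"W-2", "W-2c"}   # filed by employers
SE_FORMS = {"Schedule SE"}     # filed by the self-employed


def earnings_type(forms):
    """Map the set of forms filed for one worker to an earnings-type category."""
    has_wages = bool(forms & WAGE_FORMS)
    has_se = bool(forms & SE_FORMS)
    if has_wages and has_se:
        return "both"
    if has_wages:
        return "wage and salary only"
    if has_se:
        return "self-employment only"
    raise ValueError("no recognized earnings form")


def count_workers(records):
    """Count unique SSNs per earnings-type category from (ssn, form) records."""
    forms_by_ssn = defaultdict(set)
    for ssn, form in records:
        forms_by_ssn[ssn].add(form)
    counts = defaultdict(int)
    for forms in forms_by_ssn.values():
        counts[earnings_type(forms)] += 1
    return dict(counts)


# Made-up records: worker A has two jobs, B files a Schedule SE, C has both types.
sample = [
    ("A", "W-2"), ("A", "W-2"),
    ("B", "Schedule SE"),
    ("C", "W-2"), ("C", "W-2c"), ("C", "Schedule SE"),
]
counts = count_workers(sample)
print(counts)
```

Grouping by SSN before classifying means that a worker with several job-level records is counted once, and the published "wage and salary" and "self-employment" totals can be recovered by adding the "both" count to each single-type count.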
Evaluating the MGD Process Results
The analysis compares MGD-process estimates of the population of workers for 2014 through 2021 against two benchmarks: unpublished estimates that OCACT prepares as inputs for the estimates published in the Annual Report of the Board of Trustees of the Federal Old-Age and Survivors Insurance and Federal Disability Insurance Trust Funds, and the estimates that ORES publishes in the Annual Statistical Supplement and Earnings and Employment. OCACT separately estimates the numbers of workers covered under Social Security and Medicare and distinguishes workers with wage and salary earnings from those with self-employment income. However, OCACT does not prepare worker count estimates by sex or age. Therefore, the MGD-process estimates that use those breakdowns are compared with selected tables from the ORES statistical publications.
I approach the comparison incrementally. The first step involves an analysis of the content of three key data fields in the MGD files: sex, age, and SSN. After removing records with invalid SSNs, or with missing or dubious values in the other data fields, the comparison of the MGD process with the two benchmark estimates proceeds.
I first compare the MGD worker count estimates with the OCACT estimates of Medicare-covered workers. These estimates are shown not only for all workers, wage and salary workers, and self-employed individuals—the same earnings-type categories that are used in the statistical publications—but also for workers with only wage and salary earnings, only self-employment income, and both earnings types—categories that are not found in the statistical publications. Then, I compare MGD worker count estimates with those published in the statistical publications—first, by sex and age; then, by state and other area, including some U.S. territories for which estimates are not currently available in the publications because of data disclosure restrictions. Those restrictions require ORES to suppress data that are based on unweighted sample sizes below a certain threshold. Because the statistical publications base their estimates on a 1 percent sample of workers, the estimates for many small jurisdictions must be suppressed. Therefore, this note also considers the effect of replacing the 1 percent sample with a 10 percent sample of workers on the number of county-level estimates that are suppressed to comply with data nondisclosure rules.8 The note closes with examples of maps and new tables that ORES may add to its statistical publications.
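To see why a larger sample reduces suppression, consider a rough sketch of the rule. Everything specific here is hypothetical: the note does not state the disclosure threshold, so `MIN_SAMPLE_SIZE`, the county names, and the worker counts are made up, and the expected sample size stands in for the actual unweighted count.

```python
def is_suppressed(workers, sample_rate, min_sample_size):
    """Return True when the expected unweighted sample falls below the threshold."""
    expected_sample = workers * sample_rate
    return expected_sample < min_sample_size


# Hypothetical county worker counts and a hypothetical disclosure threshold.
county_workers = {"County A": 200, "County B": 2_500, "County C": 60_000}
MIN_SAMPLE_SIZE = 30

suppressed_1pct = [
    county for county, n in county_workers.items()
    if is_suppressed(n, 0.01, MIN_SAMPLE_SIZE)
]
suppressed_10pct = [
    county for county, n in county_workers.items()
    if is_suppressed(n, 0.10, MIN_SAMPLE_SIZE)
]
print(suppressed_1pct, suppressed_10pct)
```

Under these made-up inputs, two of the three counties would be suppressed with a 1 percent sample but only the smallest one with a 10 percent sample.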
Removing Records with Invalid SSNs
Table 2 presents the number of workers whose earnings records have invalid and valid SSNs in the MGD files for tax years 2014 to 2021. An SSN is deemed valid if it is present in SSA's Numerical Identification System (Numident) administrative data file. The Numident file contains records for all SSNs ever issued. The information is derived from SSA Form SS-5, the application for an SSN, which contains the individual's name, place and date of birth, and sex. The percentage of worker records with an invalid SSN is less than 1 percent in all tax years except 2021, for which it is 1.1 percent. The MGD process cannot assign a year of birth or sex to records for workers with an invalid SSN. Further, those workers will not have earnings data in SSA's Master Earnings File (MEF). Therefore, the tax records for these workers are omitted from the analysis.
Processing year | Primary tax year | Number: total | Number: valid | Number: invalid | Percent: total | Percent: valid | Percent: invalid |
---|---|---|---|---|---|---|---|
2015 | 2014 | 170,260,465 | 168,962,452 | 1,298,013 | 100.00 | 99.24 | 0.76 |
2016 | 2015 | 174,002,077 | 172,610,971 | 1,391,106 | 100.00 | 99.20 | 0.80 |
2017 | 2016 | 176,723,136 | 175,237,389 | 1,485,747 | 100.00 | 99.16 | 0.84 |
2018 | 2017 | 178,863,694 | 177,339,293 | 1,524,401 | 100.00 | 99.15 | 0.85 |
2019 | 2018 | 181,131,038 | 179,553,005 | 1,578,033 | 100.00 | 99.13 | 0.87 |
2020 | 2019 | 182,622,507 | 181,050,599 | 1,571,908 | 100.00 | 99.14 | 0.86 |
2021 | 2020 | 181,232,792 | 179,465,649 | 1,767,143 | 100.00 | 99.02 | 0.98 |
2022 | 2021 | 183,375,419 | 181,305,089 | 2,070,330 | 100.00 | 98.87 | 1.13 |
SOURCE: Author's calculations based on SSA data processing audit reports. |
Sex of Workers with Valid SSNs
Table 3 presents the number of records for workers with valid SSNs by the type of sex identifier shown in the Numident file: men, women, missing, and unknown. Workers whose records indicate a missing or unknown sex identifier represent approximately 0.5 percent of all workers with valid SSNs. ORES' published earnings tables do not include workers with a missing or unknown sex identifier.9 Therefore, these records are removed from the MGD file for this analysis.10
Processing year | Primary tax year | Total | Men | Women | Missing | Unknown | Total excluding "missing" and "unknown" |
---|---|---|---|---|---|---|---|
Number | |||||||
2015 | 2014 | 168,962,452 | 86,658,038 | 81,417,527 | 815,347 | 71,540 | 168,075,565 |
2016 | 2015 | 172,610,971 | 88,566,852 | 83,171,765 | 802,317 | 70,037 | 171,738,617 |
2017 | 2016 | 175,237,389 | 89,771,356 | 84,606,950 | 790,904 | 68,179 | 174,378,306 |
2018 | 2017 | 177,339,293 | 90,785,235 | 85,719,996 | 768,051 | 66,011 | 176,505,231 |
2019 | 2018 | 179,553,005 | 91,783,693 | 86,953,608 | 751,651 | 64,053 | 178,737,301 |
2020 | 2019 | 181,050,599 | 92,354,312 | 87,900,714 | 733,572 | 62,001 | 180,255,026 |
2021 | 2020 | 179,465,649 | 91,436,558 | 87,269,927 | 699,852 | 59,312 | 178,706,485 |
2022 | 2021 | 181,305,089 | 92,377,562 | 88,172,771 | 697,568 | 57,188 | 180,550,333 |
Percent | |||||||
2015 | 2014 | 100.00 | 51.29 | 48.19 | 0.48 | 0.04 | 99.48 |
2016 | 2015 | 100.00 | 51.31 | 48.18 | 0.46 | 0.04 | 99.49 |
2017 | 2016 | 100.00 | 51.23 | 48.28 | 0.45 | 0.04 | 99.51 |
2018 | 2017 | 100.00 | 51.19 | 48.34 | 0.43 | 0.04 | 99.53 |
2019 | 2018 | 100.00 | 51.12 | 48.43 | 0.42 | 0.04 | 99.55 |
2020 | 2019 | 100.00 | 51.01 | 48.55 | 0.41 | 0.03 | 99.56 |
2021 | 2020 | 100.00 | 50.95 | 48.63 | 0.39 | 0.03 | 99.58 |
2022 | 2021 | 100.00 | 50.95 | 48.63 | 0.39 | 0.03 | 99.58 |
SOURCE: Author's calculations based on SSA data processing audit reports. | |||||||
NOTE: Rounded components of percentage distributions do not necessarily sum to 100.00. |
Age of Workers
For the MGD files, age is identified only for those workers whose records have a valid SSN and a male or female sex identifier. Because year of birth is sometimes entered incorrectly or has not been validated in the administrative data files, some records indicate worker ages of 0 (or negative years) and others indicate ages of 100 or more. Validating the year of birth in a worker's record may not occur before the individual applies for benefits. As a result, the administrative file may contain erroneous data for the small number of workers whose year of birth was entered incorrectly. For this analysis, the records for workers whose indicated age in the administrative data was less than 1 or greater than 99 were removed from the MGD files. Table 4 shows the workers' records by age group. The omitted records accounted for approximately 0.12 percent of workers in each tax year.11
Tax year | Total | Age 1–19 | Age 20–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70–79 | Age 80–89 | Age 90–99 | Other (omitted) |
---|---|---|---|---|---|---|---|---|---|---|---|
Number | |||||||||||
2014 | 168,075,565 | 9,354,556 | 37,295,556 | 33,748,958 | 32,850,996 | 32,674,509 | 17,360,603 | 3,857,239 | 575,610 | 156,506 | 201,032 |
2015 | 171,738,617 | 9,787,238 | 38,052,202 | 34,746,451 | 32,884,493 | 33,010,252 | 18,230,604 | 4,060,717 | 600,205 | 162,569 | 203,886 |
2016 | 174,378,306 | 10,128,058 | 38,611,294 | 35,585,092 | 32,867,668 | 32,989,336 | 18,849,674 | 4,348,224 | 619,048 | 169,870 | 210,042 |
2017 | 176,505,231 | 10,344,670 | 38,917,518 | 36,284,382 | 32,997,976 | 32,879,436 | 19,407,851 | 4,676,885 | 630,347 | 172,460 | 193,706 |
2018 | 178,737,301 | 10,552,875 | 39,147,652 | 37,074,020 | 33,092,647 | 32,810,860 | 20,025,230 | 4,988,873 | 657,386 | 178,274 | 209,484 |
2019 | 180,255,026 | 10,680,854 | 39,200,689 | 37,776,540 | 33,197,084 | 32,691,758 | 20,456,250 | 5,189,152 | 668,324 | 182,229 | 212,146 |
2020 | 178,706,485 | 10,028,657 | 38,234,735 | 37,896,190 | 32,922,690 | 32,417,085 | 20,776,419 | 5,359,425 | 680,290 | 183,515 | 207,479 |
2021 | 180,550,333 | 11,356,370 | 38,311,751 | 38,241,081 | 32,973,127 | 32,176,815 | 20,925,598 | 5,438,688 | 693,070 | 197,349 | 236,484 |
Percent | |||||||||||
2014 | 100.00 | 5.57 | 22.19 | 20.08 | 19.55 | 19.44 | 10.33 | 2.29 | 0.34 | 0.09 | 0.12 |
2015 | 100.00 | 5.70 | 22.16 | 20.23 | 19.15 | 19.22 | 10.62 | 2.36 | 0.35 | 0.09 | 0.12 |
2016 | 100.00 | 5.81 | 22.14 | 20.41 | 18.85 | 18.92 | 10.81 | 2.49 | 0.36 | 0.10 | 0.12 |
2017 | 100.00 | 5.86 | 22.05 | 20.56 | 18.70 | 18.63 | 11.00 | 2.65 | 0.36 | 0.10 | 0.11 |
2018 | 100.00 | 5.90 | 21.90 | 20.74 | 18.51 | 18.36 | 11.20 | 2.79 | 0.37 | 0.10 | 0.12 |
2019 | 100.00 | 5.93 | 21.75 | 20.96 | 18.42 | 18.14 | 11.35 | 2.88 | 0.37 | 0.10 | 0.12 |
2020 | 100.00 | 5.61 | 21.40 | 21.21 | 18.42 | 18.14 | 11.63 | 3.00 | 0.38 | 0.10 | 0.12 |
2021 | 100.00 | 6.29 | 21.22 | 21.18 | 18.26 | 17.82 | 11.59 | 3.01 | 0.38 | 0.11 | 0.13 |
SOURCE: Author's calculations based on SSA data processing audit reports. |
NOTES: Omitted records are for workers with indicated ages of younger than 1 or older than 99. Rounded components of percentage distributions do not necessarily sum to 100.00. |
All Adjustments
Table 5 summarizes the adjustments and shows that they remove about 1.5 percent of worker-level records in the MGD files each year (from 1.4 percent for tax year 2014 to 1.7 percent for tax year 2021). Nearly all the increase in removals over time is due to the rising share of records with invalid SSNs, from 0.8 percent for 2014 to 1.1 percent for 2021. About 0.5 percent of records for 2014 were removed because of missing or unknown values for sex, as were about 0.4 percent for 2021. Removing records with outlier age values reduced the number of worker-level records by 0.11 percent to 0.13 percent, depending on the year.
Tax year | Total records at outset of processing | Removed: invalid SSNs | Removed: missing or unknown sex identifier | Removed: age identifier not within 1–99 | Total records removed | Records remaining in file |
---|---|---|---|---|---|---|
Number | ||||||
2014 | 170,260,465 | 1,298,013 | 886,887 | 201,032 | 2,385,932 | 167,874,533 |
2015 | 174,002,077 | 1,391,106 | 872,354 | 203,886 | 2,467,346 | 171,534,731 |
2016 | 176,723,136 | 1,485,747 | 859,083 | 210,042 | 2,554,872 | 174,168,264 |
2017 | 178,863,694 | 1,524,401 | 834,062 | 193,706 | 2,552,169 | 176,311,525 |
2018 | 181,131,038 | 1,578,033 | 815,704 | 209,484 | 2,603,221 | 178,527,817 |
2019 | 182,622,507 | 1,571,908 | 795,573 | 212,146 | 2,579,627 | 180,042,880 |
2020 | 181,232,792 | 1,767,143 | 759,164 | 207,479 | 2,733,786 | 178,499,006 |
2021 | 183,375,419 | 2,070,330 | 754,756 | 236,484 | 3,061,570 | 180,313,849 |
Percent | ||||||
2014 | 100.00 | 0.76 | 0.52 | 0.12 | 1.40 | 98.60 |
2015 | 100.00 | 0.80 | 0.50 | 0.12 | 1.42 | 98.58 |
2016 | 100.00 | 0.84 | 0.49 | 0.12 | 1.45 | 98.55 |
2017 | 100.00 | 0.85 | 0.47 | 0.11 | 1.43 | 98.57 |
2018 | 100.00 | 0.87 | 0.45 | 0.12 | 1.44 | 98.56 |
2019 | 100.00 | 0.86 | 0.44 | 0.12 | 1.41 | 98.59 |
2020 | 100.00 | 0.98 | 0.42 | 0.11 | 1.51 | 98.49 |
2021 | 100.00 | 1.13 | 0.41 | 0.13 | 1.67 | 98.33 |
SOURCE: Author's calculations based on SSA data processing audit reports. | ||||||
NOTE: Rounded percentages do not necessarily sum to subtotals. |
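The three adjustments are applied in sequence, so the Table 5 totals can be rebuilt from the counts in Tables 2 through 4. The sketch below checks the tax year 2014 row and then shows one way the screens might be expressed for a single record. The record keys, the sex codes, and the rule that age equals tax year minus year of birth are assumptions made for illustration; the note does not describe the actual implementation.

```python
# Tax year 2014 counts copied from Tables 2 through 5.
total_at_outset = 170_260_465
invalid_ssn = 1_298_013
missing_or_unknown_sex = 815_347 + 71_540  # "missing" + "unknown" (Table 3)
age_outside_1_to_99 = 201_032

total_removed = invalid_ssn + missing_or_unknown_sex + age_outside_1_to_99
remaining = total_at_outset - total_removed
pct_removed = round(100 * total_removed / total_at_outset, 2)


def keep_record(record, numident_ssns, tax_year):
    """Apply the three screens, in order, to one worker-level record.

    `record` is a dict with hypothetical keys "ssn", "sex", and "birth_year".
    Age is assumed here to be the tax year minus the year of birth.
    """
    if record["ssn"] not in numident_ssns:
        return False  # invalid SSN: not found in the Numident file
    if record["sex"] not in ("M", "F"):
        return False  # missing or unknown sex identifier
    age = tax_year - record["birth_year"]
    return 1 <= age <= 99  # drop outlier ages


print(total_removed, remaining, pct_removed)
```

The order matters for the tallies: a record with an invalid SSN is counted only in the first column, even if its sex or age values would also have failed a later screen.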
Notes on the Benchmark Estimates
The benchmarks with which the preliminary MGD-process estimates are compared in this analysis are the unpublished OCACT estimates of the total Medicare-covered worker population for 2023 and ORES estimates of the worker population by type of earnings, sex, and age from the Annual Statistical Supplement and Earnings and Employment.
Comparing MGD-Process and OCACT Estimates of the Total Worker Population
Table 6 compares the MGD-process and OCACT estimates of the number of workers by type of earnings. The MGD-process estimates for all workers are very similar to the OCACT estimates for Medicare-covered workers. The estimates differ by less than 1 percent in all years, and by less than 0.5 percent for 2015–2021. The percentage differences for all workers with wage and salary earnings are also relatively small, varying only between 0.2 percent and 0.5 percent across the years. However, for all individuals with self-employment earnings, the absolute value of the percentage differences ranges from a low of 3.8 percent for 2021 to as high as 14.0 percent for 2014—much higher than the differences for all workers and all wage and salary workers.
Tax year | MGD-process estimate | OCACT estimate | Percentage difference |
---|---|---|---|
All workers | |||
2014 | 167,874,533 | 169,309,691 | -0.85 |
2015 | 171,534,731 | 172,035,347 | -0.29 |
2016 | 174,168,264 | 174,570,580 | -0.23 |
2017 | 176,311,525 | 176,669,106 | -0.20 |
2018 | 178,527,817 | 179,027,375 | -0.28 |
2019 | 180,042,880 | 180,935,106 | -0.49 |
2020 | 178,499,006 | 178,935,994 | -0.24 |
2021 | 180,313,849 | 180,376,184 | -0.03 |
Workers with wage and salary earnings | |||
Total | |||
2014 | 158,682,500 | 158,414,678 | 0.17 |
2015 | 161,480,185 | 161,138,941 | 0.21 |
2016 | 164,171,152 | 163,621,105 | 0.34 |
2017 | 166,165,895 | 165,592,950 | 0.34 |
2018 | 168,335,976 | 167,817,387 | 0.31 |
2019 | 170,228,900 | 169,552,223 | 0.40 |
2020 | 168,662,331 | 168,028,227 | 0.38 |
2021 | 169,486,721 | 168,627,700 | 0.51 |
With wage and salary earnings only | |||
2014 | 150,590,244 | 149,210,580 | 0.92 |
2015 | 152,466,398 | 151,757,905 | 0.47 |
2016 | 154,975,096 | 154,147,061 | 0.54 |
2017 | 156,878,918 | 156,103,819 | 0.50 |
2018 | 158,899,524 | 158,109,466 | 0.50 |
2019 | 161,014,319 | 159,811,288 | 0.75 |
2020 | 159,946,870 | 159,052,811 | 0.56 |
2021 | 159,504,951 | 160,332,750 | -0.52 |
Workers with self-employment earnings | |||
Total | |||
2014 | 17,284,289 | 20,099,111 | -14.00 |
2015 | 19,068,333 | 20,277,442 | -5.96 |
2016 | 19,193,168 | 20,423,519 | -6.02 |
2017 | 19,432,607 | 20,565,287 | -5.51 |
2018 | 19,628,293 | 20,917,910 | -6.17 |
2019 | 19,028,561 | 21,123,818 | -9.92 |
2020 | 18,552,136 | 19,883,184 | -6.69 |
2021 | 20,808,898 | 20,043,434 | 3.82 |
With self-employment earnings only | |||
2014 | 9,192,033 | 9,204,098 | -0.13 |
2015 | 10,054,546 | 9,381,036 | 7.18 |
2016 | 9,997,112 | 9,474,044 | 5.52 |
2017 | 10,145,630 | 9,489,131 | 6.92 |
2018 | 10,191,841 | 9,707,921 | 4.98 |
2019 | 9,813,980 | 9,740,935 | 0.75 |
2020 | 9,836,675 | 8,975,416 | 9.60 |
2021 | 10,827,128 | 8,294,950 | 30.53 |
Workers with both wage and salary and self-employment earnings | |||
2014 | 8,092,256 | 10,895,013 | -25.73 |
2015 | 9,013,787 | 10,896,406 | -17.28 |
2016 | 9,196,056 | 10,949,475 | -16.01 |
2017 | 9,286,977 | 11,076,156 | -16.15 |
2018 | 9,436,452 | 11,209,988 | -15.82 |
2019 | 9,214,581 | 11,382,882 | -19.05 |
2020 | 8,715,461 | 10,907,767 | -20.10 |
2021 | 9,981,770 | 11,748,484 | -15.04 |
SOURCE: Author's calculations based on SSA data processing audit reports. |
For workers with wage and salary earnings only, Table 6 shows that the differences between the MGD-process and the OCACT estimates range from 0.9 percent for 2014 to minus 0.5 percent for 2021. The MGD-process and OCACT estimates for the two remaining categories (self-employed individuals, either with or without any wage and salary earnings) are very different and likely underlie the large divergence in estimates for all self-employed individuals noted above. The differences for individuals with only self-employment income range from 30.5 percent higher for the MGD estimates for 2021 to 0.1 percent lower for 2014. The differences for workers with both types of earnings range from 15.0 percent lower for the MGD process for 2021 to 25.7 percent lower for 2014.
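The percentage differences in Table 6 appear to be computed relative to the OCACT estimate; that reading is inferred from the table values rather than stated in the note. A short sketch using the tax year 2014 rows (the function and variable names are illustrative) reproduces them and confirms that the three mutually exclusive categories add up to the all-worker count:

```python
def pct_diff(mgd, ocact):
    """Percentage difference of the MGD estimate relative to the OCACT estimate."""
    return round(100 * (mgd - ocact) / ocact, 2)


# Tax year 2014 estimates copied from Table 6: (MGD process, OCACT).
wage_only = (150_590_244, 149_210_580)
se_only = (9_192_033, 9_204_098)
both = (8_092_256, 10_895_013)
all_workers = (167_874_533, 169_309_691)

# The three mutually exclusive categories sum to the MGD all-worker count.
mgd_total = wage_only[0] + se_only[0] + both[0]

for label, pair in [("all", all_workers), ("wage only", wage_only),
                    ("SE only", se_only), ("both", both)]:
    print(label, pct_diff(*pair))
```

Seen this way, the small all-worker difference for 2014 masks a large negative difference for combination workers that is partly offset by the other two categories.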
The wide differences in MGD-process and OCACT estimates of self-employed individuals have two possible explanations. First, the MGD files cannot differentiate workers who are covered under Medicare from those who are not. Thus, the MGD process can only approximate the number of Medicare-covered workers. However, given that the MGD-process and OCACT estimates for wage and salary workers are so similar and that Medicare coverage is nearly universal, this explanation does not seem viable. A direct comparison will have to await the development of the new methodology for generating the annual earnings estimates, which will include a process for distinguishing between workers with Social Security (Old-Age, Survivors, and Disability Insurance) taxable earnings and those with Medicare (Hospital Insurance) taxable earnings.
The second possible explanation is that the MGD files contain only the primary tax year data processed in the next calendar year. Table 1 showed that additional workers' records are processed in subsequent calendar years. For example, as of 2022, 1.57 percent of the tax year 2014 records were processed in a year other than 2015 and were therefore omitted from the MGD file. It is possible that the share of tax records processed in nonprimary processing years is higher for workers with self-employment income, by margins wide enough to narrow the differences between the MGD-process estimates and the OCACT estimates. To consider this possibility, I examined the distribution of the 4,142,057 additional tax year 2014 records among W-2s, W-2cs, and Schedule SEs.
Table 7 shows that in each year from 2015 through 2022, at least 99 percent of the Form W-2s that SSA processed were for the primary tax year, but between 4.5 percent and 10.7 percent of the Schedule SEs processed were for a nonprimary tax year. The 10.7 percent peak occurred in 2021, likely because of a large pandemic-related backlog that IRS experienced that year. Thus, although self-employed individuals are far fewer in number than wage and salary workers, their disproportionate share of the records processed for nonprimary tax years likely explains at least some of the differences between the MGD-process and OCACT estimates for self-employed individuals from 2014 to 2020.12
Processing year | Primary tax year | Records processed (number): total | Records processed (number): for primary tax year | Records processed (number): for other tax year | Records processed (percent): total | Records processed (percent): for primary tax year | Records processed (percent): for other tax year | Unique SSNs: for primary tax year | Unique SSNs: for other tax year |
---|---|---|---|---|---|---|---|---|---|
Form W-2 | |||||||||
2015 | 2014 | 237,765,591 | 235,615,820 | 2,149,771 | 100.00 | 99.10 | 0.90 | 160,535,225 | 2,007,148 |
2016 | 2015 | 245,528,242 | 243,723,231 | 1,805,011 | 100.00 | 99.26 | 0.74 | 163,366,783 | 1,704,720 |
2017 | 2016 | 251,509,338 | 249,530,278 | 1,979,060 | 100.00 | 99.21 | 0.79 | 166,000,893 | 1,839,742 |
2018 | 2017 | 254,788,713 | 253,365,171 | 1,423,542 | 100.00 | 99.44 | 0.56 | 168,108,594 | 1,329,548 |
2019 | 2018 | 259,798,529 | 258,510,183 | 1,288,346 | 100.00 | 99.50 | 0.50 | 170,275,487 | 1,217,177 |
2020 | 2019 | 262,691,363 | 261,583,557 | 1,107,806 | 100.00 | 99.58 | 0.42 | 172,238,245 | 1,041,896 |
2021 | 2020 | 250,693,566 | 249,832,215 | 861,351 | 100.00 | 99.66 | 0.34 | 170,623,150 | 812,196 |
2022 | 2021 | 262,930,243 | 261,815,697 | 1,114,546 | 100.00 | 99.58 | 0.42 | 171,636,437 | 1,040,070 |
Form 1040 Schedule SE | |||||||||
2015 | 2014 | 18,782,168 | 17,813,779 | 968,389 | 100.00 | 94.84 | 5.16 | 17,812,721 | 728,932 |
2016 | 2015 | 20,681,589 | 19,664,474 | 1,017,115 | 100.00 | 95.08 | 4.92 | 19,663,466 | 780,799 |
2017 | 2016 | 20,764,746 | 19,804,112 | 960,634 | 100.00 | 95.37 | 4.63 | 19,803,275 | 750,329 |
2018 | 2017 | 21,194,793 | 20,050,718 | 1,144,075 | 100.00 | 94.60 | 5.40 | 20,050,006 | 908,497 |
2019 | 2018 | 21,380,446 | 20,278,455 | 1,101,991 | 100.00 | 94.85 | 5.15 | 20,277,674 | 859,115 |
2020 | 2019 | 20,531,437 | 19,601,328 | 930,109 | 100.00 | 95.47 | 4.53 | 19,601,024 | 795,290 |
2021 | 2020 | 21,618,115 | 19,308,932 | 2,309,183 | 100.00 | 89.32 | 10.68 | 19,308,531 | 2,031,568 |
2022 | 2021 | 23,826,394 | 21,714,574 | 2,111,820 | 100.00 | 91.14 | 8.86 | 21,714,034 | 1,821,127 |
SOURCE: Author's calculations based on SSA data processing audit reports. |
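The percentage columns in Table 7 follow directly from the record counts. As a brief illustration (the function name `nonprimary_share` is hypothetical), the sketch below recomputes the share of records processed for a nonprimary tax year for both forms in processing year 2021, the year of the Schedule SE peak.

```python
def nonprimary_share(total_records: int, other_tax_year_records: int) -> float:
    """Percentage of records processed in a year that belong to a tax year
    other than that processing year's primary tax year."""
    return other_tax_year_records / total_records * 100


# Table 7, processing year 2021: (total records, records for other tax years)
form_w2_2021 = (250_693_566, 861_351)
schedule_se_2021 = (21_618_115, 2_309_183)

print(round(nonprimary_share(*form_w2_2021), 2))      # 0.34
print(round(nonprimary_share(*schedule_se_2021), 2))  # 10.68
```

In that year, the nonprimary share for Schedule SEs is roughly 30 times the share for Form W-2s, which is the asymmetry the text relies on.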
Comparing MGD-Process and ORES Published Estimates of Worker Counts
Table 8 shows the number of all workers by sex for 2014–2021 as estimated by the MGD process and as published in Table 4 of Earnings and Employment. Tables 9 and 10 repeat Table 8 with estimates for wage and salary workers and self-employed individuals, respectively. Tables 8 and 9 show that the MGD-estimated numbers of all workers and of wage and salary workers are similar to those published in Earnings and Employment: they differ by less than 1 percent overall and for men and women in every year except 2014, when the Table 8 differences for all workers and for men slightly exceed 1 percent.
Tax year | MGD-process estimate | Published estimate | Percentage difference |
---|---|---|---|
All workers | |||
2014 | 167,874,533 | 169,691,000 | -1.07 |
2015 | 171,534,731 | 172,369,000 | -0.48 |
2016 | 174,168,264 | 175,215,999 | -0.60 |
2017 | 176,311,525 | 176,962,000 | -0.37 |
2018 | 178,527,817 | 179,584,999 | -0.59 |
2019 | 180,042,880 | 180,896,000 | -0.47 |
2020 | 178,499,006 | 178,494,000 | 0.00 |
2021 | 180,313,849 | 180,359,000 | -0.03 |
Men | |||
2014 | 86,554,357 | 87,565,826 | -1.16 |
2015 | 88,462,230 | 88,914,085 | -0.51 |
2016 | 89,664,098 | 90,201,580 | -0.60 |
2017 | 90,686,951 | 90,985,759 | -0.33 |
2018 | 91,677,783 | 92,177,295 | -0.54 |
2019 | 92,247,594 | 92,615,972 | -0.40 |
2020 | 91,332,698 | 91,257,458 | 0.08 |
2021 | 92,260,151 | 92,158,375 | 0.11 |
Women | |||
2014 | 81,320,176 | 82,125,174 | -0.98 |
2015 | 83,072,501 | 83,454,915 | -0.46 |
2016 | 84,504,166 | 85,014,420 | -0.60 |
2017 | 85,624,574 | 85,976,242 | -0.41 |
2018 | 86,850,034 | 87,407,704 | -0.64 |
2019 | 87,795,286 | 88,280,028 | -0.55 |
2020 | 87,166,308 | 87,236,543 | -0.08 |
2021 | 88,053,698 | 88,200,625 | -0.17 |
SOURCES: Author's calculations based on SSA data processing audit reports; and Earnings and Employment Data for Workers Covered Under Social Security and Medicare, by State and County, 2014–2021 editions, Table 4. | |||
NOTE: Published estimates for men and women may not sum to all-workers total because of rounding. |
Tax year | MGD-process estimate | Published estimate | Percentage difference |
---|---|---|---|
All wage and salary workers | |||
2014 | 158,682,500 | 158,852,000 | -0.11 |
2015 | 161,480,185 | 161,237,000 | 0.15 |
2016 | 164,171,152 | 164,206,999 | -0.02 |
2017 | 166,165,895 | 166,205,000 | -0.02 |
2018 | 168,335,976 | 168,352,999 | -0.01 |
2019 | 170,228,900 | 170,019,000 | 0.12 |
2020 | 168,662,331 | 168,277,000 | 0.23 |
2021 | 169,486,721 | 168,611,000 | 0.52 |
Men | |||
2014 | 81,255,302 | 81,203,373 | 0.06 |
2015 | 82,664,297 | 82,396,749 | 0.32 |
2016 | 83,907,441 | 83,800,324 | 0.13 |
2017 | 84,850,288 | 84,740,013 | 0.13 |
2018 | 85,830,038 | 85,672,680 | 0.18 |
2019 | 86,672,090 | 86,400,315 | 0.31 |
2020 | 85,777,999 | 85,425,731 | 0.41 |
2021 | 86,192,409 | 85,535,561 | 0.77 |
Women | |||
2014 | 77,427,198 | 77,648,628 | -0.29 |
2015 | 78,815,888 | 78,840,251 | -0.03 |
2016 | 80,263,711 | 80,406,675 | -0.18 |
2017 | 81,315,607 | 81,464,988 | -0.18 |
2018 | 82,505,938 | 82,680,320 | -0.21 |
2019 | 83,556,810 | 83,618,685 | -0.07 |
2020 | 82,884,332 | 82,851,269 | 0.04 |
2021 | 83,294,312 | 83,075,439 | 0.26 |
SOURCES: Author's calculations based on SSA data processing audit reports; and Earnings and Employment Data for Workers Covered Under Social Security and Medicare, by State and County, 2014–2021 editions, Table 4. | |||
NOTE: Published estimates for men and women may not sum to all-workers total because of rounding. |
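The "less than 1 percent" comparison for Tables 8 and 9 can be checked mechanically. The sketch below is illustrative only: it uses a handful of rows (2014 from Table 8 and 2021 from Table 9), assumes the published estimate is the denominator, and lists the rows whose absolute difference reaches the 1 percent threshold.

```python
THRESHOLD = 1.0  # percent

# (table, group, tax year): (MGD-process estimate, published estimate)
estimates = {
    ("Table 8", "All workers", 2014): (167_874_533, 169_691_000),
    ("Table 8", "Men", 2014): (86_554_357, 87_565_826),
    ("Table 8", "Women", 2014): (81_320_176, 82_125_174),
    ("Table 9", "All wage and salary workers", 2021): (169_486_721, 168_611_000),
    ("Table 9", "Men", 2021): (86_192_409, 85_535_561),
    ("Table 9", "Women", 2021): (83_294_312, 83_075_439),
}


def pct_diff(mgd: int, published: int) -> float:
    """Difference as a percentage of the published estimate (assumed base)."""
    return (mgd - published) / published * 100


# Keep only the rows whose absolute difference is at least the threshold.
exceeding = [
    key for key, (mgd, published) in estimates.items()
    if abs(pct_diff(mgd, published)) >= THRESHOLD
]
for key in exceeding:
    print(key, round(pct_diff(*estimates[key]), 2))
```

With these rows, only the 2014 Table 8 entries for all workers and for men are listed.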
Tax year | MGD-process estimate | Published estimate | Percentage difference |
---|---|---|---|
All self-employed individuals | |||
2014 | 17,284,289 | 19,862,000 | -12.98 |
2015 | 19,068,333 | 20,650,000 | -7.66 |
2016 | 19,193,168 | 20,694,000 | -7.25 |
2017 | 19,432,607 | 20,552,000 | -5.45 |
2018 | 19,628,293 | 20,985,000 | -6.47 |
2019 | 19,028,561 | 20,905,000 | -8.98 |
2020 | 18,552,136 | 19,430,000 | -4.52 |
2021 | 20,808,898 | 21,543,000 | -3.41 |
Men | |||
2014 | 9,753,284 | 11,074,485 | -11.93 |
2015 | 10,762,981 | 11,470,475 | -6.17 |
2016 | 10,807,316 | 11,691,072 | -7.56 |
2017 | 10,898,755 | 11,541,010 | -5.56 |
2018 | 10,945,950 | 11,727,973 | -6.67 |
2019 | 10,496,410 | 11,544,635 | -9.08 |
2020 | 10,230,550 | 10,727,143 | -4.63 |
2021 | 11,393,362 | 11,810,926 | -3.54 |
Women | |||
2014 | 7,531,005 | 8,787,515 | -14.30 |
2015 | 8,305,352 | 9,179,525 | -9.52 |
2016 | 8,385,852 | 9,002,928 | -6.85 |
2017 | 8,533,852 | 9,010,990 | -5.30 |
2018 | 8,682,343 | 9,257,026 | -6.21 |
2019 | 8,532,151 | 9,360,365 | -8.85 |
2020 | 8,321,586 | 8,702,857 | -4.38 |
2021 | 9,415,536 | 9,732,074 | -3.25 |
SOURCES: Author's calculations based on SSA data processing audit reports; and Earnings and Employment Data for Workers Covered Under Social Security and Medicare, by State and County, 2014–2021 editions, Table 4. | |||
NOTE: Published estimates for men and women may not sum to all-workers total because of rounding. |
The estimated numbers of self-employed individuals differ more substantially, with the MGD estimates being lower for both men and women in each year (Table 10). I expect the differences between the MGD-process and the published estimates to be consistent with the differences between the MGD-process and OCACT estimates because the published estimates reflect the OCACT estimates. Interestingly, the percentage differences in the 2014 estimates far exceed the differences for the other years. The cause of this discrepancy is not clear, but it may involve the high percentage of workers, noted earlier, who were not assigned an SCC in the MGD file for tax year 2014.
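To see how far 2014 stands apart, the sketch below recomputes the Table 10 percentage differences for all self-employed individuals and picks out the year with the widest gap. It assumes the published estimate is the denominator; the variable names are illustrative.

```python
# Table 10, all self-employed individuals:
# tax year -> (MGD-process estimate, published estimate)
self_employed = {
    2014: (17_284_289, 19_862_000),
    2015: (19_068_333, 20_650_000),
    2016: (19_193_168, 20_694_000),
    2017: (19_432_607, 20_552_000),
    2018: (19_628_293, 20_985_000),
    2019: (19_028_561, 20_905_000),
    2020: (18_552_136, 19_430_000),
    2021: (20_808_898, 21_543_000),
}

# Percentage difference relative to the published estimate, rounded as in the table.
diffs = {
    year: round((mgd - published) / published * 100, 2)
    for year, (mgd, published) in self_employed.items()
}

# Every difference is negative, so the minimum marks the widest gap.
widest_gap_year = min(diffs, key=diffs.get)
print(widest_gap_year, diffs[widest_gap_year])  # 2014 -12.98
```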
Comparing MGD-Process and ORES Published Estimates of Worker Counts by Age
ORES publishes estimates of the number of workers with Medicare taxable earnings by age, sex, and state or other area in Table 5 of Earnings and Employment. For estimates by age group, regardless of methodology, the groups with the smallest populations (those younger than 20 and those aged 70 or older) are likely to show the widest variation between the MGD-process and published estimates. Likewise, if the population entering an age group is substantially larger (or smaller) than the population aging out of it in a given year, some volatility in the estimates can be expected.
Table 11 shows the MGD and published estimates of numbers of workers with Medicare taxable earnings by age group. Interestingly, the MGD estimates for workers younger than 20 and for those aged 20–29 are higher than the published estimates in each year. As expected, the percentage differences between the MGD and the published estimates for workers younger than 20 and for those aged 70 or older are large relative to those for the other age groups. The MGD estimates are lower than the published estimates for workers aged 30–64 in each year, although no difference exceeds 3.1 percent in absolute value.
Tax year | Total | Younger than 20 | 20–29 | 30–39 | 40–49 | 50–59 | 60–61 | 62–64 | 65–69 | 70 or older |
---|---|---|---|---|---|---|---|---|---|---|
MGD-process estimates | ||||||||||
2014 | 167,874,533 | 9,354,556 | 37,295,556 | 33,748,958 | 32,850,996 | 32,674,509 | 5,204,839 | 6,140,967 | 6,014,797 | 4,589,355 |
2015 | 171,534,731 | 9,787,238 | 38,052,202 | 34,746,451 | 32,884,493 | 33,010,252 | 5,395,968 | 6,427,233 | 6,407,403 | 4,823,491 |
2016 | 174,168,264 | 10,128,058 | 38,611,294 | 35,585,092 | 32,867,668 | 32,989,336 | 5,501,762 | 6,686,701 | 6,661,211 | 5,137,142 |
2017 | 176,311,525 | 10,344,670 | 38,917,518 | 36,284,382 | 32,997,976 | 32,879,436 | 5,640,679 | 6,887,407 | 6,879,765 | 5,479,692 |
2018 | 178,527,817 | 10,552,875 | 39,147,652 | 37,074,020 | 33,092,647 | 32,810,860 | 5,722,724 | 7,118,296 | 7,184,210 | 5,824,533 |
2019 | 180,042,880 | 10,680,854 | 39,200,689 | 37,776,540 | 33,197,084 | 32,691,758 | 5,767,466 | 7,251,765 | 7,437,019 | 6,039,705 |
2020 | 178,499,006 | 10,028,657 | 38,234,735 | 37,896,190 | 32,922,690 | 32,417,085 | 5,842,942 | 7,324,366 | 7,609,111 | 6,223,230 |
2021 | 180,313,849 | 11,356,370 | 38,311,751 | 38,241,081 | 32,973,127 | 32,176,815 | 5,878,494 | 7,362,651 | 7,684,453 | 6,329,107 |
Published estimates | ||||||||||
2014 | 169,691,000 | 8,662,871 | 37,247,740 | 34,383,167 | 33,715,360 | 33,599,332 | 5,311,791 | 6,232,942 | 6,133,348 | 4,404,449 |
2015 | 172,369,000 | 9,139,989 | 37,839,736 | 35,196,056 | 33,504,082 | 33,733,326 | 5,488,668 | 6,474,186 | 6,451,125 | 4,541,833 |
2016 | 175,215,999 | 9,522,185 | 38,501,123 | 36,114,313 | 33,516,074 | 33,727,312 | 5,624,381 | 6,768,568 | 6,668,462 | 4,773,582 |
2017 | 176,962,000 | 9,785,167 | 38,705,169 | 36,724,946 | 33,527,771 | 33,493,773 | 5,751,399 | 6,973,590 | 6,848,846 | 5,151,341 |
2018 | 179,584,999 | 10,030,325 | 39,013,651 | 37,565,640 | 33,734,133 | 33,438,032 | 5,903,435 | 7,240,876 | 7,172,040 | 5,486,868 |
2019 | 180,896,000 | 10,167,823 | 38,984,635 | 38,235,708 | 33,794,683 | 33,249,626 | 5,916,361 | 7,395,201 | 7,475,455 | 5,676,508 |
2020 | 178,494,000 | 9,560,502 | 37,948,711 | 38,180,532 | 33,365,210 | 32,723,498 | 5,931,874 | 7,432,821 | 7,593,092 | 5,757,759 |
2021 | 180,359,000 | 10,877,583 | 38,170,303 | 38,507,569 | 33,435,386 | 32,490,342 | 5,948,172 | 7,458,078 | 7,677,135 | 5,794,431 |
Percentage difference | ||||||||||
2014 | -1.07 | 7.98 | 0.13 | -1.84 | -2.56 | -2.75 | -2.01 | -1.48 | -1.93 | 4.20 |
2015 | -0.48 | 7.08 | 0.56 | -1.28 | -1.85 | -2.14 | -1.69 | -0.73 | -0.68 | 6.20 |
2016 | -0.60 | 6.36 | 0.29 | -1.47 | -1.93 | -2.19 | -2.18 | -1.21 | -0.11 | 7.62 |
2017 | -0.37 | 5.72 | 0.55 | -1.20 | -1.58 | -1.83 | -1.93 | -1.24 | 0.45 | 6.37 |
2018 | -0.59 | 5.21 | 0.34 | -1.31 | -1.90 | -1.88 | -3.06 | -1.69 | 0.17 | 6.15 |
2019 | -0.47 | 5.05 | 0.55 | -1.20 | -1.77 | -1.68 | -2.52 | -1.94 | -0.51 | 6.40 |
2020 | 0.00 | 4.90 | 0.75 | -0.74 | -1.33 | -0.94 | -1.50 | -1.46 | 0.21 | 8.08 |
2021 | -0.03 | 4.40 | 0.37 | -0.69 | -1.38 | -0.96 | -1.17 | -1.28 | 0.10 | 9.23 |
SOURCES: Author's calculations based on SSA data processing audit reports; and Earnings and Employment Data for Workers Covered Under Social Security and Medicare, by State and County, 2014–2021 editions, Table 5. |
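The age pattern described for Table 11 can be illustrated with the tax year 2021 rows. The sketch below assumes the published estimate is the denominator and uses plain hyphens in the age-group labels; it lists the groups in which the MGD estimate exceeds the published one.

```python
AGE_GROUPS = ["Younger than 20", "20-29", "30-39", "40-49", "50-59",
              "60-61", "62-64", "65-69", "70 or older"]

# Table 11, tax year 2021, in the same order as AGE_GROUPS
mgd_2021 = [11_356_370, 38_311_751, 38_241_081, 32_973_127, 32_176_815,
            5_878_494, 7_362_651, 7_684_453, 6_329_107]
published_2021 = [10_877_583, 38_170_303, 38_507_569, 33_435_386, 32_490_342,
                  5_948_172, 7_458_078, 7_677_135, 5_794_431]

# Percentage difference relative to the published estimate, by age group.
diffs = {
    group: (mgd - published) / published * 100
    for group, mgd, published in zip(AGE_GROUPS, mgd_2021, published_2021)
}

mgd_higher = [group for group, diff in diffs.items() if diff > 0]
print(mgd_higher)
print(round(diffs["Younger than 20"], 2), round(diffs["70 or older"], 2))  # 4.4 9.23
```

For 2021, the youngest two groups and the oldest two groups have higher MGD estimates, and the two groups with the smallest populations show the widest percentage gaps.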
Comparing MGD-Process and ORES Published Estimates of Worker Counts by State
MGD estimates by state differ from those in the Annual Statistical Supplement and Earnings and Employment in one key respect. In those publications, the “other” state or area category “includes persons employed in American Samoa, Guam, Northern Mariana Islands, and U.S. Virgin Islands; U.S. citizens employed abroad by U.S. employers; persons employed on U.S. oceanborne vessels; and workers with unknown residence.” The MGD process does not separately account for U.S. citizens employed abroad by U.S. employers or persons employed on U.S. oceanborne vessels. However, the MGD files allow separate estimates for American Samoa, Guam, the Northern Mariana Islands, and the U.S. Virgin Islands because the files contain data for the full population of workers in a given tax year, as opposed to the 1 percent sample of SSNs that constitutes the CWHS.
Table 12 presents the estimated numbers of Medicare-covered workers by state and other area from Annual Statistical Supplement Table 4.B12 for 2014–2021. Table 13 repeats Table 12 for the MGD estimates, and Table 14 shows the percentage differences between the published and MGD estimates. The estimates differ by 3 percent or more in at least 1 year for the District of Columbia, Puerto Rico, and 15 states: Alaska, Arkansas, Connecticut, Delaware, Idaho, Maine, Montana, Nebraska, North Dakota, Oklahoma, Rhode Island, South Dakota, Vermont, West Virginia, and Wyoming (Table 14). These jurisdictions have relatively small workforces. The percentage differences are relatively large across most years for Montana and South Dakota, and for 2014 for North Dakota; but in general, the published and MGD estimates appear to be reasonably comparable.
State or area a | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 |
---|---|---|---|---|---|---|---|---|
Alabama | 2,360,000 | 2,407,000 | 2,427,000 | 2,446,000 | 2,481,000 | 2,500,000 | 2,493,000 | 2,540,000 |
Alaska | 430,000 | 436,000 | 433,000 | 420,000 | 426,000 | 427,000 | 417,000 | 423,000 |
Arizona | 3,198,000 | 3,277,000 | 3,385,000 | 3,467,000 | 3,561,000 | 3,621,000 | 3,653,000 | 3,754,000 |
Arkansas | 1,472,000 | 1,488,000 | 1,503,000 | 1,510,000 | 1,525,000 | 1,530,000 | 1,521,000 | 1,542,000 |
California | 19,181,000 | 19,707,000 | 20,123,000 | 20,379,000 | 20,685,000 | 20,833,000 | 20,382,000 | 20,407,000 |
Colorado | 2,982,000 | 3,082,000 | 3,158,000 | 3,215,000 | 3,282,000 | 3,340,000 | 3,306,000 | 3,369,000 |
Connecticut | 2,039,000 | 2,050,000 | 2,061,000 | 2,059,000 | 2,071,000 | 2,065,000 | 2,033,000 | 2,046,000 |
Delaware | 513,000 | 517,000 | 525,000 | 534,000 | 536,000 | 541,000 | 537,000 | 546,000 |
District of Columbia | 403,000 | 413,000 | 426,000 | 430,000 | 439,000 | 442,000 | 411,000 | 410,000 |
Florida | 9,857,000 | 10,207,000 | 10,497,000 | 10,758,000 | 11,019,000 | 11,170,000 | 11,240,000 | 11,560,000 |
Georgia | 5,090,000 | 5,219,000 | 5,382,000 | 5,484,000 | 5,588,000 | 5,664,000 | 5,659,000 | 5,803,000 |
Hawaii | 783,000 | 792,000 | 802,000 | 803,000 | 806,000 | 808,000 | 766,000 | 756,000 |
Idaho | 836,000 | 867,000 | 885,000 | 916,000 | 991,000 | 1,031,000 | 1,045,000 | 1,087,000 |
Illinois | 6,968,000 | 7,012,000 | 7,070,000 | 7,070,000 | 7,119,000 | 7,090,000 | 6,914,000 | 6,921,000 |
Indiana | 3,639,000 | 3,663,000 | 3,698,000 | 3,743,000 | 3,793,000 | 3,808,000 | 3,777,000 | 3,804,000 |
Iowa | 1,786,000 | 1,798,000 | 1,807,000 | 1,814,000 | 1,830,000 | 1,837,000 | 1,816,000 | 1,826,000 |
Kansas | 1,600,000 | 1,613,000 | 1,623,000 | 1,621,000 | 1,635,000 | 1,649,000 | 1,629,000 | 1,634,000 |
Kentucky | 2,271,000 | 2,296,000 | 2,317,000 | 2,329,000 | 2,343,000 | 2,355,000 | 2,322,000 | 2,343,000 |
Louisiana | 2,396,000 | 2,407,000 | 2,394,000 | 2,384,000 | 2,406,000 | 2,408,000 | 2,375,000 | 2,353,000 |
Maine | 764,000 | 757,000 | 766,000 | 771,000 | 781,000 | 776,000 | 765,000 | 773,000 |
Maryland | 3,341,000 | 3,377,000 | 3,415,000 | 3,426,000 | 3,467,000 | 3,465,000 | 3,399,000 | 3,431,000 |
Massachusetts | 3,874,000 | 3,930,000 | 3,990,000 | 4,029,000 | 4,061,000 | 4,095,000 | 3,986,000 | 3,994,000 |
Michigan | 5,176,000 | 5,229,000 | 5,321,000 | 5,357,000 | 5,428,000 | 5,394,000 | 5,305,000 | 5,325,000 |
Minnesota | 3,233,000 | 3,266,000 | 3,313,000 | 3,342,000 | 3,371,000 | 3,381,000 | 3,317,000 | 3,319,000 |
Mississippi | 1,443,000 | 1,456,000 | 1,467,000 | 1,469,000 | 1,483,000 | 1,492,000 | 1,472,000 | 1,484,000 |
Missouri | 3,200,000 | 3,250,000 | 3,297,000 | 3,315,000 | 3,335,000 | 3,349,000 | 3,315,000 | 3,334,000 |
Montana | 575,000 | 586,000 | 620,000 | 655,000 | 662,000 | 654,000 | 654,000 | 676,000 |
Nebraska | 1,127,000 | 1,140,000 | 1,166,000 | 1,160,000 | 1,173,000 | 1,168,000 | 1,155,000 | 1,172,000 |
Nevada | 1,387,000 | 1,439,000 | 1,487,000 | 1,521,000 | 1,592,000 | 1,628,000 | 1,638,000 | 1,677,000 |
New Hampshire | 814,000 | 827,000 | 829,000 | 835,000 | 843,000 | 846,000 | 844,000 | 853,000 |
New Jersey | 4,910,000 | 4,964,000 | 5,036,000 | 5,079,000 | 5,154,000 | 5,175,000 | 5,067,000 | 5,058,000 |
New Mexico | 983,000 | 987,000 | 1,007,000 | 1,006,000 | 1,023,000 | 1,023,000 | 1,001,000 | 1,015,000 |
New York | 10,550,000 | 10,678,000 | 10,790,000 | 10,893,000 | 10,983,000 | 11,004,000 | 10,514,000 | 10,420,000 |
North Carolina | 5,066,000 | 5,175,000 | 5,294,000 | 5,385,000 | 5,493,000 | 5,576,000 | 5,555,000 | 5,696,000 |
North Dakota | 501,000 | 480,000 | 466,000 | 463,000 | 463,000 | 465,000 | 447,000 | 446,000 |
Ohio | 6,319,000 | 6,384,000 | 6,448,000 | 6,482,000 | 6,524,000 | 6,558,000 | 6,458,000 | 6,486,000 |
Oklahoma | 2,011,000 | 2,025,000 | 2,017,000 | 2,016,000 | 2,038,000 | 2,054,000 | 2,038,000 | 2,053,000 |
Oregon | 2,057,000 | 2,123,000 | 2,179,000 | 2,220,000 | 2,257,000 | 2,276,000 | 2,228,000 | 2,249,000 |
Pennsylvania | 6,919,000 | 6,980,000 | 7,075,000 | 7,068,000 | 7,119,000 | 7,177,000 | 7,012,000 | 6,998,000 |
Rhode Island | 607,000 | 614,000 | 621,000 | 619,000 | 623,000 | 627,000 | 618,000 | 621,000 |
South Carolina | 2,416,000 | 2,488,000 | 2,559,000 | 2,607,000 | 2,657,000 | 2,695,000 | 2,693,000 | 2,762,000 |
South Dakota | 636,000 | 583,000 | 587,000 | 582,000 | 592,000 | 580,000 | 566,000 | 577,000 |
Tennessee | 3,394,000 | 3,479,000 | 3,556,000 | 3,599,000 | 3,651,000 | 3,679,000 | 3,717,000 | 3,810,000 |
Texas | 13,797,000 | 14,122,000 | 14,326,000 | 14,571,000 | 14,914,000 | 15,171,000 | 15,159,000 | 15,547,000 |
Utah | 1,517,000 | 1,572,000 | 1,629,000 | 1,677,000 | 1,731,000 | 1,774,000 | 1,801,000 | 1,862,000 |
Vermont | 388,000 | 390,000 | 393,000 | 389,000 | 395,000 | 389,000 | 380,000 | 380,000 |
Virginia | 4,558,000 | 4,625,000 | 4,684,000 | 4,721,000 | 4,778,000 | 4,827,000 | 4,769,000 | 4,803,000 |
Washington | 3,754,000 | 3,889,000 | 3,991,000 | 4,069,000 | 4,164,000 | 4,217,000 | 4,149,000 | 4,180,000 |
West Virginia | 894,000 | 891,000 | 875,000 | 861,000 | 863,000 | 858,000 | 844,000 | 851,000 |
Wisconsin | 3,303,000 | 3,332,000 | 3,355,000 | 3,366,000 | 3,393,000 | 3,388,000 | 3,331,000 | 3,350,000 |
Wyoming | 345,000 | 340,000 | 336,000 | 331,000 | 339,000 | 369,000 | 360,000 | 367,000 |
Outlying areas | ||||||||
Puerto Rico | 1,135,000 | 1,120,000 | 1,192,000 | 1,102,000 | 1,095,000 | 1,052,000 | 1,044,000 | 1,089,000 |
Other and unknown b | 893,000 | 621,000 | 613,000 | 594,000 | 607,000 | 597,000 | 594,000 | 574,000 |
SOURCE: Annual Statistical Supplement to the Social Security Bulletin, 2014–2021 editions, Table 4.B12. | ||||||||
a. Most state assignments are based on end-of-year residence obtained from electronically filed employer wage reports; the remainder are based on location of employer from reports filed on paper. | ||||||||
b. Persons employed in American Samoa, Guam, Northern Mariana Islands, and U.S. Virgin Islands; U.S. citizens employed abroad by U.S. employers; persons employed on U.S. oceanborne vessels; and workers with unknown residence. |
State or area a | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 |
---|---|---|---|---|---|---|---|---|
Alabama | 2,325,796 | 2,388,310 | 2,414,672 | 2,434,399 | 2,462,146 | 2,488,784 | 2,485,839 | 2,520,564 |
Alaska | 409,229 | 424,687 | 419,867 | 413,726 | 412,313 | 413,536 | 404,175 | 408,481 |
Arizona | 3,151,122 | 3,284,845 | 3,367,831 | 3,440,879 | 3,526,956 | 3,609,207 | 3,655,131 | 3,743,768 |
Arkansas | 1,426,698 | 1,457,336 | 1,472,578 | 1,481,988 | 1,492,450 | 1,496,868 | 1,497,510 | 1,523,438 |
California | 19,268,475 | 20,068,848 | 20,560,184 | 20,831,971 | 21,088,252 | 21,215,640 | 20,846,927 | 20,870,865 |
Colorado | 3,003,446 | 3,083,579 | 3,155,257 | 3,221,144 | 3,284,322 | 3,342,758 | 3,323,765 | 3,364,421 |
Connecticut | 1,955,257 | 2,001,994 | 1,960,979 | 2,010,990 | 2,016,393 | 2,014,976 | 1,996,837 | 2,005,521 |
Delaware | 496,943 | 513,128 | 521,510 | 526,442 | 532,920 | 539,111 | 539,733 | 549,625 |
District of Columbia | 387,445 | 408,389 | 415,582 | 421,294 | 424,089 | 428,870 | 405,298 | 405,590 |
Florida | 9,701,177 | 10,197,466 | 10,487,000 | 10,742,956 | 10,988,928 | 11,152,676 | 11,265,445 | 11,566,252 |
Georgia | 4,988,221 | 5,186,872 | 5,312,128 | 5,424,421 | 5,535,555 | 5,612,552 | 5,641,376 | 5,783,500 |
Hawaii | 764,111 | 780,996 | 787,986 | 794,504 | 796,910 | 792,340 | 754,105 | 754,442 |
Idaho | 830,431 | 860,575 | 885,842 | 911,347 | 936,980 | 959,045 | 983,073 | 1,020,273 |
Illinois | 6,817,223 | 6,958,524 | 6,989,083 | 7,021,226 | 7,043,323 | 7,040,857 | 6,894,660 | 6,894,041 |
Indiana | 3,564,373 | 3,665,311 | 3,704,413 | 3,748,290 | 3,779,379 | 3,801,776 | 3,791,005 | 3,814,817 |
Iowa | 1,742,904 | 1,783,690 | 1,793,124 | 1,801,410 | 1,805,117 | 1,805,893 | 1,792,534 | 1,801,127 |
Kansas | 1,585,286 | 1,613,484 | 1,620,607 | 1,625,200 | 1,633,960 | 1,640,746 | 1,629,095 | 1,636,034 |
Kentucky | 2,209,114 | 2,265,838 | 2,289,107 | 2,302,804 | 2,309,635 | 2,322,685 | 2,306,053 | 2,330,616 |
Louisiana | 2,338,113 | 2,372,571 | 2,361,671 | 2,350,280 | 2,361,527 | 2,369,296 | 2,333,271 | 2,324,887 |
Maine | 738,007 | 753,540 | 764,336 | 767,413 | 773,436 | 774,008 | 768,248 | 779,719 |
Maryland | 3,268,482 | 3,350,170 | 3,388,995 | 3,408,262 | 3,434,842 | 3,436,965 | 3,391,726 | 3,408,216 |
Massachusetts | 3,845,797 | 3,944,070 | 4,000,144 | 4,041,517 | 4,078,208 | 4,100,804 | 3,991,073 | 4,004,184 |
Michigan | 5,058,225 | 5,169,787 | 5,244,489 | 5,285,148 | 5,378,137 | 5,343,726 | 5,285,932 | 5,286,100 |
Minnesota | 3,167,658 | 3,250,576 | 3,297,866 | 3,326,133 | 3,355,138 | 3,367,108 | 3,327,763 | 3,320,767 |
Mississippi | 1,413,954 | 1,451,939 | 1,467,784 | 1,471,644 | 1,476,733 | 1,486,130 | 1,477,402 | 1,490,951 |
Missouri | 3,183,600 | 3,261,488 | 3,303,758 | 3,327,867 | 3,340,359 | 3,350,259 | 3,342,883 | 3,361,579 |
Montana | 554,124 | 573,446 | 580,665 | 585,446 | 587,780 | 593,043 | 598,583 | 612,845 |
Nebraska | 1,081,944 | 1,107,629 | 1,116,569 | 1,124,763 | 1,128,910 | 1,134,434 | 1,129,986 | 1,137,561 |
Nevada | 1,414,026 | 1,465,951 | 1,512,861 | 1,561,380 | 1,615,402 | 1,655,986 | 1,659,852 | 1,695,859 |
New Hampshire | 796,267 | 817,153 | 825,705 | 833,831 | 839,972 | 843,405 | 838,480 | 844,784 |
New Jersey | 4,852,733 | 4,966,950 | 5,034,195 | 5,095,263 | 5,150,877 | 5,185,547 | 5,110,033 | 5,126,751 |
New Mexico | 989,627 | 1,006,490 | 1,008,501 | 1,009,091 | 1,020,542 | 1,029,587 | 1,010,825 | 1,020,516 |
New York | 10,416,133 | 10,678,873 | 10,806,679 | 10,922,115 | 11,003,466 | 11,053,935 | 10,643,127 | 10,554,208 |
North Carolina | 4,966,546 | 5,115,446 | 5,240,242 | 5,337,426 | 5,435,448 | 5,523,019 | 5,542,468 | 5,661,980 |
North Dakota | 453,455 | 468,336 | 458,778 | 459,281 | 458,714 | 461,738 | 455,750 | 456,224 |
Ohio | 6,166,402 | 6,294,066 | 6,343,122 | 6,386,521 | 6,424,644 | 6,447,439 | 6,379,076 | 6,396,538 |
Oklahoma | 1,951,098 | 1,987,046 | 1,973,074 | 1,982,869 | 1,999,128 | 2,015,568 | 2,012,617 | 2,034,754 |
Oregon | 2,020,505 | 2,126,093 | 2,180,873 | 2,224,536 | 2,258,054 | 2,281,035 | 2,255,693 | 2,263,442 |
Pennsylvania | 6,777,747 | 6,899,769 | 6,980,708 | 6,995,321 | 7,045,836 | 7,101,282 | 6,972,920 | 6,965,525 |
Rhode Island | 580,233 | 593,221 | 601,430 | 607,022 | 613,885 | 614,376 | 610,396 | 612,323 |
South Carolina | 2,379,884 | 2,477,374 | 2,540,407 | 2,587,617 | 2,640,018 | 2,679,565 | 2,685,369 | 2,746,858 |
South Dakota | 490,019 | 502,293 | 506,651 | 509,830 | 511,439 | 514,511 | 514,823 | 526,912 |
Tennessee | 3,302,681 | 3,429,108 | 3,502,364 | 3,554,071 | 3,607,847 | 3,645,676 | 3,692,406 | 3,777,793 |
Texas | 13,622,711 | 14,122,676 | 14,350,175 | 14,593,561 | 14,944,151 | 15,234,809 | 15,293,679 | 15,679,499 |
Utah | 1,512,227 | 1,583,771 | 1,633,174 | 1,683,701 | 1,733,986 | 1,778,954 | 1,812,766 | 1,871,433 |
Vermont | 368,187 | 374,871 | 376,491 | 377,378 | 377,200 | 376,241 | 371,688 | 373,695 |
Virginia | 4,434,859 | 4,611,706 | 4,675,327 | 4,724,376 | 4,771,246 | 4,818,371 | 4,783,932 | 4,821,379 |
Washington | 3,721,972 | 3,890,208 | 3,991,527 | 4,087,639 | 4,161,543 | 4,235,745 | 4,177,973 | 4,219,439 |
West Virginia | 866,120 | 878,925 | 868,930 | 861,506 | 860,471 | 859,274 | 847,043 | 849,965 |
Wisconsin | 3,246,057 | 3,324,258 | 3,351,792 | 3,374,284 | 3,395,063 | 3,397,352 | 3,358,322 | 3,368,438 |
Wyoming | 333,544 | 338,126 | 328,942 | 323,232 | 325,533 | 328,489 | 325,573 | 326,765 |
Outlying areas | ||||||||
Puerto Rico | 1,038,677 | 1,060,490 | 1,054,463 | 1,033,362 | 982,188 | 958,477 | 922,869 | 1,021,089 |
Other | 1,895,607 | 342,370 | 337,686 | 342,774 | 366,392 | 368,351 | 361,849 | 377,459 |
American Samoa | 10,086 | 10,806 | 13,646 | 12,282 | 12,204 | 12,004 | 8,858 | 11,244 |
Guam | 75,418 | 76,774 | 81,135 | 80,393 | 80,727 | 82,070 | 79,603 | 77,792 |
Northern Mariana Islands | 9,113 | 13,319 | 15,362 | 19,802 | 13,774 | 14,309 | 13,328 | 15,719 |
U.S. Virgin Islands | 43,039 | 43,197 | 43,928 | 35,418 | 40,115 | 39,815 | 38,389 | 42,342 |
Unknown residence | 1,757,951 | 198,274 | 183,615 | 194,879 | 219,572 | 220,153 | 221,671 | 230,362 |
SOURCE: Author's calculations based on MGD files. | ||||||||
a. Most state assignments are based on end-of-year residence obtained from electronically filed employer wage reports; the remainder are based on location of employer from reports filed on paper. |
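Because the MGD files cover the full population of workers, the "Other" row in Table 13 can be broken into its components. As a small consistency check (the variable names are illustrative), the sketch below sums the tax year 2014 components and computes the share attributable to workers with unknown residence.

```python
# Table 13, tax year 2014: components of the "Other" row
other_components_2014 = {
    "American Samoa": 10_086,
    "Guam": 75_418,
    "Northern Mariana Islands": 9_113,
    "U.S. Virgin Islands": 43_039,
    "Unknown residence": 1_757_951,
}

other_total_2014 = sum(other_components_2014.values())
unknown_share_2014 = other_components_2014["Unknown residence"] / other_total_2014 * 100

print(other_total_2014)              # 1895607, matching the "Other" row
print(round(unknown_share_2014, 1))  # share of "Other" with unknown residence
```

The components sum exactly to the "Other" row, and workers with unknown residence make up more than nine-tenths of that row for 2014.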
State or area a | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 |
---|---|---|---|---|---|---|---|---|
Alabama | -1.47 | -0.78 | -0.51 | -0.48 | -0.77 | -0.45 | -0.29 | -0.77 |
Alaska | -5.08 | -2.66 | -3.13 | -1.52 | -3.32 | -3.26 | -3.17 | -3.55 |
Arizona | -1.49 | 0.24 | -0.51 | -0.76 | -0.97 | -0.33 | 0.06 | -0.27 |
Arkansas | -3.18 | -2.10 | -2.07 | -1.89 | -2.18 | -2.21 | -1.57 | -1.22 |
California | 0.45 | 1.80 | 2.13 | 2.17 | 1.91 | 1.80 | 2.23 | 2.22 |
Colorado | 0.71 | 0.05 | -0.09 | 0.19 | 0.07 | 0.08 | 0.53 | -0.14 |
Connecticut | -4.28 | -2.40 | -5.10 | -2.39 | -2.71 | -2.48 | -1.81 | -2.02 |
Delaware | -3.23 | -0.75 | -0.67 | -1.44 | -0.58 | -0.35 | 0.51 | 0.66 |
District of Columbia | -4.01 | -1.13 | -2.51 | -2.07 | -3.52 | -3.06 | -1.41 | -1.09 |
Florida | -1.61 | -0.09 | -0.10 | -0.14 | -0.27 | -0.16 | 0.23 | 0.05 |
Georgia | -2.04 | -0.62 | -1.32 | -1.10 | -0.95 | -0.92 | -0.31 | -0.34 |
Hawaii | -2.47 | -1.41 | -1.78 | -1.07 | -1.14 | -1.98 | -1.58 | -0.21 |
Idaho | -0.67 | -0.75 | 0.10 | -0.51 | -5.77 | -7.50 | -6.30 | -6.54 |
Illinois | -2.21 | -0.77 | -1.16 | -0.69 | -1.07 | -0.70 | -0.28 | -0.39 |
Indiana | -2.09 | 0.06 | 0.17 | 0.14 | -0.36 | -0.16 | 0.37 | 0.28 |
Iowa | -2.47 | -0.80 | -0.77 | -0.70 | -1.38 | -1.72 | -1.31 | -1.38 |
Kansas | -0.93 | 0.03 | -0.15 | 0.26 | -0.06 | -0.50 | 0.01 | 0.12 |
Kentucky | -2.80 | -1.33 | -1.22 | -1.14 | -1.44 | -1.39 | -0.69 | -0.53 |
Louisiana | -2.48 | -1.45 | -1.37 | -1.43 | -1.88 | -1.63 | -1.79 | -1.21 |
Maine | -3.52 | -0.46 | -0.22 | -0.47 | -0.98 | -0.26 | 0.42 | 0.86 |
Maryland | -2.22 | -0.80 | -0.77 | -0.52 | -0.94 | -0.82 | -0.21 | -0.67 |
Massachusetts | -0.73 | 0.36 | 0.25 | 0.31 | 0.42 | 0.14 | 0.13 | 0.25 |
Michigan | -2.33 | -1.15 | -1.46 | -1.36 | -0.93 | -0.94 | -0.36 | -0.74 |
Minnesota | -2.06 | -0.47 | -0.46 | -0.48 | -0.47 | -0.41 | 0.32 | 0.05 |
Mississippi | -2.05 | -0.28 | 0.05 | 0.18 | -0.42 | -0.39 | 0.37 | 0.47 |
Missouri | -0.52 | 0.35 | 0.20 | 0.39 | 0.16 | 0.04 | 0.83 | 0.82 |
Montana | -3.77 | -2.19 | -6.77 | -11.88 | -12.63 | -10.28 | -9.26 | -10.31 |
Nebraska | -4.16 | -2.92 | -4.43 | -3.13 | -3.91 | -2.96 | -2.21 | -3.03 |
Nevada | 1.91 | 1.84 | 1.71 | 2.59 | 1.45 | 1.69 | 1.32 | 1.11 |
New Hampshire | -2.23 | -1.21 | -0.40 | -0.14 | -0.36 | -0.31 | -0.66 | -0.97 |
New Jersey | -1.18 | 0.06 | -0.04 | 0.32 | -0.06 | 0.20 | 0.84 | 1.34 |
New Mexico | 0.67 | 1.94 | 0.15 | 0.31 | -0.24 | 0.64 | 0.97 | 0.54 |
New York | -1.29 | 0.01 | 0.15 | 0.27 | 0.19 | 0.45 | 1.21 | 1.27 |
North Carolina | -2.00 | -1.16 | -1.03 | -0.89 | -1.06 | -0.96 | -0.23 | -0.60 |
North Dakota | -10.49 | -2.49 | -1.57 | -0.81 | -0.93 | -0.71 | 1.92 | 2.24 |
Ohio | -2.47 | -1.43 | -1.65 | -1.50 | -1.55 | -1.71 | -1.24 | -1.40 |
Oklahoma | -3.07 | -1.91 | -2.23 | -1.67 | -1.94 | -1.91 | -1.26 | -0.90 |
Oregon | -1.81 | 0.15 | 0.09 | 0.20 | 0.05 | 0.22 | 1.23 | 0.64 |
Pennsylvania | -2.08 | -1.16 | -1.35 | -1.04 | -1.04 | -1.07 | -0.56 | -0.47 |
Rhode Island | -4.61 | -3.50 | -3.25 | -1.97 | -1.48 | -2.05 | -1.25 | -1.42 |
South Carolina | -1.52 | -0.43 | -0.73 | -0.75 | -0.64 | -0.58 | -0.28 | -0.55 |
South Dakota | -29.79 | -16.07 | -15.86 | -14.16 | -15.75 | -12.73 | -9.94 | -9.51 |
Tennessee | -2.76 | -1.45 | -1.53 | -1.26 | -1.20 | -0.91 | -0.67 | -0.85 |
Texas | -1.28 | 0.00 | 0.17 | 0.15 | 0.20 | 0.42 | 0.88 | 0.85 |
Utah | -0.32 | 0.74 | 0.26 | 0.40 | 0.17 | 0.28 | 0.65 | 0.50 |
Vermont | -5.38 | -4.04 | -4.38 | -3.08 | -4.72 | -3.39 | -2.24 | -1.69 |
Virginia | -2.78 | -0.29 | -0.19 | 0.07 | -0.14 | -0.18 | 0.31 | 0.38 |
Washington | -0.86 | 0.03 | 0.01 | 0.46 | -0.06 | 0.44 | 0.69 | 0.93 |
West Virginia | -3.22 | -1.37 | -0.70 | 0.06 | -0.29 | 0.15 | 0.36 | -0.12 |
Wisconsin | -1.75 | -0.23 | -0.10 | 0.25 | 0.06 | 0.28 | 0.81 | 0.55 |
Wyoming | -3.43 | -0.55 | -2.15 | -2.40 | -4.14 | -12.33 | -10.57 | -12.31 |
Outlying areas | ||||||||
Puerto Rico | -9.27 | -5.61 | -13.04 | -6.64 | -11.49 | -9.76 | -13.13 | -6.65 |
Other and unknown b, c | 52.89 | -81.38 | -81.53 | -73.29 | -65.67 | -62.07 | -64.16 | -52.07 |
SOURCE: Annual Statistical Supplement to the Social Security Bulletin, 2014–2021 editions, Table 4.B12. | ||||||||
a. Most state assignments are based on end-of-year residence obtained from electronically filed employer wage reports; the remainder are based on location of employer from reports filed on paper. | ||||||||
b. Persons employed in American Samoa, Guam, Northern Mariana Islands, and U.S. Virgin Islands; U.S. citizens employed abroad by U.S. employers; persons employed on U.S. oceanborne vessels; and workers with unknown residence. | ||||||||
c. Compares the "Other and unknown" row in Table 12 with the "Other" row in Table 13. |
Worker Count Estimates by County
Estimating the number of workers by county is much more complicated than preparing state-level estimates given the sheer number of calculations and the data disclosure restrictions that apply for low-population counties. In Table 6 of Earnings and Employment, ORES publishes estimated counts of Medicare-covered workers by state and county. For the 2021 edition, ORES computed estimates for each of the 50 states and Puerto Rico, and for each of the 3,225 U.S. counties or county equivalents represented in the 2021 MGD file, or 3,276 (3,225 plus 51) computations for each of nine categories of workers: all, male, and female earners with any, wage and salary, and self-employment Medicare-covered income. Thus, ORES computed 29,484 worker count estimates (3,276 × 9) for the 2021 edition of Earnings and Employment Table 6.
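The count of estimates described above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the category labels are hypothetical names for the nine groups the text lists, and only the figures 3,225, 51, and 9 come from the note.

```python
from itertools import product

# Category labels are illustrative; the note names the nine groups only in prose.
SEXES = ("all", "male", "female")
EARNINGS_TYPES = ("any", "wage_and_salary", "self_employment")

COUNTIES = 3225          # counties or county equivalents in the 2021 MGD file
STATE_LEVEL_AREAS = 51   # 50 states plus Puerto Rico

# Every sex grouping is crossed with every earnings-type grouping.
categories = list(product(SEXES, EARNINGS_TYPES))
geographies = COUNTIES + STATE_LEVEL_AREAS
total_estimates = geographies * len(categories)

print(len(categories), geographies, total_estimates)  # 9 3276 29484
```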
However, many of those computations were not published. Instead, they were suppressed because of data nondisclosure requirements. Primary cell suppression applies to any estimates based on unweighted counts that do not meet the disclosure threshold. Thus, for any county with fewer total workers than the disclosure threshold, the estimates for all nine categories of workers must be suppressed. Secondary cell suppression arises when an estimated value that is below the disclosure threshold can be inferred based on other estimates, as often occurs with estimates broken down by sex. In other words, if the unweighted counts of either male or female workers in a county do not meet or exceed the disclosure threshold, both estimates must be suppressed. Cell suppression is common for county-level estimates of self-employed individuals by sex, as unweighted counts often do not meet the disclosure thresholds. Secondary cell suppression is thus also required for estimated counts of wage and salary workers in those counties, because if those figures were disclosed, they could be subtracted from the all-workers figures to determine the self-employed individual counts.
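The interaction of primary and secondary suppression can be sketched for a single county as follows. This is a minimal illustration of the rules as described in the text, not the production procedure: the threshold value, the function name, and the cell labels are all assumptions, because the note does not state the actual disclosure threshold or how the rules are implemented.

```python
EARNINGS_TYPES = ("any", "wage_and_salary", "self_employment")
SEXES = ("all", "male", "female")

# Hypothetical value: the note does not state the actual disclosure threshold.
THRESHOLD = 10


def suppressed_cells(counts, threshold=THRESHOLD):
    """Return the (earnings_type, sex) cells to withhold for one county.

    counts maps each of the nine (earnings_type, sex) pairs to its
    unweighted worker count.
    """
    # A county whose total worker count is below the threshold loses all cells.
    if counts[("any", "all")] < threshold:
        return set(counts)

    # Primary suppression: every cell that is itself below the threshold.
    hidden = {cell for cell, n in counts.items() if n < threshold}

    # Secondary suppression by sex: if either the male or the female cell
    # is withheld, the other could be inferred, so both are withheld.
    for earnings in EARNINGS_TYPES:
        pair = {(earnings, "male"), (earnings, "female")}
        if hidden & pair:
            hidden |= pair

    # Secondary suppression by earnings type: wage and salary cells and
    # self-employment cells are withheld together.
    for sex in SEXES:
        pair = {("wage_and_salary", sex), ("self_employment", sex)}
        if hidden & pair:
            hidden |= pair

    return hidden


# Hypothetical county: only the count of self-employed women is below threshold.
county = {
    ("any", "all"): 500, ("any", "male"): 260, ("any", "female"): 240,
    ("wage_and_salary", "all"): 480,
    ("wage_and_salary", "male"): 250,
    ("wage_and_salary", "female"): 230,
    ("self_employment", "all"): 30,
    ("self_employment", "male"): 22,
    ("self_employment", "female"): 8,
}
print(sorted(suppressed_cells(county)))
```

In this hypothetical county, one small cell (self-employed women) causes four cells to be withheld: both self-employment cells by sex and both wage and salary cells by sex. This illustrates why suppression is so much more common in the by-sex estimates.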
Even though this note focuses on the worker count estimates and puts earnings estimates aside, the number of county-level estimates and the complexity of incorporating data nondisclosure procedures preclude the presentation of county-level estimates in this setting. Instead, I focus on a key question posed by developing a new methodology for generating the annual employment and earnings estimates: What sample size best reduces the effect of the data nondisclosure restrictions?
Effect of Data Nondisclosure Requirements
The unweighted number of self-employed individuals in a 1 percent sample of SSNs is too low to generate viable estimates for all U.S. counties. Using the 2021 MGD file, I compare the effects of using a 1 percent sample, a 10 percent sample, or a full population of workers on the number of county-level estimates that must be suppressed to comply with data nondisclosure rules.13
The process begins by assigning a randomly generated number from 1 to 100 to the record of each worker in the MGD file. Records for workers assigned a 1 are selected for the 1 percent sample and those assigned 1–10 are selected for the 10 percent sample. The next step identifies all the workers whose records have a valid SSN, who have an identifier of male or female, and whose indicated age is within 1–99. For these samples I compute two separate county-level worker count estimates for all workers, wage and salary workers, and self-employed individuals. The two estimates are (1) for the total number of workers in the county (that is, regardless of sex) and (2) for the male and female workers in the county (that is, workers by sex).
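The selection step described above can be sketched as follows. This is an illustration under stated assumptions rather than the actual MGD code: the function names, the record field names (`ssn_valid`, `sex`, `age`), the sex codes, and the fixed seed are all hypothetical, and the sketch uses synthetic record identifiers rather than worker data.

```python
import random


def assign_selection_numbers(record_ids, seed=0):
    """Give every record a uniformly drawn integer from 1 to 100."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return {rid: rng.randint(1, 100) for rid in record_ids}


def select_sample(numbers, percent):
    """Keep records numbered 1..percent (1 -> 1 percent, 10 -> 10 percent)."""
    return {rid for rid, n in numbers.items() if n <= percent}


def is_eligible(record):
    """Screening step: valid SSN, sex recorded as male or female, age 1-99."""
    return (
        record["ssn_valid"]
        and record["sex"] in ("M", "F")
        and 1 <= record["age"] <= 99
    )


# Synthetic identifiers stand in for worker records.
numbers = assign_selection_numbers(range(100_000))
one_percent = select_sample(numbers, 1)
ten_percent = select_sample(numbers, 10)
print(len(one_percent), len(ten_percent))
```

Because both samples are drawn from the same assigned numbers, every record in the 1 percent sample also appears in the 10 percent sample.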
For county-level estimates of all workers (that is, regardless of earnings type), the data nondisclosure rules are straightforward: If the unweighted count of all workers in the county is below the disclosure threshold, the estimates will be suppressed. However, secondary cell suppression rules must be applied to the estimates for wage and salary workers and self-employed individuals. If the estimated number of wage and salary workers in a county must be suppressed, then the corresponding estimate for self-employed individuals must also be suppressed, and vice versa. Dividing the unweighted counts of workers both by sex and by earnings type multiplies the number of counties affected by the nondisclosure rules.
The total number of workers in the 1 percent sample is 1,801,744 and the total number of U.S. counties in the MGD file for the entire population of workers is 3,225 (Table 15). Two counties are omitted from the 1 percent sample for as-yet unresolved discrepancies in the underlying geographic data. For counts of all workers—combining both earnings types and both sexes—the estimates for 100 counties (or 3.1 percent of U.S. counties) would have to be suppressed in the 1 percent sample. For all wage and salary workers, the estimates for 111 counties (3.4 percent of U.S. counties) would have to be suppressed. For all self-employed individuals, however, the estimates for significantly more counties—1,170, or 36.3 percent of U.S. counties—would need to be suppressed. Thus, given secondary cell suppression rules, using the 1 percent sample would require the corresponding estimates for wage and salary workers also to be suppressed.
Measure | All workers a | Wage and salary workers | Self-employed individuals | |||
---|---|---|---|---|---|---|
Number | Percent | Number | Percent | Number | Percent | |
Total U.S. counties | 3,225 | 100.0 | 3,225 | 100.0 | 3,225 | 100.0 |
1 percent sample (similar to that currently used for statistical publications) | ||||||
Counties in sample | 3,223 | 100.0 | 3,222 | 100.0 | 3,164 | 100.0 |
With total worker estimates (men and women combined)— | ||||||
Published | 3,123 | 96.9 | 3,111 | 96.6 | 1,994 | 63.0 |
Suppressed b | 100 | 3.1 | 111 | 3.4 | 1,170 | 36.3 |
With worker estimates by sex (men only or women only)— | ||||||
Published | 2,961 | 91.9 | 2,929 | 90.9 | 1,316 | 41.6 |
Suppressed b | 262 | 8.1 | 293 | 9.1 | 1,848 | 58.4 |
Workers represented | 1,801,744 | . . . | 1,693,573 | . . . | 207,582 | . . . |
10 percent sample (option available in MGD process) | ||||||
Counties in sample | 3,225 | 100.0 | 3,225 | 100.0 | 3,224 | 100.0 |
With total worker estimates (men and women combined)— | ||||||
Published | 3,223 | 99.9 | 3,223 | 99.9 | 3,187 | 98.9 |
Suppressed b | 2 | 0.1 | 2 | 0.1 | 37 | 1.1 |
With worker estimates by sex (men only or women only)— | ||||||
Published | 3,221 | 99.9 | 3,219 | 99.8 | 3,087 | 95.8 |
Suppressed b | 4 | 0.1 | 6 | 0.2 | 137 | 4.2 |
Workers represented | 18,015,649 | . . . | 16,923,540 | . . . | 2,080,712 | . . . |
Full worker population (option available in MGD process) | ||||||
Counties in sample | 3,225 | 100.0 | 3,225 | 100.0 | 3,224 | 100.0 |
With total worker estimates (men and women combined)— | ||||||
Published | 3,225 | 100.0 | 3,223 | 99.9 | 3,223 | 100.0 |
Suppressed b | 0 | 0.0 | 2 | 0.1 | 1 | 0.0 |
With worker estimates by sex (men only or women only)— | ||||||
Published | 3,225 | 100.0 | 3,219 | 99.8 | 3,221 | 99.9 |
Suppressed b | 0 | 0.0 | 6 | 0.2 | 3 | 0.1 |
Workers represented | 180,166,715 | . . . | 169,341,674 | . . . | 20,805,637 | . . . |
SOURCE: Author's calculations using the 2021 MGD file. | ||||||
NOTES: Includes U.S. territories. | ||||||
. . . = not applicable. | ||||||
a. Workers with earnings from both wage and salary employment and self-employment are counted in each type of earnings but only once in the total. | ||||||
b. Values are for primary suppression only. Because of secondary suppression requirements, the actual number of suppressed estimates for both wage and salary workers and self-employed individuals would be equal to the higher of those two values. |
When breaking down the estimated worker counts by sex, the number of counties requiring cell suppression increases significantly across all three earnings-type categories. For the counts of all male workers and of all female workers, the estimates for 8.1 percent of counties would need to be suppressed. For male and female wage and salary workers, the estimates for 9.1 percent of U.S. counties would require suppression. However, in computing separate estimated numbers of self-employed individuals by sex, the estimates for a majority (58.4 percent) of U.S. counties would require suppression, which in turn would require the secondary suppression of the corresponding cells for wage and salary workers by sex in those counties. Because the 2021 edition of Earnings and Employment also used a 1 percent sample as the basis of its estimates, the suppression rate for the published tables was close to 60 percent.
The 10 percent sample of the MGD file contains records for 18,015,649 workers. The tenfold increase in the sample size dramatically reduces the number of county-level estimates that require suppression. For counts of all workers (combining both earnings types and both sexes), the number of counties requiring suppressed estimates decreases from 100 for the 1 percent sample to 2 for the 10 percent sample. For the counts of all workers (combining both earnings types) with breakdowns by sex, the number of counties requiring suppressed estimates decreases from 262 to 4. For wage and salary workers, the number of county-level estimates of combined male and female workers that must be suppressed decreases from 111 to 2 and the number of estimates by sex decreases from 293 to 6. In moving from the 1 percent sample to the 10 percent sample, the largest decrease in cell suppression occurs for self-employed individuals. For both sexes combined, the number of suppressed county-level estimates drops from 1,170 to 37, and for estimates by sex, it drops from 1,848 to 137. In percentage terms, the share of counties whose self-employment estimates for men and women combined would require suppression drops from 36.3 percent to 1.1 percent. For the same estimates broken down by sex, the suppression rate would decrease from 58.4 percent to 4.2 percent.
The full worker population MGD file for tax year 2021 contains records for 180,166,715 workers. Using the full population of workers in the MGD file would limit the number of county-level worker count estimates needing to be suppressed to 6 in the entire Earnings and Employment publication.
Potential Addition of Maps and New Tables to the Statistical Publications
In addition to replacing the CWHS 1 percent sample with the MGD file for the entire population of workers for its statistical publications, ORES is considering the addition of charts to provide visualizations of geographic earnings and employment distributions and new tables to provide further insights on the U.S. labor force.
To illustrate how maps could enhance the content of the publications, Chart 1 provides a graphic presentation of the statistics shown in Table 15, with separate panels for each MGD sample size. Although Table 15 showed that the estimates for 58 percent of the U.S. counties require suppression under a 1 percent sample, Panel A provides a visualization that highlights the prevalence and the geographic patterns of the suppression. Panel B shows that replacing the 1 percent sample with a 10 percent sample would dramatically reduce the number of suppressed county-level estimates. Panel C reveals that using the 2021 MGD full population of workers would allow the removal of nearly all county-level publication restrictions. These maps vividly display the stark contrasts in cell suppression between the three sample sizes.
Chart 2 shows states and counties grouped by worker population-size quintiles. The maps provide a visual perspective on the geographic distribution of worker counts that the statistical tables cannot provide. Chart 2 features quintiles only to illustrate how maps could contribute to ORES's presentation of statistical data. The final version of this sort of map might provide alternative representations of worker counts (such as quartiles or deciles) with which to highlight the key differences across states and counties.
ORES is also considering adding maps for each state that would provide county-level detail. Chart 3 presents three examples using worker-count quintile groupings. Note that where Chart 2 arranged states and counties by quintile according to their national rankings, each Chart 3 panel arranges the counties by their quintile rankings within their state.
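The difference between national and within-state quintile rankings can be sketched as follows. The county names and worker counts are invented for illustration, and the grouping rule (equal numbers of counties per quintile, ordered by worker count) is an assumption; the note does not specify how the chart quintiles are constructed.

```python
from collections import defaultdict


def quintiles(counts):
    """Map each key to a quintile from 1 (smallest counts) to 5 (largest).

    Keys are ranked by count and split into five groups of near-equal size;
    ties keep their original order.
    """
    ordered = sorted(counts, key=counts.get)
    n = len(ordered)
    return {key: (rank * 5) // n + 1 for rank, key in enumerate(ordered)}


def quintiles_within_state(counts):
    """Rank each county only against the other counties in its state."""
    by_state = defaultdict(dict)
    for (state, county), workers in counts.items():
        by_state[state][(state, county)] = workers
    result = {}
    for state_counts in by_state.values():
        result.update(quintiles(state_counts))
    return result


# Invented worker counts keyed by (state, county).
workers = {
    ("A", "a1"): 100, ("A", "a2"): 200, ("A", "a3"): 300,
    ("A", "a4"): 400, ("A", "a5"): 500,
    ("B", "b1"): 1000, ("B", "b2"): 2000, ("B", "b3"): 3000,
    ("B", "b4"): 4000, ("B", "b5"): 5000,
}
national = quintiles(workers)
within = quintiles_within_state(workers)
print(national[("A", "a5")], within[("A", "a5")])  # 3 5
```

In this invented example, the largest county in the small state falls in the middle quintile nationally but in the top quintile within its own state, which is the contrast between the Chart 2 and Chart 3 groupings.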
The MGD file also enables ORES to consider adding new tables on earnings and employment to the existing annual statistical publications. For example, Table 16 presents the percentage distribution of workers by sex in each tax year 2014–2021, shown separately for all, wage and salary, and self-employed individuals. Among all workers, the share who are women increased slightly, from 48.4 percent to 48.8 percent, from tax years 2014 to 2021. The share of wage and salary workers who are women likewise rose, from 48.8 percent to 49.1 percent. By far, the largest increase in the female share of workers occurred among the self-employed, from 43.6 percent in 2014 to 45.2 percent in 2021. Yet despite this increase, men represent a disproportionate share of self-employed individuals, especially relative to the split for wage and salary workers.
Tax year | Total | Men | Women | |||
---|---|---|---|---|---|---|
Number | Percent | Number | Percent | Number | Percent | |
All workers a | ||||||
2014 | 167,874,533 | 100.0 | 86,554,357 | 51.6 | 81,320,176 | 48.4 |
2015 | 171,534,731 | 100.0 | 88,462,230 | 51.6 | 83,072,501 | 48.4 |
2016 | 174,168,264 | 100.0 | 89,664,098 | 51.5 | 84,504,166 | 48.5 |
2017 | 176,311,525 | 100.0 | 90,686,951 | 51.4 | 85,624,574 | 48.6 |
2018 | 178,527,817 | 100.0 | 91,677,783 | 51.4 | 86,850,034 | 48.6 |
2019 | 180,042,880 | 100.0 | 92,247,594 | 51.2 | 87,795,286 | 48.8 |
2020 | 178,499,006 | 100.0 | 91,332,698 | 51.2 | 87,166,308 | 48.8 |
2021 | 180,313,849 | 100.0 | 92,260,151 | 51.2 | 88,053,698 | 48.8 |
Wage and salary | ||||||
2014 | 158,682,500 | 100.0 | 81,255,302 | 51.2 | 77,427,198 | 48.8 |
2015 | 161,480,185 | 100.0 | 82,664,297 | 51.2 | 78,815,888 | 48.8 |
2016 | 164,171,152 | 100.0 | 83,907,441 | 51.1 | 80,263,711 | 48.9 |
2017 | 166,165,895 | 100.0 | 84,850,288 | 51.1 | 81,315,607 | 48.9 |
2018 | 168,335,976 | 100.0 | 85,830,038 | 51.0 | 82,505,938 | 49.0 |
2019 | 170,228,900 | 100.0 | 86,672,090 | 50.9 | 83,556,810 | 49.1 |
2020 | 168,662,331 | 100.0 | 85,777,999 | 50.9 | 82,884,332 | 49.1 |
2021 | 169,486,721 | 100.0 | 86,192,409 | 50.9 | 83,294,312 | 49.1 |
Self-employed | ||||||
2014 | 17,284,289 | 100.0 | 9,753,284 | 56.4 | 7,531,005 | 43.6 |
2015 | 19,068,333 | 100.0 | 10,762,981 | 56.4 | 8,305,352 | 43.6 |
2016 | 19,193,168 | 100.0 | 10,807,316 | 56.3 | 8,385,852 | 43.7 |
2017 | 19,432,607 | 100.0 | 10,898,755 | 56.1 | 8,533,852 | 43.9 |
2018 | 19,628,293 | 100.0 | 10,945,950 | 55.8 | 8,682,343 | 44.2 |
2019 | 19,028,561 | 100.0 | 10,496,410 | 55.2 | 8,532,151 | 44.8 |
2020 | 18,552,136 | 100.0 | 10,230,550 | 55.1 | 8,321,586 | 44.9 |
2021 | 20,808,898 | 100.0 | 11,393,362 | 54.8 | 9,415,536 | 45.2 |
SOURCE: Author's calculations using the 2021 MGD file. | ||||||
a. Workers with earnings from both wage and salary employment and self-employment are counted in each type of earnings but only once under "all workers." |
ORES may also add Table 17 to one of its annual publications. It highlights the year-over-year changes in worker counts by sex and earnings type for tax years 2014 to 2021. The apparent increase from 2014 to 2015 was larger for both sexes and both earnings types than those for every other year (except self-employed individuals from 2020 to 2021). This may reflect an irregularity with the tax year 2014 MGD file. The rates of increase in the numbers of all workers and wage and salary workers for 2015–2021 seem to be reasonable. The relatively large decreases in the numbers of self-employed individuals in 2019 and 2020 suggest that COVID-19 had a larger effect on the self-employed than on wage and salary workers. Alternatively, it may be that the pandemic's disruption of IRS workflows, which caused a backlog for about 2 years, disproportionately affected Schedule SE tax returns. Thus, the sharp increase in the count of self-employed individuals in 2021, after 2 years of declines, seemingly indicates that the IRS succeeded in reducing much of the Schedule SE backlog in 2021.
Tax year | Total | Men | Women | |||
---|---|---|---|---|---|---|
Number | Percent change from previous year | Number | Percent change from previous year | Number | Percent change from previous year | |
All workers a | ||||||
2014 | 167,874,533 | . . . | 86,554,357 | . . . | 81,320,176 | . . . |
2015 | 171,534,731 | 2.2 | 88,462,230 | 2.2 | 83,072,501 | 2.2 |
2016 | 174,168,264 | 1.5 | 89,664,098 | 1.4 | 84,504,166 | 1.7 |
2017 | 176,311,525 | 1.2 | 90,686,951 | 1.1 | 85,624,574 | 1.3 |
2018 | 178,527,817 | 1.3 | 91,677,783 | 1.1 | 86,850,034 | 1.4 |
2019 | 180,042,880 | 0.8 | 92,247,594 | 0.6 | 87,795,286 | 1.1 |
2020 | 178,499,006 | -0.9 | 91,332,698 | -1.0 | 87,166,308 | -0.7 |
2021 | 180,313,849 | 1.0 | 92,260,151 | 1.0 | 88,053,698 | 1.0 |
Wage and salary | ||||||
2014 | 158,682,500 | . . . | 81,255,302 | . . . | 77,427,198 | . . . |
2015 | 161,480,185 | 1.8 | 82,664,297 | 1.7 | 78,815,888 | 1.8 |
2016 | 164,171,152 | 1.7 | 83,907,441 | 1.5 | 80,263,711 | 1.8 |
2017 | 166,165,895 | 1.2 | 84,850,288 | 1.1 | 81,315,607 | 1.3 |
2018 | 168,335,976 | 1.3 | 85,830,038 | 1.2 | 82,505,938 | 1.5 |
2019 | 170,228,900 | 1.1 | 86,672,090 | 1.0 | 83,556,810 | 1.3 |
2020 | 168,662,331 | -0.9 | 85,777,999 | -1.0 | 82,884,332 | -0.8 |
2021 | 169,486,721 | 0.5 | 86,192,409 | 0.5 | 83,294,312 | 0.5 |
Self-employed | ||||||
2014 | 17,284,289 | . . . | 9,753,284 | . . . | 7,531,005 | . . . |
2015 | 19,068,333 | 10.3 | 10,762,981 | 10.4 | 8,305,352 | 10.3 |
2016 | 19,193,168 | 0.7 | 10,807,316 | 0.4 | 8,385,852 | 1.0 |
2017 | 19,432,607 | 1.2 | 10,898,755 | 0.8 | 8,533,852 | 1.8 |
2018 | 19,628,293 | 1.0 | 10,945,950 | 0.4 | 8,682,343 | 1.7 |
2019 | 19,028,561 | -3.1 | 10,496,410 | -4.1 | 8,532,151 | -1.7 |
2020 | 18,552,136 | -2.5 | 10,230,550 | -2.5 | 8,321,586 | -2.5 |
2021 | 20,808,898 | 12.2 | 11,393,362 | 11.4 | 9,415,536 | 13.1 |
SOURCE: Author's calculations using the 2021 MGD file. | ||||||
NOTE: . . . = not applicable. | ||||||
a. Workers with earnings from both wage and salary employment and self-employment are counted in each type of earnings but only once under "all workers." |
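The percent changes in Table 17 follow directly from the annual counts. The sketch below recomputes the total self-employed series using the counts shown in the table; the function name and the rounding to one decimal place are illustrative choices that match the table's presentation.

```python
# Total self-employed individuals by tax year, as shown in Table 17.
SELF_EMPLOYED = {
    2014: 17_284_289,
    2015: 19_068_333,
    2016: 19_193_168,
    2017: 19_432_607,
    2018: 19_628_293,
    2019: 19_028_561,
    2020: 18_552_136,
    2021: 20_808_898,
}


def percent_changes(series):
    """Return {year: percent change from the previous year}, to 1 decimal."""
    years = sorted(series)
    return {
        cur: round(100 * (series[cur] - series[prev]) / series[prev], 1)
        for prev, cur in zip(years, years[1:])
    }


changes = percent_changes(SELF_EMPLOYED)
for year, change in changes.items():
    print(year, change)
```

The first year in the series has no prior year to compare against, which is why the 2014 rows in the table are marked as not applicable.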
Additional Detail in Earnings-Type Categories
As noted earlier, the MGD files identify a new earnings-type subcategory. The statistical publications cover two earnings types (wage and salary income and self-employment income) and include separate computations for all workers combined. Because workers may have both types of earnings in a year, the sum of wage and salary workers and self-employed individuals exceeds the all-workers figure in the statistical publications. To eliminate this overlap, the MGD file sorts worker records among three mutually exclusive earnings-type categories: wages and salary only, self-employment income only, and both types of earnings (so-called combination workers).
Table 18 shows the number and percentage distribution of workers for each of the mutually exclusive earnings-type categories from 2014 to 2021. It shows a slight increase in the percentage of women reporting only wage and salary earnings, from 49.0 percent in 2014 to 49.3 percent in 2021. The increase in the percentage of women reporting self-employment income only is greater, from 42.4 percent in 2014 to 44.0 percent in 2021. The percentage of women with both wage and salary and self-employment income rose from 45.0 percent in 2014 to 46.6 percent in 2021, mirroring the 1.6 percentage-point increase in women with self-employment income only. This table exemplifies the sort of new insights on labor force dynamics that can be added to the existing statistical publications. ORES is exploring similar expansions of coverage by age and state and the addition of tables showing estimated mean and median earnings amounts.
Tax year | Total | Men | Women | |||
---|---|---|---|---|---|---|
Number | Percent | Number | Percent | Number | Percent | |
Wage and salary only | ||||||
2014 | 150,590,244 | 100.0 | 76,801,073 | 51.0 | 73,789,171 | 49.0 |
2015 | 152,466,398 | 100.0 | 77,699,249 | 51.0 | 74,767,149 | 49.0 |
2016 | 154,975,096 | 100.0 | 78,856,782 | 50.9 | 76,118,314 | 49.1 |
2017 | 156,878,918 | 100.0 | 79,788,196 | 50.9 | 77,090,722 | 49.1 |
2018 | 158,899,524 | 100.0 | 80,731,833 | 50.8 | 78,167,691 | 49.2 |
2019 | 161,014,319 | 100.0 | 81,751,184 | 50.8 | 79,263,135 | 49.2 |
2020 | 159,946,870 | 100.0 | 81,102,148 | 50.7 | 78,844,722 | 49.3 |
2021 | 159,504,951 | 100.0 | 80,866,789 | 50.7 | 78,638,162 | 49.3 |
Self-employed only | ||||||
2014 | 9,192,033 | 100.0 | 5,299,055 | 57.6 | 3,892,978 | 42.4 |
2015 | 10,054,546 | 100.0 | 5,797,933 | 57.7 | 4,256,613 | 42.3 |
2016 | 9,997,112 | 100.0 | 5,756,657 | 57.6 | 4,240,455 | 42.4 |
2017 | 10,145,630 | 100.0 | 5,836,663 | 57.5 | 4,308,967 | 42.5 |
2018 | 10,191,841 | 100.0 | 5,847,745 | 57.4 | 4,344,096 | 42.6 |
2019 | 9,813,980 | 100.0 | 5,575,504 | 56.8 | 4,238,476 | 43.2 |
2020 | 9,836,675 | 100.0 | 5,554,699 | 56.5 | 4,281,976 | 43.5 |
2021 | 10,827,128 | 100.0 | 6,067,742 | 56.0 | 4,759,386 | 44.0 |
Combination | ||||||
2014 | 8,092,256 | 100.0 | 4,454,229 | 55.0 | 3,638,027 | 45.0 |
2015 | 9,013,787 | 100.0 | 4,965,048 | 55.1 | 4,048,739 | 44.9 |
2016 | 9,196,056 | 100.0 | 5,050,659 | 54.9 | 4,145,397 | 45.1 |
2017 | 9,286,977 | 100.0 | 5,062,092 | 54.5 | 4,224,885 | 45.5 |
2018 | 9,436,452 | 100.0 | 5,098,205 | 54.0 | 4,338,247 | 46.0 |
2019 | 9,214,581 | 100.0 | 4,920,906 | 53.4 | 4,293,675 | 46.6 |
2020 | 8,715,461 | 100.0 | 4,675,851 | 53.7 | 4,039,610 | 46.3 |
2021 | 9,981,770 | 100.0 | 5,325,620 | 53.4 | 4,656,150 | 46.6 |
SOURCE: Author's calculations using the 2021 MGD file. |
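The relationship between the mutually exclusive categories in Table 18 and the overlapping categories in Table 16 can be checked with simple arithmetic. The sketch below uses the tax year 2021 totals from those tables; the function and variable names are illustrative and do not describe how the MGD file itself stores earnings types.

```python
def classify(has_wages, has_self_employment):
    """Assign a worker to one of three mutually exclusive categories."""
    if has_wages and has_self_employment:
        return "combination"
    if has_wages:
        return "wage_and_salary_only"
    if has_self_employment:
        return "self_employed_only"
    raise ValueError("a worker must have at least one type of earnings")


# Tax year 2021 totals from Table 18 (mutually exclusive categories).
wage_only = 159_504_951
self_employed_only = 10_827_128
combination = 9_981_770

# Overlapping categories, as used in the statistical publications.
wage_and_salary = wage_only + combination
self_employed = self_employed_only + combination
all_workers = wage_only + self_employed_only + combination

print(wage_and_salary, self_employed, all_workers)
# Summing the overlapping categories counts combination workers twice.
print(wage_and_salary + self_employed - all_workers == combination)
```

The three derived totals match the 2021 rows of Table 16 (169,486,721 wage and salary workers, 20,808,898 self-employed individuals, and 180,313,849 workers overall).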
Summary
This note presents preliminary worker count estimates from the 2014 to 2021 MGD files and compares them with two benchmark estimates prepared independently by OCACT in support of the annual Trustees Report and by ORES for inclusion in two of its annual statistical publications. The comparisons of the MGD estimates with the benchmarks are broadly encouraging. The estimated numbers of all workers and of wage and salary workers differ only by small percentages. In addition, the MGD estimates of workers with only wage and salary earnings differ little from the OCACT estimates. However, the MGD estimates differ substantially from both benchmarks for self-employed individuals and from the OCACT estimates for workers with only self-employment earnings and for the “combination workers” with both wage and salary and self-employment earnings. These differences indicate a critical need to incorporate nonprimary tax year data (that is, data from tax forms that are processed more than 1 year after the earnings year) into each tax year's MGD file. This need appears to be particularly important for the years immediately following the COVID-19 pandemic, when the IRS experienced very large processing backlogs. For the most part, the comparisons with published state-level estimates were consistent with previous comparisons for state-level estimates for tax year 2017 (Compson 2024). The largest percentage differences generally occur for states or territories with relatively small workforces.
Comparing the MGD estimates against the benchmarks has uncovered some limitations of the MGD files as currently structured. Specifically, the records for some workers identify a state of residence but contain an unknown value for county of residence. ORES is investigating potential methods of obtaining ZIP Code data that would enable the imputation of a valid SCC for these workers.
The MGD file enables ORES to estimate worker counts using person-level microdata for virtually the entire population of workers. This makes it possible for the statistical publications to present worker counts for several U.S. territories for the first time. ORES is also considering the addition of maps and new tables to the publications.
This note shares the results of these preliminary comparisons with researchers, policy analysts, and staff of the various federal agencies that collect and disseminate U.S. labor market data. ORES welcomes feedback on the MGD methodology for assigning state-of-residence and demographic information to worker records and on the preliminary results presented herein.
Notes
1 The Annual Statistical Supplement is available at https://www.ssa.gov/policy/docs/statcomps/supplement/index.html. Earnings and Employment is available at https://www.ssa.gov/policy/docs/statcomps/eedata_sc/index.html.
2 For the purposes of this note, a worker is defined as any individual who had a tax record processed by SSA or the IRS in a given calendar year.
3 ORES welcomes feedback on the MGD process and the estimates it generates at statistics@ssa.gov.
4 Some jobs are not subject to Social Security payroll taxes but virtually all jobs are subject to the Medicare tax.
5 IRS Form W-2 is the annual wage and tax statement that employers file on behalf of employees. Form W-2c, “Corrected Wage and Tax Statement,” is filed when a worker's original W-2 contained any errors or otherwise needs to be updated.
6 The automated process for assigning SCCs based on addresses on tax forms identified a single SCC for at least 94 percent of worker-level records for tax years 2015–2020 but, for as-yet unknown reasons, only 34 percent of the records for tax year 2014.
7 In the statistical publications, the wage and salary category includes individuals who have both wage and salary and self-employment income. Likewise, the self-employed category includes workers with both wage and salary and self-employment income. Consequently, some workers are counted in both categories in the published tables. By contrast, estimated wage and salary earnings amounts are shown only for work in that category, and self-employment earnings amounts are shown only for work while self-employed.
8 Although the MGD file contains records for about 98 percent of U.S. workers each year, ORES experimented with using a 10 percent sample—which would significantly streamline processing—to test the extent to which it could reduce cell suppression.
9 SSA's data files do not accommodate other sex designations.
10 Because the unpublished OCACT estimates do not estimate worker counts by sex, the MGD-process estimates are compared only with those of the statistical publications.
11 Because the unpublished OCACT estimates do not estimate worker counts by age, the MGD-process estimates are compared only with those of the statistical publications.
12 As ORES explores potential methodologies for adding nonprimary tax year data to the MGD process, it must assess how many of the workers with nonprimary tax year data are already in the annual MGD files and whether to cap the number of follow-up years for which it will continue to add nonprimary tax year data.
13 A 100 percent sample would obviously provide the most granular results and minimize cell suppression, but it requires comparatively slow and cumbersome processing. A 10 percent sample might provide most of the advantages of the full population while requiring far fewer data processing resources.
References
Compson, Michael. 2022. “Improving County-Level Earnings Estimates with a New Methodology for Assigning Geographic and Demographic Information for U.S. Workers.” Social Security Bulletin 82(1): 11–28.
———. 2024. “Evaluating a New Process for Assigning Geographic Residence Codes and Identifying Demographic Information for Workers in a Given Tax Year.” Social Security Bulletin 84(1): 1–47.