A New Way to Estimate the Number of Unauthorized Immigrants in the United States
Social Security Bulletin, Vol. 85 No. 2, 2025 (released June 2025)
This article introduces a new method of estimating the number of unauthorized immigrants in the United States by exploiting discrepancies between Current Population Survey (CPS) data and Social Security administrative data on Social Security numbers (SSNs). Potential unauthorized immigrant status is indicated when the SSNs reported by CPS respondents and the SSNs recorded in linked administrative data do not match. We use nonmatching SSN data to identify likely unauthorized immigrants and apply a series of logical adjustments to refine the estimated population counts. The resulting estimates are consistent with those calculated using the residual estimation method, which we described in the first of this group of three related articles. Because the residual method and this new method take entirely different approaches, the similarity of their results is mutually reinforcing.
Robert Gesumaria is a researcher and IT specialist with the Social Security Administration (SSA). Harriet Duleep is a researcher with SSA; a research professor with the Public Policy Program, College of William and Mary; and a research fellow with the Institute for the Study of Labor (IZA) and with the Global Labor Organization. Christopher Tamborini is a researcher and Dave Shoffner is an analyst, also with SSA.
Acknowledgments: We conducted this study under the direction and with the support of Mark J. Warshawsky. We thank Stephanie Myers, Anya Olsen, Mark Sarney, Steve Goss, Tokunbo Oluwole, Ben Danforth, Kent O. Morgan, Steve Robinson, Michael Morris, and Gayle Reznik. We would especially like to thank Mark Regets (Senior Fellow, National Foundation for American Policy, former immigration expert with the National Science Foundation), Robert Warren (Senior Visiting Fellow at the Center for Migration Studies of New York, who served as a demographer for 34 years with the Census Bureau and the former Immigration and Naturalization Service), and Ben Pitkin (editor, Social Security Bulletin).
Contents of this publication are not copyrighted; any items may be reprinted, but citation of the Social Security Bulletin as the source is requested. The findings and conclusions presented in the Bulletin are those of the authors and do not necessarily represent the views of the Social Security Administration.
Introduction
CPS | Current Population Survey |
DHS | Department of Homeland Security |
SSA | Social Security Administration |
SSN | Social Security number |
This article introduces a new method of estimating the unauthorized immigrant population in the United States by exploiting discrepancies between data from the Census Bureau's Current Population Survey (CPS) and administrative records maintained by the Social Security Administration (SSA). In some cases, the Social Security numbers (SSNs) reported by survey respondents do not match the SSNs in the administrative data. Among foreign-born individuals, these SSN nonmatches often indicate unauthorized status. Our new CPS-SSA nonmatch estimation method separately addresses the two causes of nonmatching SSNs: individuals without a valid SSN (Type 1 nonmatches) and individuals who use a valid SSN that belongs to another person (Type 2 nonmatches).
The article comprises six sections, beginning with this introduction. The second section presents background information on the CPS-SSA data linkage. The third and fourth sections respectively describe the circumstances that lead to Type 1 and Type 2 SSN nonmatches, and our method of using each type of mismatched data as a step in the process of estimating the unauthorized immigrant population. The fifth section offers information for analysts planning to use or improve the CPS-SSA nonmatch method. The sixth section concludes by comparing the numbers of unauthorized immigrants estimated using our CPS-SSA nonmatch method with those estimated using the residual method, which the first of these three related articles describes (Duleep and others 2025).
Background
Matching federal survey data with administrative data records is a critical research tool, and the SSN plays a crucial role in matching CPS data with SSA data files (Aziz, Kilss, and Scheuren 1978; Delbene 1979; Duleep 1986). Nevertheless, over time, CPS respondents have become increasingly reluctant to provide their SSN and thereby enable the administrative-data linkages. To overcome this obstacle, in the early 2000s,
the Census Bureau stopped directly requesting an SSN. Instead, under a new methodology, a respondent is informed that the survey data will be matched with other federal data for research purposes. Unless the respondent opts out, the Census Bureau then combines SSN application information from SSA's [Numerical Identification System data] file with address records from the [Internal Revenue Service], SSA, and other sources to determine the respondent's correct SSN. Once a match is found, survey and administrative data for the respondent are linked (McNabb and others 2009).
Importantly, SSNs are not disclosed in any data set used for research. Instead, to protect the individuals' identities, they are replaced with coded proxy identifiers.1
Methods
To estimate the size of the unauthorized immigrant population, we examine linked CPS and SSA data for foreign-born individuals and quantify the instances in which the reported SSNs do not match. A Type 1 nonmatch occurs if there is no valid SSN: The respondent either has no SSN at all or has a fraudulent SSN, meaning that it was fabricated by the respondent or by his or her employer. A Type 2 nonmatch occurs if the respondent has a valid SSN, but that SSN legitimately belongs to another person. For this analysis, we specifically use SSA's Numerical Identification System (Numident) file and CPS Annual Social and Economic Supplement (ASEC) data for 2006 through 2016. We then compare the prevalence of CPS-SSA data nonmatches for immigrants with the nonmatch rate of U.S.-born respondents, as detailed below.
Type 1 Nonmatches
The CPS results contain data for individuals with Type 1 nonmatches (no SSN or a nonvalid SSN), but the Social Security administrative record system does not. The survey and administrative data for these individuals therefore cannot match. Table 1 shows the number and prevalence of these nonmatches among persons aged 15 or older.
Survey year | Foreign-born population: number | Foreign-born population: share of U.S. population (%) | Foreign-born Type 1 SSN nonmatches: number | Foreign-born Type 1 SSN nonmatches: share of foreign-born population (%) | U.S.-born Type 1 SSN nonmatches (as a percentage of total U.S.-born population) | Estimated percentage of the foreign-born population who are unauthorized immigrants a | Estimated unauthorized immigrant population based on Type 1 nonmatches b |
---|---|---|---|---|---|---|---|
2006 | 33,571,249 | 14.40 | 9,924,156 | 29.56 | 8.54 | 21.02 | 7,055,651 |
2007 | 35,063,411 | 14.86 | 10,242,443 | 29.21 | 8.36 | 20.85 | 7,311,041 |
2008 | 35,180,322 | 14.77 | 10,547,283 | 29.98 | 9.30 | 20.68 | 7,276,442 |
2009 | 34,884,933 | 14.53 | 10,280,222 | 29.47 | 9.36 | 20.11 | 7,016,271 |
2010 | 35,682,735 | 14.73 | 8,117,290 | 22.75 | 9.53 | 13.22 | 4,716,841 |
2011 | 36,479,785 | 14.95 | 6,812,580 | 18.67 | 8.59 | 10.08 | 3,679,485 |
2012 | 38,195,263 | 15.42 | 6,979,920 | 18.27 | 9.35 | 8.92 | 3,407,508 |
2013 | 38,517,423 | 15.41 | 8,872,147 | 23.03 | 9.98 | 13.05 | 5,027,376 |
2014 | 39,212,327 | 15.53 | 8,969,626 | 22.87 | 10.20 | 12.67 | 4,971,488 |
2015 | 40,556,084 | 15.89 | 9,559,729 | 23.57 | 10.48 | 13.09 | 5,311,430 |
2016 | 41,346,254 | 16.03 | 9,879,161 | 23.89 | 10.78 | 13.11 | 5,420,932 |
SOURCE: Authors' calculations based on CPS-ASEC and Social Security administrative data.
NOTE: A CPS respondent with no SSN or a fabricated SSN is a Type 1 SSN nonmatch.
a. Equals the nonmatch share of foreign-born population minus the nonmatch share of U.S.-born population.
b. Equals the foreign-born population times the estimated percentage who are unauthorized.
The number of foreign-born nonmatches in Table 1 might appear to provide logical estimates of the size of the unauthorized immigrant population. Yet SSN reporting errors can cause mismatches between the CPS data and SSA records that have nothing to do with unauthorized immigration. Because there are no unauthorized immigrants among the native-born population, the nonmatch rate for that group provides a control for estimating the shares of nonmatches that occur for reasons other than unauthorized immigration.
To estimate the percentages of foreign-born nonmatches that are due to unauthorized immigration, we subtract the nonmatch rate among the native-born respondents from the nonmatch rate for all immigrants. We then compute the unauthorized immigrant population by multiplying the total foreign-born population by the resulting percentage, as shown in Table 1.
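The Type 1 calculation can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function and constant names are hypothetical, and the inputs are the published 2006 figures from Table 1. Because the published shares are rounded to two decimal places, the result differs slightly from the table's 7,055,651.

```python
def net_nonmatch_share(foreign_born_pct: float, us_born_pct: float) -> float:
    """Foreign-born nonmatch share minus the U.S.-born control share (in percent)."""
    return foreign_born_pct - us_born_pct


def type1_unauthorized_estimate(foreign_born_pop: int, net_share_pct: float) -> float:
    """Foreign-born population times the estimated unauthorized share."""
    return foreign_born_pop * net_share_pct / 100.0


# Published 2006 values from Table 1 (persons aged 15 or older).
FOREIGN_BORN_POP_2006 = 33_571_249
FOREIGN_BORN_NONMATCH_PCT_2006 = 29.56
US_BORN_NONMATCH_PCT_2006 = 8.54

net_pct_2006 = net_nonmatch_share(FOREIGN_BORN_NONMATCH_PCT_2006, US_BORN_NONMATCH_PCT_2006)
estimate_2006 = type1_unauthorized_estimate(FOREIGN_BORN_POP_2006, net_pct_2006)

print(f"Net nonmatch share: {net_pct_2006:.2f}%")
print(f"Estimated unauthorized population (Type 1): {estimate_2006:,.0f}")
```

Subtracting the U.S.-born rate treats it as the baseline level of nonmatching that would occur even if no respondents were unauthorized.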
One concern arises: The match probabilities may correlate with demographic and socioeconomic characteristics, and the distribution of these variables may differ between the foreign-born and U.S.-born populations. To address this concern, we reweighted the native-born sample to align with the foreign-born sample by age, sex, and education. We omit these results, however, because reweighting the sample on these characteristics barely changed the estimated unauthorized immigrant population.
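One common way to carry out such an alignment is cell-based reweighting (direct standardization): compute the U.S.-born nonmatch rate within each demographic cell, then average those rates using the foreign-born population's cell shares. The sketch below assumes that approach, which the article does not spell out. The cells are simplified to age by education, and every number is invented for illustration; none comes from the CPS-SSA data.

```python
from typing import Dict


def standardized_rate(control_rates: Dict[str, float],
                      target_shares: Dict[str, float]) -> float:
    """Average the control group's cell-level rates using the target group's cell shares."""
    if set(control_rates) != set(target_shares):
        raise ValueError("control and target must use the same cells")
    if abs(sum(target_shares.values()) - 1.0) > 1e-9:
        raise ValueError("target shares must sum to 1")
    return sum(control_rates[cell] * target_shares[cell] for cell in control_rates)


# Hypothetical U.S.-born nonmatch rates (%) within each cell.
us_born_rates = {
    "15-34, no degree": 12.0,
    "15-34, degree": 9.0,
    "35+, no degree": 8.0,
    "35+, degree": 7.0,
}

# Hypothetical shares of the foreign-born population in the same cells.
foreign_born_shares = {
    "15-34, no degree": 0.30,
    "15-34, degree": 0.15,
    "35+, no degree": 0.35,
    "35+, degree": 0.20,
}

reweighted_us_born_rate = standardized_rate(us_born_rates, foreign_born_shares)
print(f"Reweighted U.S.-born nonmatch rate: {reweighted_us_born_rate:.2f}%")
```

The reweighted rate would then replace the raw U.S.-born rate in the subtraction step.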
Type 2 Nonmatches
When a CPS respondent is an unauthorized immigrant who uses or has used someone else's valid SSN, the SSN will appear in both the CPS and the SSA data. To estimate the number of such individuals, we distinguish between two types of SSN matches in the CPS and SSA data. In an affirmative match, the individual's CPS data and SSA data agree on key variables, such as sex and birth year. In a dubious match, they do not. Specifically, we define a match as dubious if the individual's CPS data and SSA records differ either in sex or in age (by more than 5 years). The count of dubious matches among the foreign-born may provide a good estimate of the number who are using someone else's valid SSN. We refer to the number of dubious matches divided by the total number of matches (affirmative plus dubious) in the CPS and SSA data as the discrepancy rate. As we did with Type 1 nonmatches, we use the U.S.-born population as a control to account for Type 2 discrepancies that arise for reasons other than unauthorized immigration. We subtract the discrepancy rate for the U.S.-born population from the discrepancy rate for the foreign-born population to determine the net discrepancy rate.
We apply the net discrepancy rate to the number of all immigrants who have CPS-SSA data matches (whether affirmative or dubious). The resulting number is added to the unauthorized immigrant population that was estimated based on Type 1 nonmatches (Table 2).
Survey year | Estimated unauthorized immigrant population based on Type 1 SSN nonmatches only (from Table 1) | Net discrepancy rate a (%) | Additional unauthorized immigrants based on Type 2 SSN nonmatch analysis | Total estimated unauthorized immigrant population based on nonmatching SSNs |
---|---|---|---|---|
2006 | 7,055,651 | 3.85 | 271,942 | 7,327,593 |
2007 | 7,311,041 | 4.72 | 345,011 | 7,656,052 |
2008 | 7,276,442 | 4.23 | 307,913 | 7,584,355 |
2009 | 7,016,271 | 3.61 | 253,429 | 7,269,700 |
2010 | 4,716,841 | 63.64 | 3,001,877 | 7,718,718 |
2011 | 3,679,485 | 90.79 | 3,340,527 | 7,020,012 |
2012 | 3,407,508 | 104.34 | 3,555,428 | 6,962,936 |
2013 | 5,027,376 | 35.14 | 1,766,858 | 6,794,234 |
2014 | 4,971,488 | 31.51 | 1,566,572 | 6,538,060 |
2015 | 5,311,430 | 29.24 | 1,552,917 | 6,864,347 |
2016 | 5,420,932 | 31.29 | 1,696,076 | 7,117,008 |
SOURCE: Authors' calculations based on CPS-ASEC and Social Security administrative data.
NOTES: A CPS respondent with no SSN or a fabricated SSN is a Type 1 SSN nonmatch. A CPS respondent with a valid SSN that legitimately belongs to another person is a Type 2 SSN nonmatch.
a. The difference in SSN-match discrepancy rates between U.S.-born and foreign-born CPS respondents.
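The Type 2 steps described above can be sketched as follows. The record fields, function names, and example counts are hypothetical, because the article does not publish the underlying match counts, and the sketch uses raw counts where a real analysis would presumably use survey-weighted totals. Only the dubious-match rule (different sex, or ages more than 5 years apart), the subtraction of the U.S.-born control rate, and the final addition to the Type 1 estimate come from the text.

```python
from dataclasses import dataclass

AGE_TOLERANCE_YEARS = 5  # from the text: ages differing by more than 5 years


@dataclass(frozen=True)
class LinkedRecord:
    """Hypothetical linked CPS-SSA record holding the comparison variables."""
    cps_sex: str
    ssa_sex: str
    cps_age: int
    ssa_age: int


def is_dubious(rec: LinkedRecord) -> bool:
    """A match is dubious if sex differs or age differs by more than 5 years."""
    return (rec.cps_sex != rec.ssa_sex
            or abs(rec.cps_age - rec.ssa_age) > AGE_TOLERANCE_YEARS)


def discrepancy_rate(dubious: int, affirmative: int) -> float:
    """Dubious matches divided by all matches (affirmative plus dubious)."""
    return dubious / (dubious + affirmative)


def additional_type2_estimate(fb_dubious: int, fb_affirmative: int,
                              usb_dubious: int, usb_affirmative: int) -> float:
    """Apply the net discrepancy rate to all matched foreign-born respondents."""
    net_rate = (discrepancy_rate(fb_dubious, fb_affirmative)
                - discrepancy_rate(usb_dubious, usb_affirmative))
    return net_rate * (fb_dubious + fb_affirmative)


# Invented counts, for illustration only.
example_additional = additional_type2_estimate(
    fb_dubious=50, fb_affirmative=950, usb_dubious=200, usb_affirmative=9_800
)

# Published 2006 values from Table 2: Type 1 estimate plus the Type 2 addition.
total_2006 = 7_055_651 + 271_942
print(f"Example additional Type 2 estimate: {example_additional:.1f}")
print(f"2006 total based on nonmatching SSNs: {total_2006:,}")
```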
Comparing CPS-SSA Nonmatch Method and Residual Method Estimates
As the first of our three articles discusses, the residual method of estimating the unauthorized immigrant population includes steps that account for survey undercounts and for immigrants who entered the United States with legal temporary visas but then overstayed them.
Adjusting for Undercounts
Researchers account for American Community Survey and CPS undercounts by adjusting their estimated counts of unauthorized immigrants upward by 5 percent to 15 percent. We use 10 percent, the midpoint of those adjustments, to offset undercounting in the CPS. Specifically, we assume that the unauthorized immigrant population figures based on nonmatching SSNs in Table 2 represent 90.9 percent of the true population (that is, the population accounting for CPS undercounts). Table 3 shows the figures adjusted to equal 100 percent of those counts.
Survey year | Total estimated unauthorized immigrant population based on nonmatching SSNs (from Table 2) | Adjusted for CPS undercount: add 10% | Adjusted for visa overstays not captured in nonmatching SSN analysis: add another 5% | All-ages unauthorized immigrant population if the share of the population aged 0–14 equals 6.9% a | All-ages unauthorized immigrant population if the share of the population aged 0–14 equals 20% b |
---|---|---|---|---|---|
2006 | 7,327,593 | 8,060,352 | 8,463,370 | 9,047,343 | 10,156,044 |
2007 | 7,656,052 | 8,421,657 | 8,842,740 | 9,452,889 | 10,611,288 |
2008 | 7,584,355 | 8,342,791 | 8,759,931 | 9,364,366 | 10,511,917 |
2009 | 7,269,700 | 7,996,670 | 8,396,504 | 8,975,863 | 10,075,805 |
2010 | 7,718,718 | 8,490,590 | 8,915,120 | 9,530,263 | 10,698,144 |
2011 | 7,020,012 | 7,722,013 | 8,108,114 | 8,667,574 | 9,729,737 |
2012 | 6,962,936 | 7,659,230 | 8,042,192 | 8,597,103 | 9,650,630 |
2013 | 6,794,234 | 7,473,657 | 7,847,340 | 8,388,806 | 9,416,808 |
2014 | 6,538,060 | 7,191,866 | 7,551,459 | 8,072,510 | 9,061,751 |
2015 | 6,864,347 | 7,550,782 | 7,928,321 | 8,475,375 | 9,513,985 |
2016 | 7,117,008 | 7,828,709 | 8,220,144 | 8,787,334 | 9,864,173 |
SOURCE: Authors' calculations based on CPS-ASEC and Social Security administrative data.
a. Assumes the share of the population aged 0–14 is lower among unauthorized immigrants than in the overall U.S. population.
b. Assumes the share of the population aged 0–14 is similar between unauthorized immigrants and the overall U.S. population.
Accounting for Visa Overstays
B1 business visas and B2 tourist visas account for about 92 percent of visa overstays (Department of Homeland Security [DHS] 2021). Most people who overstay tourist or business visas are unlikely to have an SSN history. Yet many visitors holding other types of visas are eligible for temporary employment authorization. This suggests that we could subtract the number of individuals who overstayed a B1 or B2 visa from the total number of visa overstays to estimate the number of visa overstays with an SSN history. Unfortunately, estimates of the population of visa overstays are not available; however, estimated net annual flows in visa overstays are. Warren (2019) estimates that 46 percent of the unauthorized immigrant population in 2017 overstayed a visa, and DHS (2021) reports that in 2019, about 8 percent of overstays held visas other than business or tourist visas. Eight percent of 46 percent is about 3.7 percent. To reduce the risk of underestimating overstay incidence, we adjust 3.7 percent up to 5 percent, then add that 5 percent to the estimated population of unauthorized immigrants in Table 3.
Estimates for All Ages
Our estimates are calculated for the unauthorized immigrant population aged 15 or older, but the residual method estimates, discussed in the first of these three articles, are calculated for the all-ages population. The final step of our estimation method is to reconcile that difference. For the period 2006–2016, about 20 percent of the U.S. population was aged 14 or younger (Census Bureau 2023). Yet the motivations and the logistics of undocumented immigration are likely to result in a disproportionately low presence of children younger than 15 in the unauthorized immigrant population. Among foreign-born U.S. residents who arrived in the period 1982–2019, the 2019 American Community Survey found that 6.9 percent were younger than 15. Table 3 therefore shows our computations with both 6.9 percent and 20 percent adjustments to provide alternative estimates of the all-ages unauthorized immigrant population.
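The three adjustments can be chained as shown below. This sketch assumes that each step is a simple multiplicative markup rounded to whole persons, and that the all-ages step multiplies by one plus the assumed share of children. Those assumptions appear consistent with the published Table 3 figures, but they are inferred here rather than stated by the authors. The function and constant names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

UNDERCOUNT_MARKUP = "0.10"   # CPS undercount adjustment
OVERSTAY_MARKUP = "0.05"     # visa overstays not captured by SSN nonmatches
CHILD_SHARE_LOW = "0.069"    # share aged 0-14 among 1982-2019 arrivals (2019 ACS)
CHILD_SHARE_HIGH = "0.20"    # share aged 0-14 in the overall U.S. population


def mark_up(value: int, rate: str) -> int:
    """Increase value by rate and round half up to a whole person."""
    scaled = Decimal(value) * (1 + Decimal(rate))
    return int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def adjust_estimate(base_estimate: int) -> dict:
    """Apply the undercount, overstay, and all-ages adjustments in sequence."""
    after_undercount = mark_up(base_estimate, UNDERCOUNT_MARKUP)
    after_overstays = mark_up(after_undercount, OVERSTAY_MARKUP)
    return {
        "after_undercount": after_undercount,
        "after_overstays": after_overstays,
        "all_ages_low": mark_up(after_overstays, CHILD_SHARE_LOW),
        "all_ages_high": mark_up(after_overstays, CHILD_SHARE_HIGH),
    }


# Published Table 2 totals used as inputs.
row_2006 = adjust_estimate(7_327_593)
row_2010 = adjust_estimate(7_718_718)

for year, row in (("2006", row_2006), ("2010", row_2010)):
    print(year, {name: f"{value:,}" for name, value in row.items()})
```

`Decimal` with explicit half-up rounding is used so that the results do not depend on binary floating-point representation or on Python's default round-half-to-even behavior.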
Data Limitations and Notes for Future Research
We likely underestimate the percentage of unauthorized immigrants who overstay their visas and have a valid SSN because the estimates are based on flow data rather than on “snapshot” data for entire populations at particular points in time. If available, snapshot data should be used to inform these estimates.
Our estimates ignore individuals who overstayed a visa but now reside outside the United States and assume that individuals overstaying a B1 or B2 business or tourist visa do not have any administrative records at SSA. We are not certain whether our algorithm counts holders of F1 visas, who are eligible for Optional Practical Training (which can last from 6 months to 27 months), as authorized or unauthorized immigrants.
The Census Bureau's Person Identification Validation System (PVS) matches survey responses with SSA data without disclosing SSNs. The PVS uses probabilistic matching to assign a unique Census Bureau identifier for each person (Wagner and Layne 2014). Analogous to data fingerprints, the unique non-SSN identifying information that the PVS uses will not find matches in SSA data for persons who have never applied for and received SSNs. Because these persons have never given their identifying information to SSA or the Internal Revenue Service, they have no data in the administrative records. Thus, the PVS allows us to infer that immigrant survey respondents who have no matching SSA data do not have a valid SSN, suggesting that they may be unauthorized immigrants.
Our methodology focuses on the number of unauthorized immigrants and not their characteristics, which we explore in the second of our three articles (Tamborini and others 2025). Subject to further investigation, the CPS-SSA nonmatch method may provide a convenient way to continuously measure both the size and characteristics of the U.S. unauthorized immigrant population.
Summary and Conclusion
Each year, the SSA actuaries forecast the financial status of the Old-Age, Survivors, and Disability Insurance programs by projecting U.S. labor force participation, earnings, and other variables. These long-term projections incorporate assumptions about the relationship between immigration and Social Security. In describing the unauthorized immigrant population and presenting methods for estimating its size, our three articles may provide insights to inform those assumptions.
To date, two estimation methodologies have dominated efforts to measure the number of unauthorized immigrants in the United States. The first, the residual method, is described in detail in the first of these three articles (Duleep and others 2025). It involves subtracting from the count of all foreign-born individuals residing in the United States the numbers with legal-resident status. The results represent an estimate of the unauthorized immigrant population.
The second approach uses enforcement statistics such as border apprehensions. An attractive feature of this approach is that it starts with known information about the group we are trying to measure: unauthorized immigrants. Yet a single person may cross the border and return multiple times. If each apprehension is counted as a new entrant, then this method overestimates the number of unauthorized immigrants. The number of agents patrolling the border will also affect how many unauthorized immigrants are counted: With more agents, more apprehensions occur and are counted. Given these shortcomings, the enforcement-statistics estimation method is not used as often as the residual method, which is preferred by DHS and various research institutes.
Consistency of results implies accuracy, and studies that use the residual method find similar results. Skeptics note, however, that the accuracy of the residual method estimates is difficult to verify, given that they share a similar methodology. Perhaps all are consistently wrong? Would a valid but different methodology find similar results?
Motivating our study was a concern that the residual method may dramatically understate unauthorized immigration. To explore this concern, we developed an alternative estimation method. We employ a unique, restricted-use dataset linking data for respondents from multiple years of the CPS to their administrative records compiled at SSA. The CPS-SSA nonmatch method counts two types of unauthorized immigrants: those who do not have a valid SSN and those who use the valid SSN of another person. The CPS-SSA nonmatch method differs completely from the residual method. If our estimates of the unauthorized immigrant population are similar to residual method estimates, the similarity cannot be due to methodological overlap.
As discussed in the first of our three articles, the Center for Migration Studies of New York (CMS) has used its own version of residual techniques to produce annual estimates of the unauthorized immigrant population from 2010 to 2019, providing greater detail than the DHS estimates. Nevertheless, CMS estimated the total unauthorized immigrant population for 2010 at 11.7 million (Warren and Warren 2013), only slightly more than DHS' estimate of 11.6 million (Baker 2021). The results for our CPS-SSA nonmatch method and from CMS and DHS using the residual method are similar: We estimate an unauthorized immigrant population of 10.7 million in 2010 (Table 3, using the 20 percent adjustment to expand the counted population from those aged 15 or older to those of all ages).
Estimates of the unauthorized immigrant population over time using the residual and CPS-SSA nonmatch methods are also broadly similar. Table 3 shows that the CPS-SSA nonmatch estimates of the number of unauthorized immigrants peaked in 2010, followed by 4 consecutive years of small decreases. The numbers then increased in 2015 and 2016. Similarly, following 2010, CMS estimates using a modified residual method show several years of declining unauthorized immigration, until 2022 when the estimated number of unauthorized immigrants increased 6 percent (Warren 2024).
In conclusion, our different methodology produces estimates of the size of the unauthorized immigrant population in the United States—and of unauthorized immigration trends—that are broadly similar to those produced using the residual method. We find no evidence that the residual method underestimates unauthorized immigration. The similarity in results is important both for national policy discussions about unauthorized immigrants in the United States and for the specific policy needs of Social Security.
Notes
1 For more information on Census Bureau matching procedures, see Wagner and Layne (2014).
References
Aziz, Faye, Beth Kilss, and Fritz Scheuren. 1978. 1973 Current Population Survey: Administrative Record Exact Match File Codebook, Part I—Code Counts and Item Definitions. Studies from Interagency Data Linkages, Report No. 8. DHEW Publication No. (SSA) 79-11750. Washington, DC: Department of Health, Education, and Welfare, SSA, Office of Research and Statistics.
Baker, Bryan. 2021. Estimates of the Unauthorized Immigrant Population Residing in the United States: January 2015–January 2018. Washington, DC: Department of Homeland Security, Office of Immigration Statistics. https://ohss.dhs.gov/topics/immigration/unauthorized/population-estimates.
Census Bureau. 2023. “National Population by Characteristics: 2010–2019. Median Age and Age by Sex, Race, and Hispanic Origin, Annual Estimates of the Resident Population by Sex, Age, Race, and Hispanic Origin for the United States: April 1, 2010 to July 1, 2019.” https://www.census.gov/data/tables/time-series/demo/popest/2010s-national-detail.html.
Delbene, Linda. 1979. 1937–1976 Social Security Longitudinal Earnings Exact Match File. Studies from Interagency Data Linkages, Report No. 9. Washington, DC: Department of Health, Education, and Welfare, SSA, Office of Research and Statistics.
[DHS] Department of Homeland Security. 2021. Fiscal Year 2020 Entry/Exit Overstay Report. Washington, DC: DHS, U.S. Customs and Border Protection. https://www.dhs.gov/publication/entryexit-overstay-report.
Duleep, Harriet Orcutt. 1986. “Measuring the Effect of Income on Adult Mortality Using Longitudinal Administrative Record Data.” The Journal of Human Resources 21(2): 238–251.
Duleep, Harriet, Dave Shoffner, Robert V. Gesumaria, and Christopher R. Tamborini. 2025. “Measuring the Number of Unauthorized Immigrants in the United States: A Review of the Residual Estimation Method.” Social Security Bulletin 85(2): 1–8.
McNabb, Jennifer, David Timmons, Jae Song, and Carolyn Puckett. 2009. “Uses of Administrative Data at the Social Security Administration.” Social Security Bulletin 69(1): 75–84.
Tamborini, Christopher R., Harriet Duleep, Robert V. Gesumaria, and Dave Shoffner. 2025. “Measuring the Economic and Sociodemographic Characteristics of Unauthorized Immigrants in the United States with Survey Data.” Social Security Bulletin 85(2): 9–15.
Wagner, Deborah, and Mary Layne. 2014. “The Person Identification Validation System (PVS): Applying the Center for Administrative Records Research and Applications' (CARRA) Record Linkage Software.” CARRA Working Paper No. 2014-01. Washington, DC: Census Bureau.
Warren, Robert. 2019. “Detailed Estimates of the Overstay Population Residing in the United States in 2017.” New York, NY: Center for Migration Studies of New York. https://cmsny.org/visa-overstay-population-warren-120219/.
———. 2024. “After a Decade of Decline, the US Undocumented Population Increased by 650,000 in 2022.” Journal on Migration and Human Security 12(2): 85–95. https://doi.org/10.1177/23315024241226624.
Warren, Robert, and John Robert Warren. 2013. “Unauthorized Immigration to the United States: Annual Estimates and Components of Change, by State, 1990 to 2010.” International Migration Review 47(2): 296–329. https://onlinelibrary.wiley.com/doi/abs/10.1111/imre.12022.