Appendix A: Reliability of the Estimates
Because the figures in this report are based on a sample of the older population, all reported statistics (counts, percentages, and medians) are only estimates of population parameters and may deviate somewhat from their true values, that is, from the values that would have been obtained from a complete census using the same questionnaires, instructions, and interviewers.¹
The standard error is primarily a measure of sampling variability, that is, of the variation that occurs by chance because a sample rather than the entire population is surveyed. As calculated for this report, the standard error also partly measures the effect of response and enumeration errors but does not measure systematic biases in the data. The chances are about 68 out of 100 that an estimate from the sample would differ from a complete census figure by less than the standard error. The chances are about 95 out of 100 that the difference would be less than twice the standard error.
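The 68 and 95 figures correspond to treating the sampling error as approximately normally distributed. The following is a minimal Python sketch under that assumption; the function name `coverage_within` is illustrative and does not come from the source report.

```python
import math

def coverage_within(k):
    """Probability that a normally distributed estimate falls within
    k standard errors of the true value (normality is an assumption)."""
    return math.erf(k / math.sqrt(2))

one_se = coverage_within(1)  # roughly 0.68
two_se = coverage_within(2)  # roughly 0.95
print(f"within 1 standard error:  {one_se:.3f}")
print(f"within 2 standard errors: {two_se:.3f}")
```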
Standard Error of Estimated Percentages
The reliability of an estimated percentage, computed by using sample data for both numerator and denominator, depends on both the size of the percentage and the size of the total on which the percentage is based. The approximate standard error Sx of an estimated percentage can be obtained using the formula
$$S_x = \sqrt{\frac{b}{x}\, p\,(100 - p)}$$
Here x is the total number of persons, families, or households (the base of the percentage), p is the percentage, and b is the parameter from the following table associated with the characteristic in the numerator of the percentage.
| Below poverty level | 1,998 | 1,998 | 1,998 |
| All income levels | 1,249 | 1,430 | 1,430 |
Use of this formula in calculating the standard error of a single percentage is illustrated as follows:
An estimated 30.9 percent of units aged 65 or older had total money income of $30,000 or more in 2002 (Table 3.1). Because the base of this percentage is approximately 26,219,000 (the number of units aged 65 or older), the standard error of the estimated 30.9 percent is approximately 0.3 percentage points. The chances are about 68 out of 100 that the estimate differs from a complete census figure by less than 0.3 percentage points, and about 95 out of 100 that it differs by less than 0.6 percentage points; that is, the 95 percent confidence interval ranges from 30.3 percent to 31.5 percent.
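The arithmetic in this illustration can be checked with a short Python sketch. It assumes the standard error takes the form sqrt((b / x) * p * (100 - p)), the usual approximation for percentages from this kind of survey, and it assumes b = 1,249 (the first "All income levels" parameter in the table); neither choice is stated explicitly in the worked example, and the function name is illustrative.

```python
import math

def se_percentage(p, base, b):
    """Approximate standard error, in percentage points, of an estimated
    percentage p (on a 0-100 scale) with the given base and parameter b."""
    return math.sqrt((b / base) * p * (100.0 - p))

# Figures from the worked example; b = 1,249 is an assumption.
p = 30.9
base = 26_219_000
b = 1_249

se = se_percentage(p, base, b)
se_rounded = round(se, 1)

# Two-standard-error (about 95 percent) confidence interval.
ci_95 = (round(p - 2 * se_rounded, 1), round(p + 2 * se_rounded, 1))

print(f"standard error: {se:.2f} (rounded: {se_rounded})")
print(f"95 percent interval: {ci_95[0]} to {ci_95[1]}")
```

Rounding the standard error to one decimal place before doubling it mirrors the rounding used in the text.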
For a difference between two sample estimates, the standard error is approximately equal to the square root of the sum of the squares of the standard errors of each estimate considered separately. This formula will represent the actual standard error quite accurately for the difference between separate and uncorrelated characteristics. If, however, there is a high positive correlation between the two characteristics, the formula will overestimate the true standard error.
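The direction of this bias follows from the general identity for the standard error of a difference, given here as a standard statistical result rather than as part of the source report. The symbols s_A and s_B (the two standard errors) and ρ (the correlation between the two estimates) are notation assumed for illustration.

```latex
s_{A-B} = \sqrt{s_A^2 + s_B^2 - 2\rho\, s_A s_B}
```

When ρ = 0 the expression reduces to the square root of the sum of the squares. When ρ > 0 the subtracted term makes the true standard error smaller than that rule suggests.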
A comparison of the percentages of units aged 62–64 and units aged 65 or older that had total money income of $30,000 or more in 2002 illustrates how to calculate the standard error of a difference between two percentages:
Thirty-one percent of the 26,219,000 units aged 65 or older and 52 percent of the 4,722,000 units aged 62–64 had total money income of $30,000 or more in 2002, a difference of 21 percentage points. The standard errors of those percentages are 0.3 and 0.8, respectively. The standard error of the estimated difference of 21 percentage points is about
$$\sqrt{(0.3)^2 + (0.8)^2} \approx 0.9$$
The chances are 68 out of 100 that the difference is between 20.1 and 21.9 percentage points and 95 out of 100 that it is between 19.2 and 22.8 percentage points. Because the 95 percent confidence interval around the difference does not include zero, the difference between the proportions of units aged 62–64 and units aged 65 or older with total money income of $30,000 or more is statistically significant.
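The same comparison can be sketched in Python. The code assumes the two estimates are uncorrelated, as the square-root-of-sum-of-squares rule requires; the function name is illustrative.

```python
import math

def se_difference(se_a, se_b):
    """Approximate standard error of a difference between two
    uncorrelated estimates with standard errors se_a and se_b."""
    return math.sqrt(se_a ** 2 + se_b ** 2)

# Figures from the worked example (percentages and their standard errors).
diff = 52 - 31
se_diff = round(se_difference(0.3, 0.8), 1)

ci_68 = (round(diff - se_diff, 1), round(diff + se_diff, 1))
ci_95 = (round(diff - 2 * se_diff, 1), round(diff + 2 * se_diff, 1))

# The difference is treated as significant when the 95 percent
# interval excludes zero.
significant = not (ci_95[0] <= 0 <= ci_95[1])

print(f"standard error of the difference: {se_diff}")
print(f"68 percent interval: {ci_68[0]} to {ci_68[1]}")
print(f"95 percent interval: {ci_95[0]} to {ci_95[1]}")
print(f"significant at the 95 percent level: {significant}")
```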
Confidence Limits of Medians
The sampling variability of an estimated median depends on the distribution as well as on the size of the base. Confidence limits of a median based on sample data may be estimated as follows: (1) using the appropriate base, the standard error of a 50 percent characteristic is determined; (2) the standard error determined in step 1 is added to and subtracted from 50 percent; and (3) the confidence interval around the median corresponding to the two points estimated in step 2 is then read from the distribution of the characteristic. A two-standard-error confidence limit may be determined by finding the values corresponding to 50 percent plus and minus twice the standard error. This procedure may be illustrated as follows:
The median total money income of the estimated 26,219,000 units aged 65 or older was $18,938 in 2002 (Table 3.1). The standard error of 50 percent of those units expressed as a percentage is about 0.35 percent. As interest usually centers on the confidence interval for the median at the two-standard-error level, it is necessary to add to and subtract from 50 percent twice the standard error obtained in step 1. This procedure yields limits of approximately 49 percent and 51 percent (50 percent plus and minus 0.7, rounded). By interpolation, 49 percent of units aged 65 or older had total money income below $18,750, and 51 percent had total money income below $19,508. Thus, the chances are about 95 out of 100 that a complete census would have shown the median to be greater than $18,750 but less than $19,508.
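The three-step procedure can be sketched in Python as follows. The income distribution in `cum_points` is entirely hypothetical and is used only to show the interpolation step, so the dollar limits it produces are not the report's figures. The standard error formula and b = 1,249 are assumptions, as are all function names.

```python
import math

def se_percentage(p, base, b):
    """Approximate standard error of an estimated percentage (assumed form)."""
    return math.sqrt((b / base) * p * (100.0 - p))

def income_at_percentile(cum_points, target):
    """Linearly interpolate the income at which the cumulative percentage
    reaches target. cum_points holds (income, cumulative_percent) pairs
    sorted by income."""
    for (x0, c0), (x1, c1) in zip(cum_points, cum_points[1:]):
        if c0 <= target <= c1:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target percentile lies outside the supplied distribution")

# HYPOTHETICAL cumulative distribution: percent of units with income
# below each dollar amount. Not taken from the report.
cum_points = [(15_000, 40.0), (17_500, 46.0), (20_000, 52.5), (25_000, 62.0)]

# Step 1: standard error of a 50 percent characteristic (b is an assumption).
se50 = se_percentage(50.0, 26_219_000, 1_249)

# Step 2: two-standard-error limits around 50 percent.
lower_pct = 50.0 - 2 * se50
upper_pct = 50.0 + 2 * se50

# Step 3: read the corresponding incomes from the distribution.
lower = income_at_percentile(cum_points, lower_pct)
median = income_at_percentile(cum_points, 50.0)
upper = income_at_percentile(cum_points, upper_pct)

print(f"standard error at 50 percent: {se50:.2f}")
print(f"percent limits: {lower_pct:.1f} to {upper_pct:.1f}")
print(f"hypothetical median limits: {lower:,.0f} to {upper:,.0f}")
```

The sketch keeps the unrounded percent limits, whereas the text rounds them to 49 and 51 percent before interpolating.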
¹ Most of the discussion of estimation procedures has been excerpted from Current Population Reports, No. 114 (July 1978).