IPUMS data. This exercise is designed for the Integrated Public Use Microdata Series (IPUMS), available online at www.ipums.org/usa/ (Ruggles et al., 2004). The IPUMS site hosts a collection of samples from the U.S. Decennial Census and American Community Survey. In the following exercises, we use a self-weighting sample from the 1980 Decennial Census sample, selected using the “Small Sample Density” option in the data extract tool. The data are in file ipums.dat. We treat these data as a population.
a The variable inctot is total personal income from all sources. Note from the documentation for the variable that it is “topcoded” at $75,000 to protect the confidentiality of the respondents. What effect does the topcoding have on estimates from the file?
b Draw a pilot sample (SRS) of size 50 from the IPUMS population. Use the sample variance you get for inctot to determine the sample size you need to estimate the average of inctot with a margin of error of 700 or less.
c Take an SRS of your desired sample size from the population. Estimate the total income for the population, and give a 95% CI. Make sure you save the seed number you use in SAS PROC SURVEYSELECT or other software so you can recreate this sample in later chapters.
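The workflow in parts b and c can be sketched in Python as one example of the "other software" mentioned above. This is a minimal sketch under stated assumptions, not a solution to the exercise: the population is a synthetic stand-in (the layout of ipums.dat is not described here), the seeds are arbitrary, the function names srs_sample_size and srs_total_ci are hypothetical, and the 95% interval uses the normal critical value 1.96 rather than a t quantile. The sample size comes from n0 = z^2 s^2 / e^2, adjusted for the finite population as n = n0 / (1 + n0/N).

```python
import math
import random
import statistics

Z = 1.96  # assumed normal critical value for 95% confidence


def srs_sample_size(s2, e, N, z=Z):
    """Sample size so an SRS mean has margin of error <= e (pilot variance s2)."""
    n0 = z**2 * s2 / e**2                 # size ignoring the finite population correction
    return math.ceil(n0 / (1 + n0 / N))   # adjust for a population of size N


def srs_total_ci(sample, N, z=Z):
    """Estimated population total and confidence interval from an SRS without replacement."""
    n = len(sample)
    t_hat = N * statistics.mean(sample)
    s2 = statistics.variance(sample)      # sample variance with divisor n - 1
    se = N * math.sqrt((1 - n / N) * s2 / n)
    return t_hat, (t_hat - z * se, t_hat + z * se)


# Synthetic stand-in for inctot (NOT the IPUMS data): skewed incomes topcoded at 75,000.
pop_rng = random.Random(1980)
population = [min(pop_rng.lognormvariate(9, 1), 75000) for _ in range(50000)]
N = len(population)

# Arbitrary seeds; record whichever values you actually use.
PILOT_SEED = 101
MAIN_SEED = 38572

# Part b: pilot SRS of size 50, then the sample size for a margin of error of 700.
pilot = random.Random(PILOT_SEED).sample(population, 50)
n = srs_sample_size(statistics.variance(pilot), e=700, N=N)

# Part c: SRS of size n, estimated total, and 95% CI.
sample = random.Random(MAIN_SEED).sample(population, n)
t_hat, ci = srs_total_ci(sample, N)
print(f"n = {n}, estimated total = {t_hat:.0f}, 95% CI = ({ci[0]:.0f}, {ci[1]:.0f})")
```

Rebuilding random.Random(MAIN_SEED) and calling sample again on the same population list reproduces the same units, which is the purpose of saving the seed. With the real file, population would be replaced by the inctot column read from ipums.dat.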