Mutual funds. The websites of companies such as Fidelity (www.fidelity.com), Vanguard (www.vanguard.com), and T. Rowe Price (www.troweprice.com) list the mutual funds of those companies, along with some statistics about the performance of those funds. Take an SRS of 25 mutual funds from one of these companies. Describe how you selected the SRS. Find the mean and a 95% CI for the mean of a variable you are interested in, such as daily percentage change, or 1-year performance, or length of time the fund has existed.
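One way to carry out and document the selection is to number the funds in the company's list and let a seeded random number generator choose 25 distinct numbers. The sketch below is only illustrative: the function names `draw_srs` and `mean_ci`, the frame size of 200 funds, and the return values are all hypothetical, not taken from any fund company. The interval uses the finite population correction, SE = sqrt((1 - n/N) s^2 / n), and takes the critical value as an argument; with n = 25, the t critical value with 24 degrees of freedom (about 2.064) is a reasonable choice in place of 1.96.

```python
import math
import random
import statistics

def draw_srs(population_size, n, seed):
    """Return n distinct 0-based unit indices chosen by SRS without replacement."""
    rng = random.Random(seed)  # fixed seed so the sample can be recreated
    return sorted(rng.sample(range(population_size), n))

def mean_ci(sample, population_size, crit=1.96):
    """Sample mean and CI for the population mean under SRS, with the fpc."""
    n = len(sample)
    ybar = statistics.mean(sample)
    s2 = statistics.variance(sample)      # sample variance, divisor n - 1
    fpc = 1 - n / population_size         # finite population correction
    se = math.sqrt(fpc * s2 / n)
    return ybar, (ybar - crit * se, ybar + crit * se)

# Hypothetical frame: 200 funds with made-up 1-year returns (in percent).
frame_rng = random.Random(0)
returns = [frame_rng.gauss(8.0, 5.0) for _ in range(200)]

chosen = draw_srs(len(returns), 25, seed=2024)
sample = [returns[i] for i in chosen]
ybar, (lower, upper) = mean_ci(sample, len(returns), crit=2.064)
print(f"mean = {ybar:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Recording the seed and the numbering of the frame is enough to describe how the SRS was selected and to reproduce it later.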
Baseball data. This activity is due to Jenifer Boshes, who also compiled the data from Forman (2004) and publicly available salary information. The data file baseball.dat contains statistics on 797 baseball players from the rosters of all major league teams in November 2004. In this exercise (which will be continued in later chapters), you will treat the file baseball.dat as a population and draw samples from it using different sampling designs.
(a) Take an SRS of 150 players from the file. Describe how you selected the SRS. Save your data set for use in future exercises (if you are selecting it using SAS PROC SURVEYSELECT, you can recreate the data set by using seed = number).
(b) Calculate logsal = ln(salary). Construct histograms of the variables salary and logsal from your SRS. Does the distribution of salary appear approximately normal? What about logsal?
(c) Find the mean of the variable logsal, and give a 95% CI.
(d) Estimate the proportion of players in the data set who are pitchers, and give a 95% CI.
(e) Since you have the full data file for the population, you can find the true mean and proportion for the population. Do your CIs in (c) and (d) contain the true population values?
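The interval for a proportion and the coverage check can be sketched as follows. This is a minimal illustration under stated assumptions: the layout of baseball.dat is not described here, so the code works on a plain list of 0/1 pitcher indicators rather than reading the file, and the counts (30 pitchers in the sample) and the population proportion of 0.25 are hypothetical placeholders. The standard error uses the SRS formula with the finite population correction, SE = sqrt((1 - n/N) p(1 - p) / (n - 1)), and a normal critical value, which is reasonable for n = 150.

```python
import math

def proportion_ci(indicators, population_size, crit=1.96):
    """Sample proportion and CI under SRS, with the fpc.

    indicators: list of 0/1 values, 1 if the sampled unit has the attribute.
    """
    n = len(indicators)
    p_hat = sum(indicators) / n
    fpc = 1 - n / population_size
    se = math.sqrt(fpc * p_hat * (1 - p_hat) / (n - 1))
    return p_hat, (p_hat - crit * se, p_hat + crit * se)

def covers(interval, true_value):
    """True if the interval contains the population value."""
    lower, upper = interval
    return lower <= true_value <= upper

# Hypothetical SRS of 150 players, 30 of them pitchers, from N = 797.
is_pitcher = [1] * 30 + [0] * 120
p_hat, ci = proportion_ci(is_pitcher, population_size=797)

# Placeholder population proportion; in the exercise it is computed
# from all 797 records in the file.
true_p = 0.25
print(f"p_hat = {p_hat:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print("CI contains the population value:", covers(ci, true_p))
```

The same `covers` check applies to the interval for the mean of logsal in part (c), once the population mean has been computed from the full file.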