PROBLEM 1
The data set anthrop.csv contains the length of the middle finger and height of
N=3000 criminals. We will treat this data set as a population of size N=3000. You
will select a random sample of n=200 and calculate the following:

(a) Estimate the average height, and the standard error of that estimate, from a
simple random sample of n=200 subjects. Find a 95% confidence interval.

(b) Determine the sample size necessary to have an absolute error of at most 2 inches.
Use the whole data set to calculate the variance, and compare the result with the
value of n you would get if you treated your current sample of n=200 as a pilot
sample for a future survey.

(c) Calculate the ratio estimate of average height and its standard error using finger
length as the auxiliary variable. Calculate a 95% confidence interval.

(d) Repeat the sample size determination but this time for the ratio estimate.

(e) Repeat the estimation of average height, but use a regression estimate, and find
its standard error.

(f) Compare the estimates of average height $\bar{y}_{SRS}$, $\bar{y}_{r}$, $\bar{y}_{reg}$ on the
basis of standard error. Which estimate do you prefer and why?

(g) Compare the ratio and regression estimates. Which one do you believe is more
appropriate here and why? Use plots as necessary to support your argument.

Note: you are each requested to select your own random sample. If there are two
identical answers it will be considered cheating. The probability of two independently
drawn samples giving identical answers is so small as to be practically zero.
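
The estimators asked for in parts (a) through (e) can be sketched in Python as below. This is a minimal sketch under stated assumptions, not a prescribed solution: the function names are hypothetical, the standard errors use the common textbook forms with the finite population correction (1 - n/N), the ratio standard error uses the simple residual form without the extra (population mean of x / sample mean of x) factor, the regression residual variance divides by n - 2, and 1.96 is used as the 95% normal critical value.

```python
import math
import numpy as np

Z95 = 1.96  # assumed normal critical value for a 95% interval


def srs_mean(y, N):
    """Part (a): SRS estimate of the population mean and its standard error."""
    y = np.asarray(y, dtype=float)
    n = y.size
    se = math.sqrt((1 - n / N) * y.var(ddof=1) / n)
    return y.mean(), se


def ratio_mean(y, x, x_pop_mean, N):
    """Part (c): ratio estimate of the mean of y using auxiliary variable x."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    n = y.size
    b = y.mean() / x.mean()                  # estimated ratio
    resid = y - b * x                        # residuals about the ratio line
    s2e = np.sum(resid ** 2) / (n - 1)
    se = math.sqrt((1 - n / N) * s2e / n)
    return b * x_pop_mean, se


def regression_mean(y, x, x_pop_mean, N):
    """Part (e): regression estimate of the mean of y and its standard error."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    n = y.size
    b1 = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)   # least-squares slope
    b0 = y.mean() - b1 * x.mean()                     # least-squares intercept
    resid = y - (b0 + b1 * x)
    s2e = np.sum(resid ** 2) / (n - 2)
    se = math.sqrt((1 - n / N) * s2e / n)
    return y.mean() + b1 * (x_pop_mean - x.mean()), se


def sample_size(variance, margin, N, z=Z95):
    """Parts (b), (d): n for a given margin of error, with fpc adjustment.

    For the ratio estimate in part (d), pass the variance of the
    residuals y - b*x instead of the variance of y.
    """
    n0 = z ** 2 * variance / margin ** 2
    return math.ceil(n0 / (1 + n0 / N))


def conf_int(estimate, se, z=Z95):
    """Approximate confidence interval: estimate +/- z * se."""
    return estimate - z * se, estimate + z * se
```

The functions take plain arrays because the column headers of anthrop.csv are not given here. Once the height and finger-length columns are loaded, a sample of row indices can be drawn without replacement with `np.random.default_rng().choice(3000, size=200, replace=False)`, and the population mean of finger length passed in as `x_pop_mean`.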