1. Consider the simple hypothetical example in Table 1. This example involves eleven patients, each of whom is infected with COVID. There are two treatments: ventilators (Y1) and bedrest (Y0). Table 1 displays each patient’s potential outcomes in terms of years of post-treatment survival under each treatment. Larger outcome values correspond to better health outcomes. [15 points]

a) Calculate each unit’s treatment effect. [3]
b) What is the average treatment effect (ATE) for ventilators compared to bedrest? Which type of intervention is more effective on average? [3]
c) Suppose the “perfect doctor” knows each patient’s potential outcomes and as a result chooses the best treatment for each patient. If she assigns each patient to the treatment more beneficial for that patient, which patients will receive ventilators and which will receive bedrest? [3]
d) Calculate the simple difference in average outcomes that would obtain if treatment assignment happened as in part (c). How similar is it to the ATE? [3]
e) Provide an example of how SUTVA might be violated for treatments of COVID. [3]
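As a hint on the mechanics of parts (a) to (d), the following is a minimal Python sketch. The four patients and their outcome values are hypothetical and are NOT the values in Table 1; the tie-breaking rule (ties go to bedrest) is also an arbitrary assumption made only for this illustration.

```python
# Hypothetical potential outcomes (NOT the values from Table 1).
# y1: years of survival under ventilators, y0: years of survival under bedrest.
y1 = [1, 5, 3, 10]
y0 = [4, 2, 3, 6]

# (a) Unit-level treatment effect: delta_i = Y1_i - Y0_i
delta = [a - b for a, b in zip(y1, y0)]

# (b) Average treatment effect: the mean of the unit-level effects
ate = sum(delta) / len(delta)

# (c) "Perfect doctor": ventilator only when it is strictly better.
#     Sending ties to bedrest is an assumption of this sketch.
d = [1 if a > b else 0 for a, b in zip(y1, y0)]

# (d) Simple difference in observed mean outcomes under that assignment:
#     treated patients reveal Y1, untreated patients reveal Y0.
treated = [a for a, di in zip(y1, d) if di == 1]
control = [b for b, di in zip(y0, d) if di == 0]
sdo = sum(treated) / len(treated) - sum(control) / len(control)

print("unit effects:", delta)
print("ATE:", ate)
print("assignment:", d)
print("simple difference in outcomes:", sdo)
```

In this toy data the simple difference in outcomes is far from the ATE, because the doctor selects patients on their gains.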

2. In this exercise you will estimate the effect of lecture attendance on academic performance using the data in “ATTEND”. [20 points]

a) Use OLS to estimate a regression model relating stndfnl (the standardized final exam score) to atndrte (the percent of lectures attended). Include the binary variables frosh and soph as explanatory variables. Interpret the coefficient on atndrte, and discuss its statistical significance. [5]
b) How confident are you that the OLS estimates from part (a) are estimating the causal effect of attendance? Explain your answer. [5]
c) As proxy variables for student ability, add to the regression priGPA (prior cumulative GPA) and ACT (achievement test score). Now what is the effect of atndrte? Discuss how the effect differs from that in part (a). [5]
d) To test for a nonlinear effect of atndrte, add its square to the equation from part (c). What do you conclude? [5]
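For part (d), it may help to write out what the squared term implies. The coefficient labels below are an assumed numbering for the part (c) regression with the square added; they do not appear in the exercise itself.

```latex
\begin{align*}
stndfnl &= \beta_0 + \beta_1\, atndrte + \beta_2\, atndrte^2
           + \beta_3\, frosh + \beta_4\, soph
           + \beta_5\, priGPA + \beta_6\, ACT + u \\
\frac{\partial\, stndfnl}{\partial\, atndrte} &= \beta_1 + 2\beta_2\, atndrte
\end{align*}
```

Under this labeling the effect of attendance is no longer a single number: it depends on the level of atndrte, and it changes sign at atndrte = -\beta_1 / (2\beta_2) when \beta_2 is nonzero. A test of \beta_2 = 0 is a test of linearity.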

3. In this exercise you will estimate the effect of cigarette smoking during pregnancy on the weight of newborns using the data in “BWGHT”. Consider the following specification:

$$\log(\mathit{bwght}) = \beta_0 + \beta_1 \mathit{male} + \beta_2 \mathit{parity} + \beta_3 \log(\mathit{faminc}) + \beta_4 \mathit{packs} + u, \tag{1}$$

where male is a binary indicator equal to one if the child is male, parity is the birth order of this child, faminc is family income, and packs is the average number of packs of cigarettes smoked per day during pregnancy. [30 points]

a) Why might you expect packs to be correlated with u? [6]
b) Suppose that you have data on average cigarette price in each woman’s place of residence. Discuss whether this information is likely to satisfy the properties of a good instrumental variable for packs. [6]
c) Use the data in “BWGHT” to estimate equation (1) using OLS. [6]
d) Now estimate equation (1) using 2SLS, where cigprice is an instrument for packs. Discuss how your OLS estimates compare to the 2SLS estimates. [6]
e) Estimate the reduced form for packs. What do you conclude about identification of equation (1) using cigprice as an instrument for packs? [6]
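To illustrate the logic behind parts (c) to (e), here is a minimal Python sketch of OLS, the first stage, and the IV estimator in the simplest case: one endogenous regressor, one instrument, and no other controls, where 2SLS reduces to the ratio cov(z, y) / cov(z, x). The data are a hypothetical toy example, not BWGHT; loosely, x plays the role of packs and z the role of cigprice, and the controls in equation (1) are omitted.

```python
# Hypothetical toy data (not from BWGHT): z is the instrument, u is an
# unobserved factor, x is the endogenous regressor, y is the outcome.
z = [0.0, 0.0, 1.0, 1.0]
u = [-1.0, 1.0, -1.0, 1.0]                          # uncorrelated with z by construction
x = [1 + 2 * zi + ui for zi, ui in zip(z, u)]       # first stage: slope on z is 2
y = [10 - 2 * xi + 3 * ui for xi, ui in zip(x, u)]  # true effect of x on y is -2


def cross(a, b):
    """Sum of cross-products of deviations from the sample means."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


# OLS slope of y on x: biased here because x is correlated with u.
ols = cross(x, y) / cross(x, x)

# First-stage slope of x on z: must be nonzero for z to identify the effect.
first_stage = cross(z, x) / cross(z, z)

# Simple IV estimator: cov(z, y) / cov(z, x).
iv = cross(z, y) / cross(z, x)

print("OLS slope:", ols)
print("first-stage slope:", first_stage)
print("IV slope:", iv)
```

If the first-stage slope were close to zero, the denominator of the IV ratio would be close to zero and the estimate would be uninformative, which is the issue part (e) asks about.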

4. Use the data in “WAGEPAN” for this exercise, which is a panel dataset of 545 men who worked every year from 1980 to 1987. Consider the wage equation:

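$$\log(\mathit{wage}_{it}) = \beta_0 + \beta_1 \mathit{educ}_i + \beta_2 \mathit{black}_i + \beta_3 \mathit{hisp}_i + \beta_4 \mathit{exper}_{it} + \beta_5 \mathit{exper}_{it}^2 + \beta_6 \mathit{married}_{it} + \beta_7 \mathit{union}_{it} + c_i + u_{it}. \tag{2}$$

The variables are described in the dataset. Notice that education does not change over time. [20 points]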

(a) Estimate equation (2) by pooled OLS. Are the usual OLS standard errors reliable, even if c_i is uncorrelated with all explanatory variables? Explain. Compute appropriate standard errors. [6]
(b) Estimate equation (2) by random effects (RE). Compare your estimates with the pooled OLS estimates in part (a). [7]
(c) Estimate equation (2) by fixed effects (FE). Compare your estimates with the RE estimates in part (b). [7]
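For part (c), the following minimal Python sketch shows the within (fixed effects) transformation on a tiny hypothetical panel. The numbers are invented and are not from WAGEPAN; x stands for any time-varying regressor and educ for a regressor that is constant within a person.

```python
from collections import defaultdict

# Hypothetical panel, one row per person-year:
# (person id, time-varying x, time-invariant educ, outcome y)
panel = [
    (1, 1.0, 12, 3.0),
    (1, 2.0, 12, 5.0),
    (1, 3.0, 12, 7.0),
    (2, 1.0, 16, 10.0),
    (2, 2.0, 16, 12.0),
    (2, 3.0, 16, 14.0),
]


def demean_within(rows, col):
    """Subtract each person's own time average from column `col`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row[col])
    means = {pid: sum(vals) / len(vals) for pid, vals in groups.items()}
    return [row[col] - means[row[0]] for row in rows]


x_dm = demean_within(panel, 1)
educ_dm = demean_within(panel, 2)
y_dm = demean_within(panel, 3)

# The within estimator is OLS on the demeaned data (single regressor here).
fe_slope = sum(a * b for a, b in zip(x_dm, y_dm)) / sum(a * a for a in x_dm)

print("demeaned educ:", educ_dm)   # every entry is zero
print("FE slope on x:", fe_slope)
```

Because demeaning removes everything that is constant within a person, it removes c_i, but it also leaves no variation in educ, so a coefficient on a time-invariant regressor cannot be estimated by FE.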

5. A researcher is concerned with estimating the effect of the level of unemployment insurance benefits on the length of unemployment spells. She finds out that the US State Blue recently changed its unemployment insurance programme so that workers with earnings above a certain threshold (group H) will receive higher benefits if they become unemployed, whereas for workers below the earnings threshold (group L) unemployment benefits remain unchanged. The researcher collects information on average unemployment duration (in weeks) in State Blue and neighbouring State Red for both groups of workers (H and L), for the year before the policy change and for the year after. [15 points]

(a) Using the data provided, construct two alternative difference-in-difference estimates of the effect of unemployment benefits on unemployment duration. Discuss the key assumptions underlying the validity of your estimates in each case. [5]
(b) Using the data provided, construct a difference-in-difference-in-difference estimate of the effect of unemployment benefits on unemployment duration. Discuss the assumption underlying the validity of this estimate. [10]
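The arithmetic behind parts (a) and (b) can be sketched in Python as follows. The eight cell means are hypothetical placeholders, not the data provided with the exercise; only the state, group, and period labels come from the question.

```python
# Hypothetical average unemployment durations in weeks (NOT the exercise data),
# keyed by (state, group, period).
means = {
    ("Blue", "H", "before"): 10.0, ("Blue", "H", "after"): 13.0,
    ("Blue", "L", "before"): 12.0, ("Blue", "L", "after"): 13.0,
    ("Red", "H", "before"): 9.0,   ("Red", "H", "after"): 10.5,
    ("Red", "L", "before"): 11.0,  ("Red", "L", "after"): 11.5,
}


def change(state, group):
    """Before-to-after change in mean duration for one state-group cell."""
    return means[(state, group, "after")] - means[(state, group, "before")]


# DiD 1: within State Blue, treated group H against untreated group L.
did_within_blue = change("Blue", "H") - change("Blue", "L")

# DiD 2: within group H, State Blue against neighbouring State Red.
did_across_states = change("Blue", "H") - change("Red", "H")

# DDD: the H-minus-L difference in changes in Blue, net of the same
# difference in Red, where no policy change took place.
ddd = did_within_blue - (change("Red", "H") - change("Red", "L"))

print("DiD (Blue: H vs L):", did_within_blue)
print("DiD (H: Blue vs Red):", did_across_states)
print("DDD:", ddd)
```

Each estimate uses a different comparison group, so each rests on a different common-trends assumption; the triple difference subtracts out both state-specific and group-specific trends.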