This is an econometrics assignment.
Part 1
The following questions are from Wooldridge’s Introductory Econometrics, 7e.
Question 1
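Using data from 1988 for houses sold in Andover, Massachusetts, from Kiel and McClain (1995), the following equation relates housing price ($\mathit{price}$) to the distance from a recently built garbage incinerator ($\mathit{dist}$):
$$\widehat{\log(\mathit{price})} = 9.40 + 0.312\,\log(\mathit{dist}), \qquad n = 135,\quad R^2 = 0.162$$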
Q1-1 Interpret the coefficient on $\log(\mathit{dist})$. Is the sign of this estimate what you expect it to be?
Q1-2 Do you think simple linear regression provides an unbiased estimator of the ceteris paribus elasticity of $\mathit{price}$ with respect to $\mathit{dist}$? (Think about the city’s decision on where to put the incinerator.)
Q1-3 What other factors about a house might affect its price? Might these be correlated with distance from the incinerator?
Question 2
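Consider the savings function
$$\mathit{sav} = \beta_0 + \beta_1\,\mathit{inc} + u, \qquad u = \sqrt{\mathit{inc}}\cdot e,$$
where $e$ is a random variable with $E(e) = 0$ and $\mathrm{Var}(e) = \sigma_e^2$. Assume that $e$ is independent of $\mathit{inc}$.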
Q2-1 Show that $E(u \mid \mathit{inc}) = 0$, so that the key zero conditional mean assumption (A3) is satisfied.
Q2-2 Show that $\mathrm{Var}(u \mid \mathit{inc}) = \sigma_e^2\,\mathit{inc}$, so that the homoscedasticity assumption (A4) is violated. In particular, the variance of $\mathit{sav}$ increases with $\mathit{inc}$.
Q2-3 Provide a discussion that supports the assumption that the variance of savings increases with family income.
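A quick simulation can make the claims in Q2-1 and Q2-2 concrete before proving them. The sketch below assumes the error takes the form u = sqrt(inc) * e with e drawn independently of inc. The income range, the value sigma_e = 2, and the use of four equal-width income bins are illustrative choices, not part of the question.

```r
set.seed(123)

n       <- 100000
sigma_e <- 2                                 # assumed standard deviation of e
inc     <- runif(n, min = 1, max = 100)      # hypothetical family incomes
e       <- rnorm(n, mean = 0, sd = sigma_e)  # drawn independently of inc
u       <- sqrt(inc) * e                     # assumed error structure

# Split income into four equal-width bins and summarise u within each bin
bins     <- cut(inc, breaks = 4)
inc_mean <- tapply(inc, bins, mean)
u_mean   <- tapply(u, bins, mean)   # should be close to 0 in every bin
u_var    <- tapply(u, bins, var)    # should rise with income

print(data.frame(inc_mean, u_mean, u_var,
                 theory_var = sigma_e^2 * inc_mean))
```

Within each bin the sample mean of u should sit near zero, while the sample variance should track sigma_e^2 times the average income in that bin.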
Question 3
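We are interested in the birth weight ($\mathit{bwght}$) of infants and the number of cigarettes the mother smoked per day during pregnancy ($\mathit{cigs}$). The following simple regression was estimated using data on $n = 1{,}388$ births:
$$\widehat{\mathit{bwght}} = 119.77 - 0.514\,\mathit{cigs}$$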
Q3-1 What is the predicted birth weight when $\mathit{cigs} = 0$? What about when $\mathit{cigs} = 20$ (one pack a day)? Comment on the difference.
Q3-2 Does this simple regression necessarily capture a causal relationship between the child’s birth weight and the mother’s smoking habits? Explain.
Q3-3 To predict a birth weight of 125 ounces, what would $\mathit{cigs}$ have to be? Comment.
Q3-4 The proportion of women in the sample who did not smoke while pregnant is about 0.85. Does this help reconcile your finding from Q3-3?
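The arithmetic behind Q3-1 and Q3-3 can be checked in a few lines of R. This sketch assumes the fitted line from the corresponding Wooldridge exercise, with intercept 119.77 and slope -0.514 on cigs; the helper name predict_bwght is hypothetical.

```r
# Fitted line assumed from the Wooldridge exercise:
#   bwght_hat = 119.77 - 0.514 * cigs
b0 <- 119.77
b1 <- -0.514

predict_bwght <- function(cigs) b0 + b1 * cigs   # hypothetical helper

pred_0  <- predict_bwght(0)    # predicted weight for a non-smoker
pred_20 <- predict_bwght(20)   # predicted weight at one pack a day

# Solve 125 = b0 + b1 * cigs for cigs
cigs_for_125 <- (125 - b0) / b1

print(c(pred_0 = pred_0, pred_20 = pred_20, cigs_for_125 = cigs_for_125))
```

Because 125 ounces lies above the intercept and the slope is negative, the solved value of cigs comes out negative, which is what the "Comment" prompt in Q3-3 is pointing at.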
Part 2
The next questions use R.
Question 4
In this question we will compare the finite sample properties and the large sample properties of OLS. Let’s say the population regression is
$$y = \beta_0 + \beta_1 x + u$$
where $\beta_0 = 3$ and $\beta_1 = 5$.
Q4-1 Simulate $N = 5000$ (i.e., 5000 data points), save it as a data frame, and plot the histograms of $x$ and $y$. Properly label your graphs (you will lose points if you don’t; you can add + xlab("appropriate label for X") + ylab("appropriate label for Y") to your line of code).
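One way to set up the simulation in Q4-1 is sketched below. The distributions of x and u (both standard normal here), the seed, the column names x and y, and the assignment of 3 to the intercept and 5 to the slope are assumptions made for illustration; substitute whatever the assignment specifies. Base R hist() is used so the block has no package dependencies. With ggplot2, the axis labels would instead be added with + xlab(...) + ylab(...).

```r
set.seed(42)

N     <- 5000
beta0 <- 3   # assumed intercept
beta1 <- 5   # assumed slope

x <- rnorm(N, mean = 0, sd = 1)   # assumed regressor distribution
u <- rnorm(N, mean = 0, sd = 1)   # assumed error distribution
y <- beta0 + beta1 * x + u

# Store the simulated population as a data frame
df <- data.frame(x = x, y = y)

# Labelled histograms of the regressor and the outcome
hist(df$x, main = "Histogram of x",
     xlab = "x (simulated regressor)", ylab = "Frequency")
hist(df$y, main = "Histogram of y",
     xlab = "y (simulated outcome)", ylab = "Frequency")
```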
Q4-2 Now let’s show the unbiasedness of the OLS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$. Do the following steps in R.
- Create a function that will calculate $\hat{\beta}_0$ and $\hat{\beta}_1$ from a sample of size N.
- Initiate your function using
regOLS <- function(N){
}
- Now inside your function, use
samp <- df[sample(nrow(df), N), ]
to draw a sample of size N from your data frame and save it as samp.
- Calculate the OLS $\hat{\beta}_0$ and $\hat{\beta}_1$ based on your samp data.
- Have your function return
data.frame(b0 = __, b1 = __)
- Create 4 empty data frames to store your values of $\hat{\beta}_0$ and $\hat{\beta}_1$:
val1 <- data.frame(b0 = double(), b1 = double())
val2 <- data.frame(b0 = double(), b1 = double())
val3 <- data.frame(b0 = double(), b1 = double())
val4 <- data.frame(b0 = double(), b1 = double())
- Using a for loop, run your regOLS function 100, 500, 1000, and 5000 times, saving $\hat{\beta}_0$ and $\hat{\beta}_1$ each time into your val1, val2, val3, and val4 data frames, respectively. Use N = 5 for your sample size, so that you are running the regression on a sample of size 5 each time. val1 should have 100 rows, val2 should have 500 rows, and so on.
- Report the average of $\hat{\beta}_0$ and $\hat{\beta}_1$ for each of your val data frames by running the following code as is:
results = data.frame(n= double(), beta0_avg = double(), beta1_avg = double())
results[1:4,'n'] =c(100,500,1000,5000)  # number of replications; column name matches 'n' defined above
results[1,2:3] = colMeans(val1)
results[2,2:3] = colMeans(val2)
results[3,2:3] = colMeans(val3)
results[4,2:3] = colMeans(val4)
print(results)
Show the output of print(results) for credit.
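The steps in Q4-2 could be assembled roughly as follows. This is a sketch under stated assumptions, not the expected solution: the stand-in population df (standard normal x and u, intercept 3, slope 5) is there only so the block runs on its own, the closed-form OLS formulas are one of several ways to get the estimates (lm() would also work), and fill_vals is a hypothetical helper that wraps the for loop.

```r
set.seed(1)

# Stand-in population so the block is self-contained (assumed distributions)
df <- data.frame(x = rnorm(5000))
df$y <- 3 + 5 * df$x + rnorm(5000)

regOLS <- function(N) {
  samp <- df[sample(nrow(df), N), ]         # draw N rows without replacement
  b1 <- cov(samp$x, samp$y) / var(samp$x)   # OLS slope
  b0 <- mean(samp$y) - b1 * mean(samp$x)    # OLS intercept
  data.frame(b0 = b0, b1 = b1)
}

# Hypothetical helper: run regOLS(N) `reps` times in a for loop,
# appending one row of estimates per replication
fill_vals <- function(reps, N = 5) {
  out <- data.frame(b0 = double(), b1 = double())
  for (i in seq_len(reps)) {
    out <- rbind(out, regOLS(N))
  }
  out
}

val1 <- fill_vals(100)
val2 <- fill_vals(500)
val3 <- fill_vals(1000)
val4 <- fill_vals(5000)

# Averages of the estimates across replications
print(rbind(colMeans(val1), colMeans(val2), colMeans(val3), colMeans(val4)))
```

Growing a data frame with rbind inside a loop is slow for large replication counts; it is used here only because it mirrors the empty-data-frame setup the question asks for.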
Q4-3 Interpret your results. Does having a small sample size of 5 matter in terms of expected values?
Q4-4 Since we simulated the unbiasedness of the OLS estimator, now let’s simulate its consistency. Using the same regOLS function from before, run the function four times, once each with $N = 10, 50, 500, 5000$. You can just run the code below.
results = data.frame(n= double(), beta0_avg = double(), beta1_avg = double())
results[1:4,'n'] =c(10,50,500,5000)
results[1,2:3] = colMeans(regOLS(10))
results[2,2:3] = colMeans(regOLS(50))
results[3,2:3] = colMeans(regOLS(500))
results[4,2:3] = colMeans(regOLS(5000))
print(results)
Show the output of print(results) for credit.
Q4-5 Interpret your results. What happens to your estimators as the sample size $N$ increases?
Q4-6 Explain the difference between unbiasedness (finite sample property) and consistency (large sample / asymptotic property).
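The distinction in Q4-6 can be seen numerically by looking at both the average and the spread of the slope estimate across many repeated samples. The sketch below is illustrative only: the stand-in population pop (standard normal x and u, intercept 3, slope 5), the sample sizes, the 1000 replications, and the helper name slope_once are all assumptions.

```r
set.seed(2024)

# Stand-in population (assumed): y = 3 + 5x + u with standard normal x and u
pop <- data.frame(x = rnorm(5000))
pop$y <- 3 + 5 * pop$x + rnorm(5000)

# OLS slope from one random sample of size N (hypothetical helper)
slope_once <- function(N) {
  s <- pop[sample(nrow(pop), N), ]
  cov(s$x, s$y) / var(s$x)
}

sizes <- c(5, 50, 500)

# For each sample size, repeat the estimation 1000 times and
# record the mean and standard deviation of the slope estimates
summary_tab <- do.call(rbind, lapply(sizes, function(N) {
  b1 <- replicate(1000, slope_once(N))
  data.frame(N = N, mean_b1 = mean(b1), sd_b1 = sd(b1))
}))

print(summary_tab)
```

The mean_b1 column should stay close to the assumed slope at every sample size, which is the unbiasedness property, while sd_b1 should shrink as N grows, which is what consistency describes.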