Question 1 (17 marks)
Background: The Gamma Distribution
In this question we will be looking at the gamma distribution. This is a distribution frequently used in the analysis of non-negative real numbers. This is introduced in Lecture 6, but here we will be using a difffferent (but equivalent) form parameterised instead in terms of the mean µ and shape φ with probability density p(y | µ, φ) =1Γ(φ) φµ φ yφ−1 exp − φy µ . where Γ(·) is known as the “gamma function”. Figure 1 shows several example gamma distributions.
The following facts about the gamma distribution will prove very helpful for answering the questions.
If Y ∼ Ga(µ, φ) is a random variable following a gamma distribution with mean µ and shape φ then
E [Y ] = µ, and V [Y ] =µ2φ
so that µ controls the mean, and φ inversely scales the variance. Given a sample y = (y1, . . . , yn) the maximum likelihood estimate, and Fisher information, for µ are well known to be ˆµ = ¯y and Jn(µ) =nφ µ2
(1)respectively, where y¯ = (1/n)P n i=1yi is the sample mean. The estimates for φ are complicated, so initially we will proceed assumingφ is known; we will later relax this assumption.
Bayesian Analysis of the Gamma Distribution with Known φ
Let us now consider a Bayesian analysis of the gamma distribution with known φ using the hierarchy:
yi | µ ∼ Ga(µ, φ), i = 1, . . . , n
µ | a, b ∼ IG(a, b)
where IG(a, b) denotes an inverse-gamma distribution with shape a and scale b1 . A useful fact is that if Y ∼ IG(a, b) and a > 1 then E [Y ] =b a − 1.
One reason people use the inverse-gamma distribution as a prior for the mean of the gamma distribution is that it is flflexible; Figure 2 demonstrates several possible difffferent inverse-gamma distributions. It is also convenient, as the posterior distribution is then also an inverse gamma distribution of the form µ | y, φ, a, b ∼ IG (nφ + a, nφy¯ + b).
The posterior mean of µ is then given by
E [µ | y] ≡ ˆµa,b =φny¯ + b φn + (a − 1).
(2)which can be used as an estimate (best guess) of the unknown population µ. You should install the R package invgamma. This gives you access to the functions dinvgamma(), pinvgamma() and qinvgamma() which will be crucial in answering some questions.
Consider using (2) as an estimate of the unknown population µ from a sample of data y = (y1, . . . , yn) coming from a gamma distribution with mean µ and shape φ. The bias and variance of this estimator is biasµ(ˆµa,b) =(1 − a)µ + b nφ + a − 1, Vµ [ˆµa,b] =nµ2φ (nφ + a − 1)2
(3)Please answer the following questions. Provide appropriate working/mathematical justifification.
- If φ = 1, how can we interpret the hyperparameters a and b? [2 marks]
- Describe the dependency of the variance of the estimator on the population values of µ and φ. [1 mark]
- Is the estimator (2) asymptotically effiffifficient as n → ∞? [2 marks]
It is common to use so called “normalized” squared-error for many problems in which the mean and variance are linked. For the gamma distribution the normalised squared-error is
L(µ, ˆµ) =(ˆµ − µ)2(µ2/φ).
and the normalized squared-error risk is therefore
R(µ, ˆµ) = µφ2 E (ˆµ − µ)2 .
Please answer the following questions regarding this loss function.
- Why do you think the normalized squared-error loss might be preferable to the usual (unnormalized) squared-error loss? [1 mark]
- Making use of (3), what is the normalized squared-error risk for the estimator when b = 0? Comment on the risk. [2 marks]
- Prove that when a = 1, the choice b = 0 dominates the choice b > 0. [1 mark]
- Plot the normalized squared-error risk (expected normalized squared-error) for n = 5, φ = 2 and µ ∈ (0.01, 10) for (a = 1, b = 0) (i.e., maximum likelihood), (a = 1/2, b = 1) and (a = 2, b = 1/2).
You may plot all curves on the one plot. For readability, use a logarithmic scale for the y-axis and set the limits for the y-axis from (0.1, 2). Describe the key features of the risk curves. Over the range of µ values considered in our plot do any of the estimators appear to be inadmissable (that is, not admissable)? [3 marks]
- Given an a > 1 and a b > 0, for which population value of µ does the estimator (2) attain the smallest normalized squared-error risk? [2 marks]
Sometimes the alternative log-mean parameterisation v = log µ is used for the gamma distribution.
This is in some (statistical) ways a more natural parameterisation than the mean µ. Under this parameterisation the probability density is
p(y | v, φ) =1Γ(φ)
Question 2 (15 marks)
In this question we will use the gamma distribution to analyse some topical and relevant data: case numbers of people in Victoria, Australia infected with the novel coronavirus (Covid-19). The data we will use was obtained from the the Victorian government public health website. In particular, we will analyse the daily case numbers for the month of July, 2022 from covid.daily.cases.07.2022.csv.
You will also need to download the fifile gamma.fit.R. This contains functions that will assist you in fifitting a gamma distribution to the data; in particular it contains code for maximum likelihood estimation and Bayesian sampling.
Let us fifirst analyse the data using maximum likelihood and the asymptotic distribution of maximum likelihood estimates. Please answer the following questions. Provide working/justififications/code as required to support your answers.
- Fit a gamma distribution to the data using maximum likelihood via the function gamma.ml(y).What is the estimated mean number of daily cases µ for the month of July 2022? Provide an approximate 95% confifidence interval for this estimate using the asymptotic distribution of the maximum likelihood estimate. [3 marks]
- The average daily case numbers in the previous month was roughly 7,200. Using the asymptotic distribution of the ML estimate of µ, test the hypothesis that the average number of daily cases in July 2022 is the same as in June 2022. Clearly specify the hypothesis you are testing and discuss the resulting p-value. [3 marks]
Let us perform a Bayesian analysis. We will use the function gamma.sampling(y,a,b,n.samples) to generate samples from the posterior distribution for the following hierarchy:
yj | µ, φ ∼ Ga(µ, φ)
µ | a, b ∼ IG(a, b)
φ ∼ C+(0, 1) where y is the data, a and b are the hyperparameters for the inverse-gamma prior distribution for µ,and n.samples is the number of samples to generate. The hierarchy uses a default weakly informative half-Cauchy prior for φ, so we do not need to specify any hyperparameters for this prior. Using the daily case numbers from the previous four and a half months, we fifind that a reasonable prior guess at the average daily case numbers is around 8,500, with a 95% prior probability of the average daily case numbers ranging anywhere from 5,000 to 13,000 cases.
- Select appropriate values of a and b for our inverse gamma prior distribution to (approximately) encode the specifified information (hint: use the pinvgamma() function). [2 marks]
- Generate 100, 000 samples from the posterior distribution using your specifified prior hyperparameters. Plot the prior and posterior distributions of µ (hint: use the dinvgamma() function and histogram, as appropriate). [2 marks]
- Report on the posterior mean and 95% credible intervals for µ. How do these compare to our maximum likelihood estimates? (hint: use the quantile() function) [2 marks]
- Do you think the Bayesian analysis supports the hypothesis that the average daily case numbers for July, 2022 are the same as June, 2022? [1 mark]
- It is important for authorities to have an idea on the likely range of daily case numbers. Provide an estimate of the 97.5-th percentile of daily case numbers using your Bayesian samples. Provide a 95% credible interval for this estimate (hint: use the qgamma.mu() function provided).[2 marks]
EasyDue™ 支持PayPal, AliPay, WechatPay, Taobao等各种付款方式!
E-mail: email@example.com 微信:easydue