MATH 472/572 Computational Statistics – Spring 2020
Homework 6 – Due March 12, Thursday
Instructor: Leming Qu
Rules for HW:
• You are allowed to discuss HW with fellow students in the course, but the work you hand in
must be your own.
• You have to write your own Python code by yourself. You are prohibited from sharing,
copying or editing any Python code from other students.
How to turn in your coding portion of the HW?
• Submit your code in Jupyter Notebook format (.ipynb file) through the Blackboard HW link.
The deadline for code submission is the class starting time, 1:30 PM on the due date.
• Required output (prints and plots) must be included in the Jupyter Notebook. Do not expect
the grader to run the code to see the required output. If the required output is not included
in the Jupyter Notebook, the grader will take points off accordingly.
Coding Assignments:
1. Use the EM algorithm to fit a (1 − π) : π mixture of two Poisson distributions, Poisson(λ1)
and Poisson(λ2), to the following data:
Value        0    1    2    3    4    5    6    7    8    9
Frequency  162  267  271  185  111   61   27    8    3    1
(a) Derive the EM algorithm for the maximum likelihood estimates of π, λ1, λ2. Present your
derivation in a Markdown cell in the Jupyter Notebook.
(b) Implement the EM algorithm for this data set. Present the output of your code in
a format similar to Table 4.1 on page 102 of the book Computational Statistics.
(c) In a single plot, show the relative frequency and fitted probability for the observed data,
respectively, with an appropriate legend.
(d) What is the probability that the value 10 will be observed?
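As an illustration of the general shape of such an EM iteration (not a reference solution), here is a minimal sketch for a two-component Poisson mixture fitted to frequency-table data. The function names `poisson_pmf` and `em_poisson_mixture`, the starting values (0.5, 1.0, 3.0), the iteration cap, and the stopping tolerance are all assumptions chosen here for illustration; other starting values may converge to a different labeling of the two components.

```python
import numpy as np
from math import lgamma

# Data from the frequency table in Problem 1.
values = np.arange(10)
freq = np.array([162, 267, 271, 185, 111, 61, 27, 8, 3, 1], dtype=float)

def poisson_pmf(x, lam):
    """Poisson pmf evaluated elementwise, computed on the log scale."""
    log_fact = np.array([lgamma(k + 1.0) for k in x])
    return np.exp(x * np.log(lam) - lam - log_fact)

def em_poisson_mixture(x, w, pi, lam1, lam2, n_iter=500, tol=1e-10):
    """EM for (1 - pi) * Poisson(lam1) + pi * Poisson(lam2), with counts w."""
    history = []  # log-likelihood at the parameters entering each iteration
    for _ in range(n_iter):
        # E-step: posterior probability that each value came from component 2.
        p1 = (1.0 - pi) * poisson_pmf(x, lam1)
        p2 = pi * poisson_pmf(x, lam2)
        mix = p1 + p2
        history.append(np.sum(w * np.log(mix)))
        z = p2 / mix
        # M-step: weighted proportion and weighted means.
        pi = np.sum(w * z) / np.sum(w)
        lam1 = np.sum(w * (1.0 - z) * x) / np.sum(w * (1.0 - z))
        lam2 = np.sum(w * z * x) / np.sum(w * z)
        # Stop once the log-likelihood has essentially stopped improving.
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
    return pi, lam1, lam2, history

# Hypothetical starting values, chosen only for illustration.
pi_hat, lam1_hat, lam2_hat, loglik = em_poisson_mixture(values, freq, 0.5, 1.0, 3.0)

# Fitted mixture probabilities at the observed values 0..9.
fitted = (1.0 - pi_hat) * poisson_pmf(values, lam1_hat) + pi_hat * poisson_pmf(values, lam2_hat)
```

The same `poisson_pmf` helper can be evaluated at `np.array([10])` to obtain a fitted probability for an unobserved value, and `loglik` records the log-likelihood trajectory for a per-iteration table.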
2. Suppose that the probability density function (PDF) of a bivariate random vector
$X = (X_1, X_2)^T$ is a mixture of bivariate normals:
$$f(x; \theta) = (1 - \pi)\,\phi(x; \mu_1, \Sigma_1) + \pi\,\phi(x; \mu_2, \Sigma_2),$$
where $\phi(x; \mu, \Sigma)$ is the bivariate normal PDF with mean $\mu$ and covariance matrix $\Sigma$, and
$\theta = [\pi, \mu_1, \mu_2, \Sigma_1, \Sigma_2]$.
The data set data mvnorm2mix.csv is a random sample from this two-component normal mixture.
Implement the EM algorithm to find the maximum likelihood estimate (MLE) of the parameter
$\theta$. Present your results as follows:
(a) Print the MLE of the parameter θ.
(b) Plot the value of the log-likelihood function vs iteration number. Comment on the
pattern of the plot.
(c) Classify each observation into one of the two classes. Display the classification results in
a scatter plot with two different colors.
(d) Draw a surface plot of the fitted PDF $f(x; \hat{\theta})$.
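To make the structure of one EM update for this mixture concrete, here is a minimal NumPy-only sketch run on synthetic data. It is not a reference solution: the synthetic sample (two clusters generated with seed 0), the starting values, the fixed count of 50 iterations, and the helper names `mvn_pdf` and `em_step` are all assumptions made here for illustration, and the actual CSV file is not read.

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Multivariate normal density evaluated at each row of x."""
    d = mu.shape[0]
    diff = x - mu
    # Solve sigma @ y = diff^T rather than forming the inverse explicitly.
    sol = np.linalg.solve(sigma, diff.T).T
    quad = np.sum(diff * sol, axis=1)
    norm_const = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(sigma))
    return np.exp(-0.5 * quad) / norm_const

def em_step(x, pi, mu1, mu2, s1, s2):
    """One EM update for (1 - pi) * N(mu1, s1) + pi * N(mu2, s2)."""
    d1 = (1.0 - pi) * mvn_pdf(x, mu1, s1)
    d2 = pi * mvn_pdf(x, mu2, s2)
    loglik = np.sum(np.log(d1 + d2))  # log-likelihood at the incoming parameters
    z = d2 / (d1 + d2)                # E-step: P(component 2 | x_i)
    # M-step: weighted proportion, means, and covariance matrices.
    pi_new = z.mean()
    mu1_new = ((1.0 - z)[:, None] * x).sum(axis=0) / (1.0 - z).sum()
    mu2_new = (z[:, None] * x).sum(axis=0) / z.sum()
    r1, r2 = x - mu1_new, x - mu2_new
    s1_new = (r1 * (1.0 - z)[:, None]).T @ r1 / (1.0 - z).sum()
    s2_new = (r2 * z[:, None]).T @ r2 / z.sum()
    return pi_new, mu1_new, mu2_new, s1_new, s2_new, loglik, z

# Synthetic stand-in data (an assumption; replace with the course data set).
rng = np.random.default_rng(0)
x = np.vstack([
    rng.multivariate_normal([0.0, 0.0], np.eye(2), size=150),
    rng.multivariate_normal([4.0, 4.0], np.eye(2), size=100),
])

# Hypothetical starting values: (pi, mu1, mu2, Sigma1, Sigma2).
params = (0.5, np.array([0.0, 1.0]), np.array([3.0, 3.0]), np.eye(2), np.eye(2))
logliks = []
for _ in range(50):
    *params, ll, z = em_step(x, *params)
    logliks.append(ll)

# Classify each point by its larger posterior probability
# (z comes from the E-step of the final iteration).
labels = (z > 0.5).astype(int)
```

In this sketch `logliks` is the sequence one would plot against the iteration number, and `labels` gives a two-class assignment suitable for coloring a scatter plot. Evaluating `mvn_pdf` on a grid of points is one way to obtain values for a surface plot of the fitted density.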
Non-Coding Assignments:
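3. Let $\{a_1, \ldots, a_n\}$ be a set of positive real numbers. Its arithmetic mean (AM), geometric mean
(GM), and harmonic mean (HM) are defined as:
$$\mathrm{AM} = \frac{1}{n}\sum_{i=1}^{n} a_i, \qquad
\mathrm{GM} = \left(\prod_{i=1}^{n} a_i\right)^{1/n}, \qquad
\mathrm{HM} = \frac{1}{\frac{1}{n}\sum_{i=1}^{n} \frac{1}{a_i}}.$$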
Prove:
HM ≤ GM ≤ AM.
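Before attempting the proof, it can help to check the inequality numerically. The sketch below is a sanity check only, not a proof; the helper name `means`, the random seed, and the sampling range (0.1 to 10) are assumptions made for illustration.

```python
import numpy as np

def means(a):
    """Return (HM, GM, AM) for a collection of positive numbers."""
    a = np.asarray(a, dtype=float)
    am = a.mean()
    gm = np.exp(np.log(a).mean())  # geometric mean via the mean of logs
    hm = 1.0 / np.mean(1.0 / a)
    return hm, gm, am

# Check HM <= GM <= AM on many random positive samples of varying length.
rng = np.random.default_rng(1)
for _ in range(1000):
    a = rng.uniform(0.1, 10.0, size=rng.integers(1, 20))
    hm, gm, am = means(a)
    # Small slack for floating-point rounding.
    assert hm <= gm + 1e-12 and gm <= am + 1e-12
```

Computing GM as the exponential of the mean of the logarithms, as done here, also hints at one possible route to the proof, since it rewrites the geometric mean as an average.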