HW#4: From Features to Clustering and Classification
I. Parametric Methods
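1. The normal density function can be written in the form:
$$ p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\} \qquad (1) $$
Prove:
The mean: $E[x] = \int_{-\infty}^{\infty} x \, p(x) \, dx = \mu$
The variance: $E[(x-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2 \, p(x) \, dx = \sigma^2$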
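A numerical sanity check can be useful before attempting the proof in problem 1. The sketch below integrates the one-dimensional normal density with a simple midpoint rule and reports its total mass, mean, and variance. The function names, the integration range of ten standard deviations, and the step count are illustrative choices, not part of the assignment, and a numerical check does not replace the analytic proof.

```python
import math


def normal_pdf(x, mu, sigma):
    """One-dimensional normal density with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)


def numeric_moments(mu, sigma, half_width=10.0, steps=20000):
    """Midpoint-rule estimates of the total mass, mean, and variance of the density.

    Integrates over [mu - half_width*sigma, mu + half_width*sigma]; the tails
    outside that range are assumed to be negligible.
    """
    lo = mu - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    xs = [lo + (i + 0.5) * dx for i in range(steps)]
    ps = [normal_pdf(x, mu, sigma) for x in xs]
    total = sum(ps) * dx
    mean = sum(x * p for x, p in zip(xs, ps)) * dx
    var = sum((x - mean) ** 2 * p for x, p in zip(xs, ps)) * dx
    return total, mean, var


if __name__ == "__main__":
    # Arbitrary example parameters.
    print(numeric_moments(mu=1.5, sigma=2.0))
```

Trying a few different values of mu and sigma shows which quantities the estimates track, which is the pattern the proof has to establish.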

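2. In d dimensions the Gaussian distribution is:
$$ p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu}) \right\} \qquad (2) $$
where $\boldsymbol{\mu}$ is a d-dimensional mean vector and $\Sigma$ is a d × d covariance matrix.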
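2.1 Show that there exists a transformation to a new coordinate system, defined by the eigenvectors of $\Sigma$, such that the density of the transformed variables $y_1, \dots, y_d$ can be written as a product of one-dimensional Gaussians, $p(\mathbf{y}) = \prod_{i=1}^{d} p(y_i)$.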
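The change of coordinates in 2.1 can be illustrated numerically in the two-dimensional case, where the eigenvectors of a symmetric matrix have a closed form. The sketch below rotates a covariance matrix into its eigenvector basis and returns the rotated matrix, whose off-diagonal entries should vanish. The function name `diagonalize_2x2` and the example covariance values are hypothetical, and the restriction to d = 2 is a simplification; the exercise itself asks for the general d-dimensional argument.

```python
import math


def diagonalize_2x2(a, b, c):
    """Rotate the symmetric covariance [[a, b], [b, c]] into its eigenvector basis.

    Returns (u1, u2, rotated), where u1 and u2 are orthonormal eigenvectors
    and rotated = U^T Sigma U with U = [u1 u2].
    """
    # Rotation angle that zeroes the off-diagonal term of a 2x2 symmetric matrix.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    u1 = (math.cos(theta), math.sin(theta))
    u2 = (-math.sin(theta), math.cos(theta))
    sigma = [[a, b], [b, c]]

    def quad(u, v):
        # Bilinear form u^T Sigma v.
        return sum(u[i] * sigma[i][j] * v[j] for i in range(2) for j in range(2))

    rotated = [[quad(u1, u1), quad(u1, u2)],
               [quad(u2, u1), quad(u2, u2)]]
    return u1, u2, rotated


if __name__ == "__main__":
    # Made-up covariance with correlated components.
    u1, u2, rotated = diagonalize_2x2(2.0, 0.8, 1.0)
    print("eigenvectors:", u1, u2)
    print("rotated covariance:", rotated)
```

The diagonal entries of the rotated matrix are the eigenvalues, that is, the variances along the new axes. With a diagonal covariance the quadratic form in the exponent splits into a sum of one-dimensional terms, which is the step the written argument needs to justify.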
2.2 Show: 
3. Parametric Estimation using the Likelihood Function:
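The maximum likelihood technique estimates a parameter set $\theta$ by maximizing the likelihood $L(\theta)$:
$$ L(\theta) = p(X \mid \theta) = \prod_{n=1}^{N} p(\mathbf{x}^n \mid \theta) \qquad (3) $$
where $X = \{\mathbf{x}^1, \dots, \mathbf{x}^N\}$ is a data set of N vectors.
In practice, we define an error function, E, to be minimized:
$$ E = -\ln L(\theta) = -\sum_{n=1}^{N} \ln p(\mathbf{x}^n \mid \theta) \qquad (4) $$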
For the Gaussian distribution in one dimension (equation 1),
find the values of the mean and variance that minimize the error function
(equation 4).
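An analytic answer to problem 3 can be cross-checked numerically. The sketch below evaluates the negative log-likelihood of a one-dimensional Gaussian and runs a coarse grid search over candidate means and variances. The data values, grid ranges, and function names are made up for illustration, and a grid search only locates the minimum to within the grid spacing, so it is a check on the derivation rather than a substitute for it.

```python
import math


def error_function(data, mu, var):
    """E = -sum_n ln p(x_n | mu, var) for a one-dimensional Gaussian."""
    n = len(data)
    squared = sum((x - mu) ** 2 for x in data)
    return 0.5 * n * math.log(2.0 * math.pi * var) + squared / (2.0 * var)


def grid_search(data, mus, variances):
    """Return (E, mu, var) for the grid point with the smallest error."""
    return min((error_function(data, m, v), m, v) for m in mus for v in variances)


if __name__ == "__main__":
    # Made-up sample; replace with real data when checking a derivation.
    data = [2.1, 1.9, 3.2, 2.8, 2.0, 2.6]
    mus = [1.0 + 0.01 * i for i in range(301)]       # candidate means, step 0.01
    variances = [0.01 * i for i in range(1, 201)]    # candidate variances, step 0.01
    print(grid_search(data, mus, variances))
```

Comparing the grid minimum with the closed-form expressions obtained by setting the derivatives of E to zero is a quick way to catch sign or normalization mistakes.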
II. Non-Parametric Methods (**requires coding. Additional info + data links will be provided)
Code the K-means algorithm for clustering samples into K Gaussian clusters (follow the pseudocode given in class), given the following training set (samples in X-Y coordinates).
Output: the optimal K you find for the data, with the corresponding K cluster means and covariance matrices.
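As a starting point, the sketch below shows the standard K-means iteration for a fixed K on 2-D points, followed by the covariance matrix of one cluster. It is not the pseudocode from class, which is not reproduced here. The function names, the random initialization from the data, the empty-cluster handling, and the toy points are all assumptions made for illustration, and the choice of the optimal K is deliberately left open.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Standard K-means on a list of (x, y) tuples; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: initialize from k distinct samples
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: (p[0] - centers[j][0]) ** 2 + (p[1] - centers[j][1]) ** 2)
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return centers, labels


def cluster_covariance(members, center):
    """Maximum likelihood (divide by n) 2x2 covariance of one cluster."""
    n = len(members)
    sxx = sum((x - center[0]) ** 2 for x, _ in members) / n
    syy = sum((y - center[1]) ** 2 for _, y in members) / n
    sxy = sum((x - center[0]) * (y - center[1]) for x, y in members) / n
    return [[sxx, sxy], [sxy, syy]]


if __name__ == "__main__":
    # Toy data: two well-separated groups.
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    centers, labels = kmeans(pts, k=2)
    for j, c in enumerate(centers):
        members = [p for p, lab in zip(pts, labels) if lab == j]
        print(c, cluster_covariance(members, c))
```

K-means can converge to different solutions from different initializations, so running it with several seeds and comparing the within-cluster sum of squared distances is a common precaution when comparing values of K.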
III. Semi-Parametric Methods: Gaussian mixture models
Go over section 2.6 in Bishop on mixture models.
- Explain in your own words the intuition behind the update equations (2.96-2.98).
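Running the updates on a small example may help when forming that intuition. The sketch below performs one EM iteration for a one-dimensional Gaussian mixture, assuming the referenced equations are the usual EM updates for the component means, variances, and mixing coefficients; the equations themselves are not reproduced here, so that correspondence is an assumption. The function names, the 1-D simplification, and the toy data are also illustrative only.

```python
import math


def gaussian(x, mu, var):
    """One-dimensional normal density with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_step(data, priors, means, variances):
    """One EM iteration; returns updated (priors, means, variances)."""
    k = len(priors)
    # E-step: posterior probability of each component for each point.
    resp = []
    for x in data:
        joint = [priors[j] * gaussian(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint)
        resp.append([w / total for w in joint])
    # M-step: responsibility-weighted averages over the data.
    n = len(data)
    new_priors, new_means, new_vars = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)  # effective number of points in component j
        mu = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mu) ** 2 for r, x in zip(resp, data)) / nj
        new_priors.append(nj / n)
        new_means.append(mu)
        new_vars.append(var)
    return new_priors, new_means, new_vars


if __name__ == "__main__":
    # Toy data drawn by hand around 0 and 5; the starting values are arbitrary.
    data = [0.0, 0.2, -0.2, 5.0, 5.2, 4.8]
    params = ([0.5, 0.5], [1.0, 4.0], [1.0, 1.0])
    for _ in range(20):
        params = em_step(data, *params)
    print(params)
```

Printing the responsibilities after each step shows how every data point is shared between components in proportion to its posterior probability, in contrast to the hard assignments used by K-means.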