HW#4: From Features to Clustering and Classification



I. Parametric Methods

1. The normal density function can be written in the form:

(1)  $p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)$

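Prove:
The mean: $E[x] = \int_{-\infty}^{\infty} x \, p(x) \, dx = \mu$
The variance: $E[(x-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2 \, p(x) \, dx = \sigma^2$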
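
Before attempting the proof, the two claims can be checked numerically. The sketch below assumes equation (1) is the standard normal density with mean mu and standard deviation sigma; the parameter values and the integration grid are arbitrary choices made for illustration, not part of the assignment.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """One-dimensional normal density (assumed form of equation 1)."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

# Arbitrary illustrative parameters.
mu, sigma = 1.5, 0.7

# Integrate over +/- 10 standard deviations with a simple Riemann sum;
# the tails beyond that range contribute a negligible amount.
x = np.linspace(mu - 10 * sigma, mu + 10 * sigma, 200_001)
dx = x[1] - x[0]
p = normal_pdf(x, mu, sigma)

total = np.sum(p) * dx                      # integral of p(x), should be close to 1
mean = np.sum(x * p) * dx                   # integral of x p(x), should be close to mu
var = np.sum((x - mean) ** 2 * p) * dx      # integral of (x - mean)^2 p(x), close to sigma^2

print(total, mean, var)
```

A numerical check like this does not replace the analytic proof, but it is a quick way to catch a sign or normalization mistake in the derivation.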

 
2. In d-dimensions the Gaussian distribution is:
 
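(2)  $p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu}) \right)$
where $\mathbf{x}$ and $\boldsymbol{\mu}$ are d-dimensional vectors and $\Sigma$ is a $d \times d$ covariance matrix.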

 
 

2.1 Show that there exists a transformation to a new coordinate system, defined by the eigenvectors of $\Sigma$, such that the transformed variables can be written as .

2.2 Show: 
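
The change of coordinates in 2.1 can be explored numerically. The sketch below assumes a symmetric positive-definite covariance matrix; the names `Sigma`, `U`, and `Y` and the specific numbers are illustrative choices. It rotates centered samples into the eigenvector basis and compares the covariance of the rotated samples with the diagonal matrix of eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-D mean and covariance (assumed values, not from the assignment).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

# eigh handles symmetric matrices; the columns of U are orthonormal eigenvectors.
eigvals, U = np.linalg.eigh(Sigma)

# In the eigenvector basis the covariance matrix becomes diagonal.
D = U.T @ Sigma @ U

# Draw samples and apply y = U^T (x - mu) to every row.
X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = (X - mu) @ U

# Empirical covariance of the transformed samples: approximately diag(eigvals).
cov_Y = np.cov(Y, rowvar=False)

print(np.round(D, 6))
print(np.round(cov_Y, 3))
```

Because the off-diagonal entries vanish in the new coordinates, the quadratic form in the exponent separates into a sum over the individual coordinates, which is the structure the exercise asks you to establish analytically.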

3. Parametric Estimation using the Likelihood Function:
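The maximum likelihood technique estimates a parameter set $\theta$ by maximizing the likelihood $L(\theta)$:
(3)  $L(\theta) = p(X \mid \theta) = \prod_{n=1}^{N} p(\mathbf{x}^n \mid \theta)$
where $X = \{\mathbf{x}^1, \ldots, \mathbf{x}^N\}$ is a data set of N vectors.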
In practice, we define an error function, E, to be minimized:

(4)  $E = -\ln L(\theta) = -\sum_{n=1}^{N} \ln p(\mathbf{x}^n \mid \theta)$

For the Gaussian distribution in one dimension (equation 1), find the values of the mean and variance that minimize the error function (equation 4).
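
A derived answer can be sanity-checked numerically. The sketch below assumes the error function is the negative log-likelihood of the one-dimensional Gaussian and uses synthetic data; the function name `neg_log_likelihood` and the perturbation sizes are illustrative. It evaluates E at the sample mean and the 1/N sample variance, then confirms that nearby parameter values do not give a smaller E.

```python
import numpy as np

def neg_log_likelihood(x, mu, var):
    """E = -sum_n ln p(x_n | mu, var) for a 1-D Gaussian (assumed form of equation 4)."""
    n = x.size
    return 0.5 * n * np.log(2 * np.pi * var) + np.sum((x - mu) ** 2) / (2 * var)

rng = np.random.default_rng(1)
x = rng.normal(loc=3.0, scale=2.0, size=1000)   # synthetic data for illustration

# Candidate minimizers: sample mean and the 1/N (biased) sample variance.
mu_hat = x.mean()
var_hat = np.mean((x - mu_hat) ** 2)
e_min = neg_log_likelihood(x, mu_hat, var_hat)

# Evaluate E on a small grid of perturbed parameters around the candidates.
perturbed = [
    neg_log_likelihood(x, mu_hat + dm, var_hat * scale)
    for dm in (-0.1, 0.0, 0.1)
    for scale in (0.9, 1.0, 1.1)
]

print(mu_hat, var_hat, e_min, min(perturbed))
```

Note that the variance candidate divides by N rather than N - 1. The check only probes a few nearby points, so it supports a derivation but does not replace setting the derivatives of E to zero.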
 
 

II. Non-Parametric Methods (requires coding; additional info and data links will be provided)
Code the K-means algorithm for clustering samples into K Gaussian clusters (follow the pseudocode given in class), given the following training set (samples in X-Y coordinates).
Output: the optimal K you find for the data, together with the corresponding K cluster means and covariance matrices.
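
As a starting point, here is a minimal sketch of the standard batch (Lloyd-style) K-means iteration. It is not the pseudocode given in class, which may differ in initialization or stopping rule. The function names `kmeans`, `cluster_covariances`, and `within_sse`, and the synthetic three-blob data, are assumptions made for illustration; substitute the provided training set once it is available.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest center (squared Euclidean distance) for each sample."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Batch K-means: alternate assignment and mean-update steps."""
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct samples chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        # Recompute each center as its cluster mean; keep the old center if empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assign(X, centers)

def cluster_covariances(X, labels, k):
    """Per-cluster covariance with 1/N normalization; zeros if fewer than 2 points."""
    d = X.shape[1]
    covs = np.zeros((k, d, d))
    for j in range(k):
        pts = X[labels == j]
        if len(pts) >= 2:
            covs[j] = np.cov(pts, rowvar=False, bias=True)
    return covs

def within_sse(X, centers, labels):
    """Total squared distance of samples to their assigned centers."""
    return float(((X - centers[labels]) ** 2).sum())

# Synthetic stand-in for the training set: three 2-D Gaussian blobs.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.5, size=(100, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.5, size=(100, 2)),
    rng.normal(loc=(0.0, 5.0), scale=0.5, size=(100, 2)),
])

centers, labels = kmeans(X, k=3)
covs = cluster_covariances(X, labels, k=3)
sse_by_k = {k: within_sse(X, *kmeans(X, k)) for k in (1, 2, 3, 4, 5)}
print(centers)
print(sse_by_k)
```

The within-cluster sum of squares generally keeps shrinking as K grows, so it cannot be minimized directly to pick K. Looking for the K at which the decrease levels off is one common heuristic; the assignment leaves the selection criterion to you. K-means can also converge to a poor local minimum, so running it from several seeds is a reasonable precaution.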

 
 

III. Semi-Parametric Methods: Gaussian mixture models

Go over section 2.6 in Bishop on mixture models.