Unless otherwise specified, questions have unit value. The total value of the
assignments will vary substantially from week to week.
Recall that assignments are graded rather loosely on effort, and that 3/4 of
the total marks (1/2 for ugrads) over all assignments over all weeks
represents 100%. This policy is in place partly to allow for error in the
grading approach, which is necessarily somewhat subjective and must be done
somewhat superficially. It is recommended (and requested) that you try to
overshoot the 3/4 requirement, rather than worry about the details of how the
grading is done.
Problems denoted EXTRA can be substituted for other problems, or done in
addition, but they do not count towards the computation of the 3/4
requirement. They may be discussed in class depending on time and interest.
They are problems that I think might be useful, and that would likely be
assigned if we had more time per chapter.
Sometimes you will explicitly have to choose some of your own problems. Even
when this is not the case, you can substitute some problems in the book for
non-programming assignments if they appear more helpful to you. For now, limit
the number of substitutions to 50% of what you hand in. This parameter may be
increased or decreased as we go on.
You are encouraged to discuss the problems with your peers, but I would like
individual final submissions demonstrating effort and understanding of what
was done. If you end up working closely with someone on a problem set, make a
note on your submission saying who it was. For programming assignments, each
person should turn in their own work.
Since this is a graduate-level research course that is graded predominantly
on effort, I am confident that there will not be any problems with academic
honesty. However, do note that non-negligible deviations are often
surprisingly easy to spot, and can be verified by discussing the submitted
solutions with the student.
-----------------------------------------------------------------------------
Problems for Week 8.
Total value is 9.
Due Thursday. It is possible that we will discuss some of the first ones in
Tuesday's class, so it would be appreciated if the majority of the class had
begun to think about them by Tuesday.
-----------------------------------------------------------------------------
1. First, if your memory on eigenvectors is hazy, review the material in
appendix C, starting at the bottom of page 698 (nothing needs to be handed
in). Second, some of the development of PCA makes better sense if you
realize that the eigenvalues of the covariance matrix are 1) real; and 2)
non-negative. Explain or show why this is the case.
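   A quick Matlab sanity check of the claim (not a proof, and nothing to hand
   in) might look something like this:

       % Sanity check: the eigenvalues of a covariance matrix should come
       % out real and non-negative (up to roundoff).
       D = randn(50, 4);                      % random data, rows are points
       D = D - repmat(mean(D, 1), 50, 1);     % subtract the mean
       C = (1/50) * (D' * D);                 % covariance matrix
       eig(C)                                 % all real, all >= 0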
2. Fill in the details needed to get equation 12.5.
3. Explain how a PCA-based model can be viewed as a generative model. Consider
   whether this interpretation provides any intuition as to when PCA makes
   sense.
For example, consider the case where a GMM with 3 clusters is a very good
model. Intuitively, would PCA be just as good?
4. (The following problem description is long, but the solution is not.)
   Finding the eigenvalues/eigenvectors of a matrix A is referred to as the
   eigenvector decomposition of A (A = V*E*V', writing E for the diagonal
   eigenvalue matrix to avoid a clash with the data matrix D below). Another
   important matrix factorization is the singular value decomposition (SVD).
   Here A = U*S*V', where S is a diagonal matrix of singular values, and U
   and V are orthogonal.
Importantly, A need not be square, and the SVD always exists. SVD is
available in Matlab as svd().
Let's put the N data points into the rows of a data matrix D. For convenience,
assume that the mean of the data has been subtracted, so that the mean of
the data is zero. Then, in the PCA development, we use the eigenvectors
from the eigenvector decomposition of the covariance matrix, C=(1/N)*D'*D.
Consider the SVD of the matrix D, and plug it into the covariance formula
to yield an alternative way to get what is needed for PCA. (This is useful
because SVD is a very stable computation.)
Having got this far, I suggest using Matlab to generate a random matrix D,
and verifying the above. In general, Matlab is a great tool for verifying
your understanding of the math.
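   For what it's worth, a minimal sketch of that verification (using the
   conventions above) might be:

       % Check: the right singular vectors of D should match the
       % eigenvectors of C = (1/N)*D'*D (up to sign and ordering), and
       % diag(S).^2 / N should match the eigenvalues.
       N = 100;  d = 5;
       D = randn(N, d);
       D = D - repmat(mean(D, 1), N, 1);      % zero-mean data
       C = (1/N) * (D' * D);
       [V1, E] = eig(C);
       [U, S, V2] = svd(D, 'econ');
       disp(sort(diag(E), 'descend')');       % eigenvalues of C
       disp((diag(S).^2 / N)');               % squared singular values / N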
5. Value is FIVE
For this problem, you will implement some aspects of PCA. While there may
exist software to do some of these tasks, for the purposes of this
assignment, you should implement things at the matrix level. Of course, you
can make use of facilities like eigenvalue solvers (e.g. Matlab eig() is
OK).
It is common to make the data have zero mean before doing PCA (some sources
even assume that this is part of PCA), especially when the data is all
positive, as is the case here. To promote consistency among solutions, let's
work with data that has zero mean (i.e., subtract the mean from any data
that you analyze with PCA).
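   One way to do this in Matlab (assuming the rows of D are the data points)
   is:

       D = D - repmat(mean(D, 1), size(D, 1), 1);   % zero-mean columns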
I have generated face and non-face data similar to that for assignment 3,
except that I have broken the images into a 12 x 12 grid (not 7 x 7).
Specifically, the 144 numbers refer to block averages of gray values, taken
one row at a time.
Some face data is linked here.
Some non-face data is linked here.
Do PCA on the face data. Make an image out of the first eigenvector. (This
is similar in spirit to figure 12.3). Is it reasonable? Do the same for the
non-face data. Are the images noticeably different?
(Depending on your background, creating images will either be trivial,
worth learning about, or not worth the trouble).
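   If it helps, a minimal Matlab sketch (assuming V holds the eigenvectors as
   columns, sorted by decreasing eigenvalue, and recalling that the 144
   numbers were taken one row at a time) is:

       img = reshape(V(:, 1), 12, 12)';   % transpose: reshape is column-major
       imagesc(img);  colormap(gray);  axis image;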
Combine the two data sets, and do PCA. Plot the eigenvalues in decreasing
order. If you had to choose a good number of eigenvalues to capture the
data, what would it be?
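   A sketch of the plot (with [V, E] from eig() of the combined-data
   covariance) might be:

       [lambda, idx] = sort(diag(E), 'descend');   % sorted eigenvalues
       V = V(:, idx);                              % reorder eigenvectors
       plot(lambda, 'o-');
       xlabel('component');  ylabel('eigenvalue');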
Compute the distortion as developed in section 12.1.2 for your chosen
number of eigenvalues. Verify computationally that the distortion is also
available from a sum of eigenvalues (i.e., equation 12.18).
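   A minimal sketch of that check (with M your chosen number of eigenvalues,
   V and lambda sorted in decreasing order as above, and D the centered
   combined data) might be:

       % M: your chosen number of components.
       Vm = V(:, 1:M);                       % top M eigenvectors
       Dp = D * Vm * Vm';                    % project down and reconstruct
       J1 = mean(sum((D - Dp).^2, 2));       % average squared error
       J2 = sum(lambda(M+1:end));            % should match J1 (equation 12.18)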
Plot the projections of the face and non-face points onto the two
dimensional subspace defined by the first two eigenvectors. The face and
non-face projected points should be distinguishable (e.g. different color).
Does this suggest a method for classification?
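   For the plot, a sketch (with hypothetical names Dface and Dnon for the two
   data sets, each with the combined mean subtracted, and V sorted as above)
   might be:

       % Dface, Dnon: hypothetical names for the centered face/non-face data.
       P  = V(:, 1:2);                       % first two eigenvectors
       Yf = Dface * P;                       % 2-d projections of the faces
       Yn = Dnon  * P;                       % ... and of the non-faces
       plot(Yf(:,1), Yf(:,2), 'r.', Yn(:,1), Yn(:,2), 'b.');
       xlabel('first component');  ylabel('second component');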
(While this is arguably a pretty naive approach, it did get some notoriety
in the early '90s. As usual, Wikipedia knows all: see this link.)