Spring 2012 - ISTA 410/510: Bayesian Modeling and Inference

Assignment One (Due Wednesday January 23)
Ugrad: 10 points
Grad: 10 points
This assignment should be done individually

The purpose of this assignment is to become familiar with Matlab, as well as explore a few ideas to warm-up. Matlab is useful for exploring ideas and prototyping programs. It is a popular programming environment for machine learning, especially for those who do not have lots of experience in C/C++ because it allows one to focus more on the problem and less on the code. It has significant drawbacks for writing large programs or when performance is critical. It is also a "product" which needs to be purchased and installed before it can be used. Nonetheless, it is a useful tool that machine learning students should be exposed to. Some future assignments may either require Matlab, or be optionally done in Matlab.

U. Arizona has a site license for Matlab and it is already installed on many machines around campus, or could be installed (e.g., onto grad student office workstations). In addition, Matlab is available for personal use to UA faculty, staff and students. This will be the most convenient option for many.

Specifically, Matlab can be downloaded from http://sitelicense.arizona.edu/matlab/ by any University of Arizona faculty, staff or student with a netid. The site-license includes MATLAB, Simulink, plus 48 toolboxes.

If none of the above options for using Matlab are workable for you, let the instructor know ASAP.

In the following, a number of parts do not have "deliverables"; they are just to help you learn the tool. As will be the convention in this course, non-graded questions will be flagged with "@".

Important additional information

You must consult ISTA-410-510-assignment-instructions.pdf for detailed instructions for preparing assignments for this course.

The first five questions are optional and will not be graded. They are recommended if you are not familiar with Matlab. In the subsequent twelve there are six questions required for undergraduates, and four more that are required for grad students (*). There are also two more extra problems (**). See the document above for information about substitutions and bonus marks. In summary, undergraduates need to hand in six problems, and grad students ten.

Become familiar with Matlab (@)
Read this short Matlab tutorial and be aware of this longer Matlab primer . The complete documentation for Matlab is also available on the web. All Matlab commands are well documented with Matlab's help system. For example, if you want to know how the svd function works, type help svd. You can also get help on all the built in operators and even the language itself. Typing help gives a list of help topics.
Reading and Displaying Matrices (@)
Download the matrix file located here to your working directory and load it into a variable using load. Type help load to find out how to do this.
Tip: You may want to turn on paging (more on) for reading help pages.

Tip: Make sure you put a semicolon at the end of your load command. The semicolon at the end of a line prevents Matlab from displaying the result of an expression. Most of the time, you want the semicolon.

Get the width and height of the matrix using size. The size function, as is common in Matlab, can return multiple arguments.
Hint: To store both the width and height into variables in one go, try [h,w]=size(m). You could alternately do h=size(m,1) and w=size(m,2).

What data structure is used to represent the matrix? Type whos to get a list of active variables along with their types and sizes. You can see that your matrix is a 2D array of floating-point values.
What is the range of values contained in the matrix? Use the min and max functions to do this.
Hint: These commands by default operate on only one dimension of their argument. Convert your matrix into a 1D array (a vector) using the (:) notation when you pass it to min and max. Hint on hint: Do "help colon" to find out more about the ":" notation.
Use the max and min values to scale the values in your matrix so that they lie in the range [0,1].
Create a new figure using figure, and use imagesc to display the matrix as a grayscale image in it. Why does it look so strange? Type colorbar to find out. You can see that 0 maps to blue, 0.5 to green, and 1 to red. This is the default jet colormap that is useful for data visualization. For this example, we might prefer the grayscale colormap. Type colormap(gray) to do this.
Tip: Type help gray to see what default colormaps are available. Note that you can also make your own colormaps.

Tip: The imagesc command takes a second argument that lets you specify the range of values. By default, imagesc scales the data to use the full colormap, but this is not always what you want. In this example, we could add the argument [0 1] to the imagesc command so that the gray values are displayed faithfully.

The image is probably distorted, i.e. the pixels aren't square. Use the axis command to fix this.
Tip: When you display matrices as images, you usually want the axes to be scaled so that the pixels are square and the image not distorted. See help axis to determine how to do this. You can also turn the display of the axes on and off with the axis command.

Make a grayscale image that is the same size as the one we have been using, except create the values to be uniformly random (see the rand() function). Does it look like it what you expected? Create a second such image. Does it look much different?
Write the matrix to a file called random.txt using the save function and the -ascii flag, and confirm 1) that you can read the result by opening the file in a text editor, and 2) that you can load it back into another variable and display it to get the same image as before.
Files and Paths (@)
The interactivity of Matlab is great for debugging and experimenting, but usually one wants to type code into a file. Create a file hw1.m and put the commands from the previous task (reading the matrix, scaling, etc). If you just typed them, they should be around for cut and pasting. Now type hw1 at the Matlab prompt to execute the commands in the file. A file of this sort is known as a script.
Tip: Matlab has pwd, ls, and cd commands like the shell.

Tip: If your script is in some other directory than pwd, then you can add that other directory to Matlab's search path with the addpath command. The current directory is in Matlab's search path by default.
Functions (@)
You can define a new Matlab function by simply putting code in a file (as we have just done) and placing a function declaration at the top. Write the following as the first line of your hw1.m file:
function [num_points] = hw1(infile, outfile)

Tip: The name of the dot-m file (without the dot-m) and the name of the function should always match. A file can export only one function, but the file may contain internal functions.

Modify your code so that the load and save commands use the infile and outfile variables instead of hard-coded filenames. Also, make it so that every question that you implement adds its number of points to a variable num_points so that it gets returned by the function. If the hw1.m file is in your pwd or in your Matlab path, then you can execute this function by typing hw1('in_matrix.mat','out_matrix.mat'), or my_points=hw1('in_matrix.mat','out_matrix.mat') if you want the number of points returned in a variable.
Tip: If you want to return more than one variable from a function, add it to the list in square brackets in the function's declaration. Simply setting these variables inside the function will cause them to be returned.

Tip: If you want a Matlab function to modify a variable, then you have to duplicate it in the input and output lists. Matlab behaves like a functional language in this regard.

Tip: Functions can also have optional input and output arguments. To determine the actual number of input and output arguments a function was called with, look at the special variables nargin and nargout.
Documenting Functions (@)
When you create a new function, it should always be documented so that help returns something informative. The convention in Matlab is to place the help message in comments after the function declaration (the comment character is "%"). For an example, you can look at some of the code in the Matlab library. The jet command that we used above to set the colormap exists as some .m file on the system. Type which jet to locate it, and then use the type command to view the source code. You see that the first line of the comment contains a one-line description of the command. All subsequent contiguous comment lines are included in the help message. Document your hw1.m function in this style.
Random numbers
Use the Matlab function rng(0) to set the random number generator seed to 0, and use randi() to produce 1000 throws of two (6-sided) die. Use the result to estimate the probability of double sixes. Report what you did and the result ($). Now run your code 9 more times without touching the random number seed. Report the results, and comment how many times you got the same estimate as the first time ($). Finally, set the seed to 0 a second time, and report whether you get the same result as the first time ($). Explain why it is often important to have random number sequences that are not really random, and can be controlled ($).

Hint: You might find use for the function find (but there are lots of ways to do this).
Random images
In an optional question above, you were asked to generate random grayscale images of the same size as the tiger image which had size (236, 364). Suppose that you are allowed 8 bits per image which is the format of the original tiger image. Can you estimate how many times on avarage you would have to sample images in this way to get the exact tiger image? ($).
Random images (II) (*)
Repeat the previous question, but for the generation of a an image that is similar enough to the tiger image that a person would probably assume that the images are basically the same. You will have to make some reasonable but somewhat arbitrary assumptions here. There is no well defined answer here, but a good answser will state assumptions clearly, and in such a way that they could be tweaked and tested ($).
Random images (III) (**)
Repeat the previous question, but for the generation of an image that a human would declare is an image of something in the world, rather than the result of someone playing with Matlab. Again, we are more interested in the analysis than the result (which is not known). Some creativity will be needed to approach this problem ($).
Plotting
Explore the plot command. Plot the sin function over the domain -pi:pi. Use the linspace command to define the domain x and then do plot(x,sin(x)). Use the hold on command to plot another function on the same graph, such as cos. Use a different color, e.g. plot(x,cos(x),'r'). The running of hw1.m should produce a plot along these lines. Put the plot into your PDF with an informative caption. ($)
Use the hist command to plot a histogram of the element values of the same matrix used in the optional question on reading and displaying matrices, also linked here. Use the h=hist(y,x) form so that you can control the bin centers and so that you can plot the result using plot(x,h). Use 20 bins ($). Put the plot into your PDF with an informative caption ($).
Tip: The hist() function with a matrix argument does not quite do what you expect (check the documentation). If you need to convert a matrix to a vector, you might find the function reshape helpful.

Further develop (or show off) your knowledge of Matlab by making the same plot, but now with bar heights normalized so that they sum to 1 ($), and a second plot where the bar heights are set to log of one plus the original (non-normalized) bin counts ($) (be sure you know why we would use one plus log() and not just plain log()). Put the plots into your PDF with an informative captions ($).
Integration
In Matlab, we approximate continuous functions by sampling them at frequent intervals, and storing the values a vector. Derivatives and integrals can then be approximated by finite differences and Reimann sums, respectively. In this class, for example, we might use integrals to evaluate the probability of an event under a continuous probability distribution.
Assume the heights of adult men follow a normal distribution with mean 70 inches and standard deviation of two inches. Let's evaluate the probability of a man having a height between 68 and 80 inches.
The normal distribution function is defined as:
f(x) = 1/(sqrt(2*pi) sigma) e^(-(x - mu)^2 / (2 sigma^2)
Here, sigma is 2 inches, and mu is 70 inches.
Begin by creating a vector of x values, evenly spaced over the range [68, 80] at increments of 0.1 inches.
Next, compute a y vector, containing the the values of f(x).
Plot the result using plot(x,y), which should look like part of a bell-curve ($). This is the region of the function we will be integrating. Notice that the top of the bell isn't smooth, but instead comes to a point. This is because we're discretizing a continuous function by sampling it at intervals of 0.1. Decreasing the interval between x-values will make the function appear smoother, but will require more computing time to operate on.
Now compute the Reimann integral, using the y values as the rectangle heights, and the delta-x (0.1 inches) as the rectangle width. This is equivalent to summing the values of y and multiplying by delta-x. The result is an approximate integral which represents the probability of a man being between 68 and 80 inches. Report your estimate ($).
Now compute the integral over a much wider range of values (say [20,120]). State in your writeup what you expect the value to roughly be ($). Since you have a good idea what the result should be, you can now experiment with changing the value for delta-x and observe the change in the result. Try this for some values of delta-x that both larger, equal to, and smaller than the one just used. Plot the error of the estimate verses delta-x (or perhaps log(delta-x) if that is more informative) ($).
Playing with Linear Algebra
Matlab is a great tool to for experimenting with linear algebra.
Use the fact that inv() inverts a matrix to solve for X=(x,y,z):
```
   3*x + 4*y +   z = 9
   2*x -   y + 2*z = 8
     x +   y -   z = 0 
```
Verify that your "solution" works and provide the output that shows it ($). Make sure that you give the answer in what you hand in ($).
Playing with Linear Algebra (II) (*)
If there are more equations than unknowns, then, in the general case, "classically" the problem is over constrained and there is no solution. However, in this course, we will often be assuming that such equations are approximations and have errors due to noise or other reasons, and that an exact solution cannot be found regardless. Thus we will want to find the "best" solution. This is known as solving the equations in the least squares sense. The solution for AX=b, where A has more rows than columns, is given by X=inv(A'*A)A'b, where inv(A'A)A' is known as the Moore-Penrose inverse of A. Use this to solve for X=(x,y,z) in:
```
   3.0*x + 4.0*y +  1.0*z = 9
   3.1*x + 2.9*y +  0.9*z = 9
   2.0*x - 1.0*y +  2.0*z = 8
   2.1*x - 1.1*y +  2.0*z = 8
   1.0*x + 1.0*y -  1.0*z = 0 
   1.1*x + 1.0*y -  0.9*z = 0 
```
and report the result ($). Also report the the magnitude of the error vector, AX - b ($).
Playing with Linear Algebra (III) (*)
Recall that an eigenvector of a matrix A is a vector v, so that Av=kv, for some scalar constant k. If A is real and symmetric, then A has real eigenvalues and eigenvectors. Note that for a random matrix, R, R*R' is symmetric. (Try it!). The Matlab function eig() gives you eigenvectors and eigenvalues. Use these hints to give the TA a 3 by 3 matrix A, and a vector v, so that Av=kv, where k is a constant ($). In particular, the TA needs to see the output of Av./v which should be a vector of three elements that are all the same ($). Note that you will have to use the form of eig() that gives you two outputs (the default is just to give you one---you need to look at the documentation).

PCA
The data matrix pointed to here has two columns for X values and Y values, respectively. Read them in, and provide a plot ($). Compute the covariance matrix for the data and report it ($). Now consider the possibility that we can shift the data, as well as rotate it about the origin. Based on your plot, describe a more natural coordinate system for this data ($). Under such a new coordinate system, what would you expect the covariance matrix to be like (provide an "eyeball" estimate the elements, and explain) ($).
PCA II (*)
To map the data to a space which might be the one you specified in the previous question, first zero center the data so that the mean of both X and Y is zero. Second rotate the data by applying the appropriate orthogonal transformation, which will have the eigenvectors of the covariance matrix as columns. See the function eig to find that matrix. Provide the matrix and the computations which verify that it is orthogonal ($). Transform the data and provide a plot of it, using the same scale on the X and Y axes ($). Also, recompute the covariance matrix for the transformed data and provide the matrix ($). Does it make sense given your previous answers? ($). Note that it is easy to get the appropriate matrix transformation and its transpose mixed up, so if you are having trouble, try it the other way.
PCA II (**)
Assume that (X,Y) data is zero centered, and consider orthogonal transformations that can be applied to it. Derive a transformation that makes the variance of the first component maximum, and makes the covariance between the first component and others is zero. The previous problem should give you a hint about the form of the solution you are after (although note that Matlab orders eigenvalues small to big, rather than big to small as is more usual, and corresponds more cleanly to how I have phrased the problem) .