Spring 2011 - ISTA 410/510: Bayesian Modeling and Inference

Assignment One (Optional, Not Graded)
Due: Since this will not be graded, there is no due date. However, I suggest a target date of Wednesday, January 26 because the second assignment is looming.

(Much debt is owed to David Martin for parts of this assignment)

The purpose of this assignment is to become familiar with Matlab. Matlab is useful for exploring ideas and prototyping programs. It is a popular programming environment for computer vision, especially for those who do not have lots of experience in C/C++ because it allows one to focus more on the problem and less on the code. It has significant drawbacks for writing large programs or when performance is critical. It is also a "product" which needs to be purchased and installed before it can be used. Nonetheless, it is a useful tool which computer vision students should be exposed to. Some future assignments may either require Matlab, or be optionally done in Matlab.

Matlab is installed on the UA CS Linux machines (the path is /usr/local/bin). Normally, to begin, you just type "matlab" at the shell prompt. For example, you may want to use the machines gr01 through gr08 in GS 920, either on-site or remotely (remote addresses: gr01.cs.arizona.edu, gr02.cs.arizona.edu, etc.). If your DISPLAY environment variable is properly set, the default behavior is to start the GUI. Otherwise, you will get the command line interface. The GUI is required for much of this assignment. You may find that the GUI is slow to start up, and painfully slow to use over a slow connection. For a faster startup (but without some of the features needed for some parts of this assignment), you can try

        matlab -nodisplay -nojvm

Note that in the following, a number of parts do not have "deliverables"; they are just to help you learn the tool.

Matlab is available for personal use to UA faculty, staff and students. It can be downloaded from http://sitelicense.arizona.edu/matlab/ by any University of Arizona faculty, staff or student with a netid. The site-license includes MATLAB, Simulink, plus 48 toolboxes. This license is available for one year. At the end of the year term, partners will assess continuance of the project.

IMPORTANT NOTE. Even if you develop programs on some other machine, they must work on our reference machines, gr01 through gr08.

Become familiar with Matlab.
Read this short Matlab tutorial and be aware of this longer Matlab primer. The complete documentation for Matlab is also available on the web. All Matlab commands are well documented in Matlab's help system. For example, if you want to know how the svd function works, type help svd. You can also get help on all the built-in operators and even the language itself. Typing help gives a list of help topics.
Reading and Displaying Matrices
Download the matrix file located here to your working directory and load it into a variable using load. Type help load to find out how to do this.
Tip: You may want to turn on paging (more on) for reading help pages.

Tip: Make sure you put a semicolon at the end of your load command. The semicolon at the end of a line prevents Matlab from displaying the result of an expression. Most of the time, you want the semicolon.

What data structure is used to represent the matrix? Type whos to get a list of active variables along with their types and sizes. You can see that your matrix is a 2D array of floating-point values.
What is the range of values contained in the matrix? Use the min and max functions to do this.
Hint: These commands by default operate on only one dimension of their argument. Convert your matrix into a 1D array (a vector) using the (:) notation when you pass it to min and max.
Use the max and min values to scale the values in your matrix so that they lie in the range [0,1].
Create a new figure using figure(1), and use imagesc to display the matrix as a grayscale image in it. Why does it look so strange?! Type colorbar to find out. You can see that 0 maps to blue, 0.5 to green, and 1 to red. This is the default jet colormap that is useful for data visualization. For this example, we might prefer the grayscale colormap. Type colormap(gray) to do this.
Tip: Type help gray to see what default colormaps are available. Note that you can also make your own colormaps.

Tip: The imagesc command takes a second argument that lets you specify the range of values. By default, imagesc scales the data to use the full colormap, but this is not always what you want. In this example, we could add the argument [0 1] to the imagesc command so that the gray values are displayed faithfully.

The image is probably distorted, i.e. the pixels aren't square. Use the axis command to fix this.
Tip: When you display matrices as images, you usually want the axes to be scaled so that the pixels are square and the image not distorted. See help axis to determine how to do this. The axis image command is a useful one. You can also turn the display of the axes on and off with the axis command.
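The scaling step above can be sketched as follows. The small matrix m here is hypothetical stand-in data, not the course file, and the sketch assumes the matrix is not constant (otherwise the range is zero).

```matlab
% Stand-in for the downloaded matrix (hypothetical data, not the course file).
m = [-3  0  4;
      1  9 -1];

lo = min(m(:));   % (:) flattens the matrix so min sees every element
hi = max(m(:));

% Shift so the minimum becomes 0, then divide by the range so the maximum becomes 1.
m01 = (m - lo) / (hi - lo);
```

To view the result, one option is imagesc(m01, [0 1]) followed by colormap(gray) and axis image, as described in the tips above.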
Files and Paths
The interactivity of Matlab is great for debugging and experimenting, but usually one wants to type code into a file. Create a file hw1.m and put the commands from the previous task (reading the matrix, scaling, etc). Now type hw1 at the Matlab prompt to execute the commands in the file. A file of this sort is known as a script.
Tip: Matlab has pwd, ls, and cd commands like the shell.

Tip: If your script is in some other directory than pwd, then you can add that other directory to Matlab's search path with the addpath command. The current directory is in Matlab's search path by default.
Manipulating Matrices
In this section, we will work with the matrix that you should have displayed in figure(1).
Tip: Array indices in Matlab start at 1, not at 0.

Tip: As in standard mathematical notation, the first index of a matrix is the row, and the second index the column. When viewing a matrix as an image, this means that the first index is the y direction, and the second the x direction. As is usual with images, the origin is at the top left corner, the positive x axis points right, and the positive y axis points down.

Get the width and height of the matrix using size. The size function, as is common in Matlab, can return multiple arguments.
Hint: To store both the width and height into variables in one go, try [h,w]=size(m). You could alternately do h=size(m,1) and w=size(m,2).

Write a couple of nested for loops to set every 5th element, horizontally and vertically, to 1. When you visualize this matrix with imagesc, it should set 1/25th of the elements to "white" in a square lattice pattern.
Hint: Use the colon operator to define the limits of the for loops. See help colon and help for to see how to do this. Specifically, you want the minval:interval:maxval form.

Set all the same elements to 0, making the ones that you just set to white become black. Do it this time without using any for-loops.
Hint: You can index arrays in Matlab with vectors as well as with scalars, so m(Y,X)=0 will set multiple entries of matrix m to zero when either X or Y are vectors, such as the vectors returned by the colon operator.
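A minimal sketch of both versions follows, using a hypothetical all-zero matrix so the effect is easy to count. It assumes the lattice starts at index 1; starting at 5 would be an equally reasonable reading of "every 5th element".

```matlab
m = zeros(12, 17);          % hypothetical stand-in matrix
[h, w] = size(m);           % h = number of rows, w = number of columns

% Nested loops: visit every 5th row and every 5th column.
for r = 1:5:h
    for c = 1:5:w
        m(r, c) = 1;
    end
end
numSet = sum(m(:));         % number of lattice points that were set

% Same lattice without loops: index with two colon-generated vectors.
m(1:5:h, 1:5:w) = 0;
```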

Set all the elements whose values are greater than 0.9 to zero. The command to do this is m(find(m>0.9))=0. It would be good for you to understand why this works, but if this is just too confusing, move on for now.
Hint: The (m>0.9) expression evaluates to a boolean matrix, which is true where the condition holds. The find function returns a vector containing the indices of the true values. This vector is then used to index the matrix and set all the values to zero. Why does this last step work when the matrix is 2D and the indexing is 1D? Matlab lets you treat matrices as 1D vectors too, linearizing the matrix in column-major order.
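To see the column-major linearization at work, here is a small sketch on a hypothetical 2-by-3 matrix. The intermediate names mask and idx are introduced only for illustration.

```matlab
m = [0.10 0.95 0.30;
     0.92 0.50 0.99];       % hypothetical 2-by-3 example

mask = (m > 0.9);           % logical matrix, true where the condition holds
idx  = find(mask);          % linear indices, counted down each column in turn

m(idx) = 0;                 % same effect as m(find(m>0.9)) = 0
```

Here idx comes out as [2; 3; 6], because the entries are numbered down the first column, then the second, then the third. Indexing with the logical matrix directly, m(mask) = 0, gives the same result.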

Tip: Matlab makes it easy to write "vectorized" expressions without having to write for loops or if statements. For example, this will add all the values of the matrix:
sum(m(:))
The following will count the number of values greater than 0.9:
numel(find(m>0.9))
So will this:
sum(sum(m>0.9))
This will halve only those values greater than 0.9 (note the use of the .* operator to do element-wise multiplication of matrices):
m = m - 0.5*m.*(m>0.9);
And so will this:
mask = (m>0.9); m = m.*~mask + m.*mask*0.5;
This will set 100 unique random elements to zero:
p = randperm(numel(m)); m(p(1:100)) = 0;
See help elmat for a list of interesting matrix manipulation and creation routines.

Extract a rectangular region of the matrix into another variable. Use notation like m2 = m(rowmin:rowmax,colmin:colmax) to do this (rows are the first index, columns the second). Use the getrect function to visually select the rectangle you want from figure(1). Create a new figure with figure(2), and then display the extracted region in that figure using imagesc. Fix the axes so that the pixels are square.
Tip: getrect returns floating-point values that you must convert to integers before using them as indices. Use the round function for this.

Tip: In addition to getrect there are also getline and getpts functions. Also, the ginput function is an extremely useful function to know. It allows you to retrieve mouse clicks and key presses from a figure window.
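The conversion from a selected rectangle to index ranges might look like the sketch below. Since getrect is interactive, the rectangle is hard-coded here in the [xmin ymin width height] layout that getrect returns; the matrix and the clamping step are assumptions added for illustration.

```matlab
m = magic(8);                       % hypothetical stand-in matrix

% Hard-coded stand-in for the value getrect would return:
% [xmin ymin width height], where x runs along columns and y along rows.
rect = [2.3 1.6 3.4 2.2];

rect   = round(rect);               % indices must be integers
colmin = rect(1);  colmax = rect(1) + rect(3);
rowmin = rect(2);  rowmax = rect(2) + rect(4);

% Clamp to the matrix bounds in case the selection ran off the image.
[h, w] = size(m);
rowmin = max(rowmin, 1);  rowmax = min(rowmax, h);
colmin = max(colmin, 1);  colmax = min(colmax, w);

m2 = m(rowmin:rowmax, colmin:colmax);   % rows first, then columns
```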
Writing Matrices
The save function allows you to write matrices to a file. If you use the -ascii flag, the output file will be readable in a text-editor.

Write the matrix to a file called out.mat using the save function and the -ascii flag, and confirm the result by opening the file in a text editor.
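A round trip through a text file can be sketched as follows. The sketch writes to a temporary file with a .txt extension, an assumption made so that a bare load call reads it as text; a file named out.mat that holds ASCII data may need load's -ascii option instead.

```matlab
m = [0.25 0.5; 0.75 1.0];            % hypothetical small matrix

fname = [tempname() '.txt'];         % temporary file name, to avoid clobbering anything
save(fname, 'm', '-ascii');          % plain-text output, one matrix row per line

m_back = load(fname);                % read the text file back into a matrix
delete(fname);                       % clean up the temporary file
```

The -ascii format stores a limited number of digits, so the values read back should be compared with a tolerance rather than tested for exact equality.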
Functions
Go back to your hw1.m file and add the code to set every 5th horizontal/vertical element to 0, extract a user-selected matrix region using getrect from figure(1), display the new matrix in figure(2), and write the matrix to a file.
You can define a new Matlab function by simply putting code in a file (as we have just done) and placing a function declaration at the top. Write the following as the first line of your hw1.m file:
function [matregion] = hw1(infile,outfile)

Tip: The name of the file and the name of the function should always match. A file can export only one function, but the file may contain internal functions.

Modify your code so that the load and save commands use the infile and outfile variables instead of hard-coded filenames. Also make sure that the extracted matrix region is stored in a variable called matregion so that it gets returned by the function. If the hw1.m file is in your pwd or in your Matlab path, then you can execute this function by typing hw1('in_matrix.mat','out_submatrix.mat'), or submatrix=hw1('in_matrix.mat','out_submatrix.mat') if you want the submatrix returned in a variable.
Tip: If you want to return more than one variable from a function, add it to the list in square brackets in the function's declaration. Simply setting these variables inside the function will cause them to be returned.

Tip: If you want a Matlab function to modify a variable, then you have to duplicate it in the input and output lists. Matlab behaves like a functional language in this regard.

Tip: Functions can also have optional input and output arguments. To determine the actual number of input and output arguments a function was called with, look at the special variables nargin and nargout.
Plotting
Explore the plot command. Plot the sin function over the domain from -pi to pi. Use the linspace command to define the domain x and then do plot(x,sin(x)). Use the hold on command to plot another function on the same graph, such as cos. Use a different color, e.g. plot(x,cos(x),'r'). Running hw1.m should produce a plot along these lines.
Use the hist command to plot a histogram of the element values in the matrix you read in step 2. Use the h=hist(y,x) form so that you can control the bin centers and so that you can plot the result using plot(x,h). You may want to plot the log of the bin counts.
Your program should now (three times consecutively) prompt the user to select rectangular regions from the matrix (figure 1) for which you will plot a histogram. It will be instructive to compare the histograms of different parts of the matrix, and thus previous plots should stay visible while additional ones are added.
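The h=hist(y,x) form can be sketched without any plotting, since hist returns the counts instead of drawing when an output is requested. The matrix, the choice of ten bins, and the +1 inside the logarithm are all assumptions made for this illustration.

```matlab
m = reshape((0:99) / 99, 10, 10);   % hypothetical matrix with values in [0,1]

centers = 0.05:0.1:0.95;            % ten bin centers covering [0,1]
counts  = hist(m(:), centers);      % counts per bin; nothing is drawn here

logCounts = log(counts + 1);        % +1 avoids log(0) for empty bins
```

Calling plot(centers, counts) or plot(centers, logCounts) then draws the histogram as a curve, and hold on keeps earlier curves visible when comparing several regions.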
Integration
In Matlab, we approximate continuous functions by sampling them at frequent intervals and storing the values in a vector. Derivatives and integrals can then be approximated by finite differences and Riemann sums, respectively. In this class, we will use integrals to (for example) evaluate the probability of an event under a continuous probability distribution.
Assume the heights of adult men follow a normal distribution with mean 70 inches and standard deviation of two inches. Let's evaluate the probability of a man having a height between 68 and 80 inches.
The normal distribution function is defined as:
f(x) = 1/(sigma*sqrt(2*pi)) * exp(-(x - mu)^2 / (2*sigma^2))
Here, sigma is 2 inches, and mu is 70 inches.
Begin by creating a vector of x values, evenly spaced over the range [68, 80] at increments of 0.1 inches.
Next, compute a y vector containing the values of f(x).
Plot the result using plot(x,y), which should look like part of a bell curve. This is the region of the function we will be integrating. Notice that the top of the bell isn't smooth, but instead comes to a point. This is because we're discretizing a continuous function by sampling it at intervals of 0.1. Decreasing the interval between x-values will make the function appear smoother, but will require more computing time to operate on.
Now compute the Riemann sum, using the y values as the rectangle heights and the delta-x (0.1 inches) as the rectangle width. This is equivalent to summing the values of y and multiplying by delta-x. The result is an approximate integral which represents the probability of a man being between 68 and 80 inches tall.
Try changing the value for delta-x and observe the change in the result. In practice, it is important to understand the level of precision required for a task; increasing the rate at which you sample the continuous function (i.e. decreasing delta-x) will decrease discretization error at the expense of increased computation time.
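The effect of delta-x can be sketched on a different, hypothetical example: the standard normal distribution (mu = 0, sigma = 1) integrated over [-1, 1]. The reference value computed from erf is included only for comparison, and the three step sizes are arbitrary choices.

```matlab
mu = 0; sigma = 1;                 % standard normal (hypothetical example values)
a = -1; b = 1;                     % integrate over [a, b]

% Reference value from the error function, for comparison only.
pRef = 0.5 * (erf((b - mu) / (sigma*sqrt(2))) - erf((a - mu) / (sigma*sqrt(2))));

steps = [0.1 0.01 0.001];          % candidate values of delta-x
p = zeros(size(steps));
for i = 1:numel(steps)
    dx = steps(i);
    x  = a:dx:b;
    y  = 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu).^2 / (2 * sigma^2));
    p(i) = sum(y) * dx;            % rectangle heights times the common width
end
err = abs(p - pRef);               % discretization error for each step size
```

In this sketch the error drops roughly in proportion to delta-x, while the number of samples grows by the same factor.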
Playing with Linear Algebra
Matlab is a great tool for experimenting with linear algebra.
Use the fact that inv() inverts a matrix to solve for X=(x,y,z):
```
   3*x + 4*y +   z = 9
   2*x -   y + 2*z = 8
     x +   y -   z = 0 
```
Verify that your "solution" works. Make sure that you give the answer in what you hand in.
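As an illustration of the pattern, here is a sketch on a smaller, hypothetical 2-by-2 system (deliberately not the one in the exercise). The variable residual is a name introduced here for the verification step.

```matlab
% Hypothetical 2-by-2 system (not the one in the exercise):
%   2*x + y = 5
%     x - y = 1
A = [2  1;
     1 -1];
b = [5; 1];

X = inv(A) * b;        % solve A*X = b via the inverse
residual = A * X - b;  % should be zero up to rounding error
```

For this system X is (2, 1). Outside of exercises like this one, Matlab's backslash operator, X = A\b, is the more common way to solve such a system.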

If there are more equations than unknowns, then, in the general case, "classically" the problem is overconstrained and there is no solution. However, in this course, we will often be assuming that such equations are approximations and have errors due to noise or other reasons, and that an exact solution cannot be found regardless. Thus we will want to find the "best" solution. This is known as solving the equations in the least squares sense. The solution for AX=b, where A has more rows than columns (and full column rank), is given by X=inv(A'*A)*A'*b, where inv(A'*A)*A' is known as the Moore-Penrose inverse of A. Use this to solve for X=(x,y,z) in:
```
   3.0*x + 4.0*y +  1.0*z = 9
   3.1*x + 2.9*y +  0.9*z = 9
   2.0*x - 1.0*y +  2.0*z = 8
   2.1*x - 1.1*y +  2.0*z = 8
   1.0*x + 1.0*y -  1.0*z = 0 
   1.1*x + 1.0*y -  0.9*z = 0 
```
What is the magnitude of the error vector, AX - b?
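The least squares recipe can be sketched on a small, hypothetical overdetermined system: three equations in two unknowns, unrelated to the exercise above. The names r and errNorm are introduced here for illustration.

```matlab
% Hypothetical overdetermined system: 3 equations, 2 unknowns.
A = [1 0;
     1 1;
     1 2];
b = [1; 2; 2];

X = inv(A' * A) * A' * b;   % least squares solution via the formula above

r = A * X - b;              % error vector
errNorm = norm(r);          % its magnitude (Euclidean length)
```

Here X works out to (7/6, 1/2). The error vector is not zero, but it is orthogonal to the columns of A, which is what makes this the "best" solution in the least squares sense.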
Recall that an eigenvector of a matrix A is a nonzero vector v such that Av=kv for some scalar constant k. If A is real and symmetric, then A has real eigenvalues and eigenvectors. Note that for a random matrix R, R*R' is symmetric. (Try it!) The Matlab function eig() gives you eigenvectors and eigenvalues. Use these hints to give the TA a 3 by 3 matrix A and a vector v so that Av=kv, where k is a constant. In particular, the TA needs to see the output of (A*v)./v, which should be a vector of three elements that are all the same. Note that you will have to use the form of eig() that gives you two outputs (the default is to give you just one; you need to look at the documentation).
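The two-output form of eig can be sketched on a hypothetical symmetric 2-by-2 matrix, smaller than the 3 by 3 matrix asked for above.

```matlab
A = [2 1;
     1 2];               % hypothetical symmetric 2-by-2 matrix

[V, D] = eig(A);         % columns of V are eigenvectors; diag(D) holds the eigenvalues

v = V(:, 2);             % eigenvector paired with the eigenvalue D(2,2)
k = D(2, 2);
ratio = (A * v) ./ v;    % every entry should equal k, up to rounding
```

The element-wise ratio is only a meaningful check when no entry of v is zero or nearly zero; comparing A*v against k*v directly avoids that problem.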
Documenting Functions
When you create a new function, it should always be documented so that help returns something informative. The convention in Matlab is to place the help message in comments after the function declaration (the comment character is "%"). For an example, you can look at some of the code in the Matlab library. The jet command that we used above to set the colormap exists as some .m file on the system. Type which jet to locate it, and then use the type command to view the source code. You see that the first line of the comment contains a one-line description of the command. All subsequent contiguous comment lines are included in the help message. Document your hw1.m function in this style.

What to Hand In

Nothing. This is an optional assignment.