Spring 2004 - CS477/577 - An Introduction to Computer Vision

Assignment Five

Due: Thursday, April 8, 2004 (4:00 AM on April 9 for night owls).

Credit (U-grad): Approximately 6 points (Relative, and very rough absolute weighting)
Credit (Grad): Approximately 6 points (Relative, and very rough absolute weighting)

Like assignment 4, this assignment is meant to help you get familiar with some useful tools and ideas. If you are having trouble, please see Scott rather than beat your head against a wall. (Expend your creative energy on the project instead).

You can do this assignment in either Matlab or C/C++ using whatever libraries you like.

For those wanting to use the KJB library, I have updated the library with a few things to make this assignment a bit smoother. (Links to the example directory , the example program example.c , and the compile_line . (Those that are using the library more seriously in the long run should, at some point, talk to me regarding a more sophisticated way to use the library which is necessary if you are wanting to modify it). The files matrix.txt and image.tiff can be used as data for the example program.

If you are using the KJB library, the following routines might prove useful (man pages should be available, although the quality may not be high!):

complex_multiply_matrices_ew
complex_divide_matrices_ew
complex_get_magnitude_matrix_ew
get_matrix_dft
get_matrix_inverse_dft
set_wrap_fftw_style
Matlab users: Some functions which may prove helpful are:
fft2
ifft2

What to hand in: : Your source code, and executable (if C), a README answering the few questions asked as well as any other comments about your work which are relevant, and the result images requested for one input image. Also provide copies of any input images not supplied here (including derived images if you chose to crop one of the ones here).

To hand in the above, use the turnin program available on lectura (turnin key is cs477_hw5).


  1. Find the magnitude of the complex valued Fourier transform of this black and white image . Depending on the normalization settings (see below) and other factors, you may need to scale your result so that it looks reasonable. Hand this in as 1.jpeg

    I recommend understanding the following two points.

    Your result may look like an inside out version the images shown in class. This is essentially a by-product of how the discrete version of the transform is defined. It is common and reasonable to start frequencies at zero at one corner, and go to 1 at the opposite corner. However, f = 1-e, has the same effect as f = -e (recall that there is a factor of 2*pi in there). So the max frequency is in the middle, and the "negative" frequencies are then beyond the middle in reverse order. Clearly one could display the result differently and get pictures with 0 in the middle and negative and positive frequencies on either side. If you do it this way, than that is fine, also.

    Normalization is an issue because different authors and different software defines the Fourier transform and its inverse with different factors. A relatively common way (this is the default for the fftw library used by the KJB library) is to compute unnormalized transforms. If you do this, then taking the transform, followed by its inverse requires division by M*N (M=num_rows, N=num_cols) to get what you started with. Matlab computes the same thing in the forward direction, and then does the division by M*N for you on the inversion. I find it is somewhat easier for debugging to divide by sqrt(M*N) in both directions. This is one way to do normalized transforms. In order to simplify matters, I have added a routine to the KJB library set_wrap_fftw_style() where you can request the default from the underlying library, match Matlab, or normalize as described above. When I did some of this assignment to test things, I used the normalize "style".

    Regardless of what normalization you are using, you will have to occasionally multiply/divide by MN or sqrt(MN). Once you combine results in frequency space, the count of the number of normalizations required can be confusing. If you use the real versions (cosine transform), you will further sometimes be out by a factor of 2 or 4.

  2. Now do the same for this image . Hand this in as 2.jpeg. How are the results different? Does the difference make sense (answer in your README).

  3. Use your code from the previous assignment to blur this input image with a Gaussian mask of size 4 pixels (you may already have this result). Now use the convolution theorem (see below) to do the same thing and check that the results are approximately the same. (You may need to be careful defining your convolution mask, and dealing with normalization as described above). Hand this in as 3.jpeg. Compare the results. Do they make sense?


    In case you have forgotten the convolution theorem

    Let X be an matrix of complex numbers. That is X = A + iB, where A and B are real matrices. Let @ represent convolution, and use FFT(X) for the Fourier transform of X, and use IFFT(X) for the inverse Fourier transform of X. On a good day X = IFFT(FFT(X)).

    The convolution theorem says that

        FFT(A @ B) = FFT(A) FFT(B)    
    
    This means that you can convolve using:
        A @ B = IFFT( FFT(A) FFT(B) )
    
    Now suppose that you have A @ B, and you have (or can guess at) B. To get A:
            FFT(A @ B) = FFT(A) FFT(B) 
    so
            FFT(A) = FFT(A @ B) / FFT ( B)
    so 
            A = IFFT ( FFT(A @ B) / FFT ( B) )
    
    This can be used to reverse the effects of a assumed convolution.

    Important: Unfortunately, this trick cannot yield a perfect result. First, mask frequency space energies close to zero indicate fundamentally unrecoverable information (as well as a risk of dividing by zero). Second, noise is added on top of A@B, which fouls up recovery (which is most effective if you have a good understanding of your noise). Finally, there are always edge effects. Reak success requires being sophisticated about these issues. However, for this assignment I suggest simply consider thresholding FFT(B) in some way to reduce noise amplification and to avoid dividing by zero.


  4. First make sure that you can "undo" your blur added using the Fourier transform from the above question. Since you are simply checking that you can remove what you put there, the issues in the previous paragraph are not that relevant and you can get away with little or no threshold. Now, try to undo the blur from this image , which was produced by regular convolution by a Gaussian with sigma = 3.0. Since the blur was synthetically generated, and I did not add any noise, you canget away with a relatively small threshold (check the effect of varying it).

    A superlative result is hard to get (see note above), and is not required. But do convince yourself (and your TA), that some substantive blurring has been reduced, even if it is accompanied by the introduction of other undesirable artifacts.

    Hand this in as 4.jpeg.

  5. (Optional for U-grads)

    One crude approximation of a substantively out of focus image is that it is an in focus image convolved with a disk filter (the filter is uniform on the disk, zero outside). Does this make sense? Put your explanation into your README (or, if you think that some other kind of filter is a better approximation, see if it works better).

    Try to deblur this out of focus image . Note that this image started its life as a jpeg compressed image (adds noise). Further, the disk filter is only an approximation to whatever process actually occurred in capture. Thus you may find that a relatively higher threshold is needed. There are better ways than simple thresholding. If you use a better method, explain what you did in a README file.

    To handle color, you likely want to try handling each channel individually.

    Again, a superlative result is hard to get (see note above), and is not required. But do convince yourself (and your TA), that some substantive blurring has been reduced, even if it is accompanied by the introduction of other undesirable artifacts.

    Hand this in as 5.jpeg.