Computational Intelligence

Computer Vision Assignment One (Programming meta-assignment)

Due: Tuesday, March 22 (but late assignments will be accepted because of a previous typo).

This meta-assignment is being graded on the basis of making a reasonable attempt, so no marks are taken off for incorrect answers. However, for those who want to confirm their understanding, the answers are below:

I created some confusion by being a bit sloppy in the wording of the assignment. The correct implementation of some of the possible interpretations is quite tricky. For those who are interested, I have provided my derivations of the formulas used below here.

Matlab has lots of tricks to make the expression of computations compact. In the extreme case, some students computed the answer to the first problem using the functions "mean" and "std". However, the answers below are meant to clarify what the computation actually is.
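For reference, that compact built-in route looks something like the following (std with its default normalization divides by m-1, i.e. it gives the sample standard deviation; std(x, 1) divides by m instead):

    % Column means and standard deviations using the built-in functions.

    mean_x = mean(x)
    sample_stdev_x = std(x)
    stdev_x = std(x, 1)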


Compute the mean and the sample standard deviation of each column of the matrix x.

In the original version this read "error" rather than "deviation", which, quite rightly, caused some confusion.

    % Simple method to compute the means of each column: 

    [m,n] = size(x);

    for col = 1:n
        sum_x = 0;

        for row = 1:m
            sum_x = sum_x + x(row, col);
        end

        mean_x = sum_x / m 
    end

A slightly different way, which foreshadows the weighted case below:

    % Another way

    [m,n] = size(x);

    for col = 1:n
        sum_weight = 0;
        sum_x = 0;

        for row = 1:m
            %  With every weight equal to one this reduces to the simple
            %  mean computed above; the same structure handles the
            %  weighted case later.
            %
            weight = 1; 
            sum_weight = sum_weight + weight;
            sum_x = sum_x + weight * x(row, col);
        end

        mean_x = sum_x / sum_weight 
    end

The standard deviation is the square root of the variance. The variance is the expected value of the squared deviation from the mean. When the probabilities (weights) are all the same, as in this case, that expected value is just the average of the squared deviations.

For the "sample" variance, divide by (m-1) instead of m (where m is the number of rows) to compute the averages. The reason for this correction of m/(m-1) as exposed by algebra (see the formulas) is a compensation for the expected error in the sample mean.

If you interpreted this part of the question to mean the "standard error of the mean", then you would further divide the sample standard deviation by sqrt(m) (see the formulas).
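For reference, the quantities being computed below are, for the m values x_1, ..., x_m in a column (a summary only; the derivations are in the linked formulas):

    \bar{x} = \frac{1}{m} \sum_{i=1}^{m} x_i,
    \qquad
    \hat{\sigma}^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \bar{x})^2,
    \qquad
    s^2 = \frac{1}{m-1} \sum_{i=1}^{m} (x_i - \bar{x})^2,
    \qquad
    \mathrm{SE}(\bar{x}) = \frac{s}{\sqrt{m}}.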

    % Simple method to compute the mean and sample standard deviation of each column: 

    [m,n] = size(x);

    for col = 1:n
        sum_x = 0;
        sum_dev_sqr = 0;

        % First get the mean as above. 

        for row = 1:m
            sum_x = sum_x + x(row, col);
        end

        mean_x = sum_x / m 

        % Now the variance and standard deviations

        for row = 1:m
            sum_dev_sqr = sum_dev_sqr + (x(row, col) - mean_x)^2;
        end

        var_x = sum_dev_sqr / m ;
        sample_var_x = sum_dev_sqr / (m - 1);

        stdev_x = sqrt(var_x)
        sample_stdev_x = sqrt(sample_var_x)
        error_of_mean = sample_stdev_x / sqrt(m)
    end

Allowing for my sloppy wording, you will likely find your answers somewhere in here:

    mean           0.3772  0.4254  0.4567
    stdev          0.2612  0.2483  0.3240
    sample stdev   0.2680  0.2547  0.3324
    error of mean  0.0599  0.0570  0.0743 

We can compute the answer using very little memory and only one pass over the data by keeping a running tally of the number of points and the sufficient statistics: the sum and the sum of squares. These can be put into a relatively simple formula for the mean (obvious) and the sample stdev, as sketched below.
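A minimal sketch of that one-pass version (the variable names are my own, not part of the assignment); it rewrites the sum of squared deviations in terms of the sum and the sum of squares:

    % One-pass mean and sample standard deviation of each column, keeping
    % only the running count, sum, and sum of squares.

    [m,n] = size(x);

    for col = 1:n
        count = 0;
        sum_x = 0;
        sum_x_sqr = 0;

        for row = 1:m
            count = count + 1;
            sum_x = sum_x + x(row, col);
            sum_x_sqr = sum_x_sqr + x(row, col)^2;
        end

        mean_x = sum_x / count 

        % Sum of squared deviations from the sufficient statistics:
        %     sum((x - mean)^2) = sum(x^2) - (sum(x))^2 / count
        sum_dev_sqr = sum_x_sqr - sum_x^2 / count;

        sample_stdev_x = sqrt(sum_dev_sqr / (count - 1))
    end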


Computing weighted mean and standard deviation

Again, sloppy wording in the original led to some confusion. The following computes all the possible variants that I can think of.

    % Simple method to compute the weighted means of each column: 

    [m,n] = size(x);

    for col = 1:n
        sum_weight = 0;
        sum_x = 0;

        for row = 1:m
            weight = row; 
            sum_weight = sum_weight + weight;
            sum_x = sum_x + weight * x(row, col);
        end

        mean_x = sum_x / sum_weight 
    end

The variance is similar, and we take the sqrt() of the variance to get the answer. Again, getting the exact estimates was beyond the intended scope of the assignment, but I have implemented the versions derived in the formulas for those who are interested.

    [m,n] = size(x);

    for col = 1:n
        sum_weight = 0;
        sum_x = 0;
        sum_dev_sqr = 0;
        sum_sqr_weight = 0;

        % First get the mean as before

        for row = 1:m
            weight = row; 
            sum_weight = sum_weight + weight;
            sum_x = sum_x + weight * x(row, col);
            sum_sqr_weight = sum_sqr_weight + weight*weight;
        end

        mean_x = sum_x / sum_weight 

        % Now to the weighted sum of squared deviations

        for row = 1:m
            weight = row; 
            sum_dev_sqr = sum_dev_sqr + weight * (x(row, col) - mean_x)^2;
        end

        var_x = sum_dev_sqr / sum_weight ;

        % The basic one (the one I meant). 
        stdev_x = sqrt(var_x)

        incorrect_error_of_mean = stdev_x / sqrt(m)

        sample_var_x = sum_dev_sqr / (sum_weight - sum_sqr_weight/sum_weight);
        sample_stdev_x = sqrt(sample_var_x)

        error_of_mean = sample_stdev_x * sqrt(sum_sqr_weight / (sum_weight^2) )
    end

The answers I get (my original intention was that one would go for the first two to keep things simple):

    weighted mean           0.3295  0.4814  0.4492
    weighted stdev          0.2556  0.2183  0.3410

    divided by sqrt(20)     0.0571  0.0488  0.0763

    weighted sample stdev   0.2643  0.2257  0.3527
    weighted error of mean  0.0647  0.0576  0.0900 

Some divided the second row by sqrt(20) because it appeared that I was after the estimated error of the mean. However, I will argue that this is not the correct formula in this case. That answer corresponds to the case where all points are uniformly weighted, which is the ideal case, and so it under-estimates the error in more general cases. Consider, for example, weights such that the first point has all the weight. Then sqrt(m) would be far too large a quantity to divide the stdev by.
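In symbols, the estimate computed as error_of_mean in the code above is (this simply restates that line of code, with s_w the weighted sample standard deviation and w_i the weights):

    \mathrm{SE}(\bar{x}_w) = s_w \, \frac{\sqrt{\sum_i w_i^2}}{\sum_i w_i}.

With uniform weights the factor sqrt(sum(w_i^2)) / sum(w_i) reduces to 1/sqrt(m), recovering the division by sqrt(20) above, while with all of the weight on a single point it is 1, so the estimated error does not shrink at all.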

As in the non-weighted case, this can be done with very limited memory and only one pass over the data, as sketched below, by keeping a few running totals.
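
A minimal sketch of that weighted one-pass version, keeping running totals of the weights, the squared weights, the weighted values, and the weighted squared values (again, the variable names are my own):

    % One-pass weighted mean, standard deviation, and error of the mean.

    [m,n] = size(x);

    for col = 1:n
        sum_weight = 0;
        sum_sqr_weight = 0;
        sum_wx = 0;
        sum_wx_sqr = 0;

        for row = 1:m
            weight = row; 
            sum_weight = sum_weight + weight;
            sum_sqr_weight = sum_sqr_weight + weight*weight;
            sum_wx = sum_wx + weight * x(row, col);
            sum_wx_sqr = sum_wx_sqr + weight * x(row, col)^2;
        end

        mean_x = sum_wx / sum_weight 

        % Weighted sum of squared deviations from the running totals:
        %     sum(w .* (x - mean).^2) = sum(w .* x.^2) - (sum(w .* x))^2 / sum(w)
        sum_dev_sqr = sum_wx_sqr - sum_wx^2 / sum_weight;

        stdev_x = sqrt(sum_dev_sqr / sum_weight)
        sample_stdev_x = sqrt(sum_dev_sqr / (sum_weight - sum_sqr_weight/sum_weight))
        error_of_mean = sample_stdev_x * sqrt(sum_sqr_weight) / sum_weight
    end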