get_GMM_blk_compound_sym_cov_1 - This routine is the same as get_GMM_blk_compound_sym_cov, with the added
feature of estimating the tau_sig_sqr_ratio parameter automatically through
gradient ascent of the expected complete log likelihood in the M step.
Example compile flags (system dependent):
-DLINUX_X86_64 -DLINUX_X86_64_OPTERON -DGNU_COMPILER
-lKJB -lfftw3 -lgsl -lgslcblas -ljpeg -lSVM -lstdc++ -lpthread -lSLATEC -lg2c -lacml -lacml_mv -lblas -lg2c -lncursesw
const Int_vector_vector *block_diag_sizes_vvp,
const Matrix *feature_mp,
const Int_vector *held_out_indicator_vp,
const Vector *initial_a_vp,
const double initial_mu,
const double initial_sig_sqr,
const double initial_tau_sig_sqr_ratio,
Finds a Gaussian mixture model (GMM) where the Gaussians have block compound
symmetrical covariances with shared parameters. Specifically, each feature
has the same mean (mu) and variance (sig^2 + tau^2). The covariance
between any pair of features is either (tau^2) or 0. This holds for all the
Gaussians in the mixture. However, the block diagonal structure of the
covariance matrix differs from one Gaussian in the mixture to another.
Unlike the parent routine get_GMM_blk_compound_sym_cov, this routine does not
assume that the ratio (tau^2 / sig^2) is known beforehand.
In particular, it fits:
p(x) = sum a-sub-i * g(mu-vec, cov-sub-i, x)
where a-sub-i is the prior probability for the mixture component (cluster) i,
mu-vec is the mean vector with all components equal to mu, cov-sub-i is the
covariance matrix for the component i, and g(mu,cov,x) is a Gaussian with
mean mu and covariance cov.
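Because each diagonal block has the form sig^2 * I + tau^2 * (all-ones matrix),
the two quantities that g(mu,cov,x) depends on, the quadratic form and the
determinant, have closed forms per block. The following is a minimal sketch in
plain C of that computation. The function name blk_cs_quad_and_det and the
plain array arguments are assumptions made for illustration, and this is not
necessarily how the library evaluates the Gaussians internally.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the library). For a block
 * compound-symmetric covariance cov, computes
 *     *quad_ptr = (x - mu)^T cov^{-1} (x - mu)
 *     *det_ptr  = det(cov)
 * without forming cov. The Gaussian value is then
 *     g = exp(-quad / 2) / sqrt((2 pi)^M * det).
 *
 * Assumes the block sizes are positive and sum to the length of x, and that
 * sig_sqr > 0 and tau_sqr >= 0.
 *
 * For a block of size b:
 *     det     = sig_sqr^(b-1) * (sig_sqr + b * tau_sqr)
 *     inverse = (I - (tau_sqr / (sig_sqr + b * tau_sqr)) * ones) / sig_sqr
 */
void blk_cs_quad_and_det(const double *x, const int *block_sizes,
                         size_t num_blocks, double mu, double sig_sqr,
                         double tau_sqr, double *quad_ptr, double *det_ptr)
{
    double quad = 0.0;
    double det = 1.0;
    size_t start = 0;
    size_t b, j;

    for (b = 0; b < num_blocks; b++)
    {
        size_t size = (size_t)block_sizes[b];
        double denom = sig_sqr + (double)size * tau_sqr;
        double sum = 0.0;     /* Sum of deviations within the block. */
        double sum_sq = 0.0;  /* Sum of squared deviations within the block. */

        for (j = start; j < start + size; j++)
        {
            double d = x[j] - mu;
            sum += d;
            sum_sq += d * d;
        }

        quad += (sum_sq - (tau_sqr / denom) * sum * sum) / sig_sqr;

        for (j = 1; j < size; j++) det *= sig_sqr;
        det *= denom;

        start += size;
    }

    *quad_ptr = quad;
    *det_ptr = det;
}
```

For large M the determinant can overflow or underflow, so a practical version
would probably accumulate its logarithm instead.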
The data matrix feature_mp is an N by M matrix where N is the number of data
points, and M is the number of features.
The argument block_diag_sizes_vvp specifies the block diagonal structures of
the covariances of the Gaussian components. The number of vectors in this
argument is equal to the number of mixture components (clusters), K. Each
vector is a list of sizes of the block diagonals from top to bottom in the
corresponding covariance matrix. For example, the vector corresponding to a
Gaussian component with all independent features would have M elements, each
equal to 1. A
Gaussian component with two block diagonals of the same size in its
covariance matrix would be specified by a vector with two elements, each
equal to M/2.
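To make the encoding concrete, the following is a minimal sketch in plain C of
how one vector of block sizes maps to a covariance matrix. The function name
fill_blk_compound_sym_cov, the row-major array layout, and the return codes are
assumptions made for illustration. The library uses its own Matrix and
Int_vector types, and its internal representation may differ.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the library). Fills a row-major m x m
 * matrix cov with block compound-symmetric structure:
 *     diagonal entries                 : sig_sqr + tau_sqr
 *     off-diagonal, same block         : tau_sqr
 *     entries across different blocks  : 0
 * where tau_sqr = tau_sig_sqr_ratio * sig_sqr.
 *
 * Returns 0 on success, or -1 if a block size is not positive or the block
 * sizes do not sum to m.
 */
int fill_blk_compound_sym_cov(const int *block_sizes, size_t num_blocks,
                              size_t m, double sig_sqr,
                              double tau_sig_sqr_ratio, double *cov)
{
    const double tau_sqr = tau_sig_sqr_ratio * sig_sqr;
    size_t total = 0;
    size_t start = 0;
    size_t b, i, j;

    /* The block sizes must tile the diagonal exactly. */
    for (b = 0; b < num_blocks; b++)
    {
        if (block_sizes[b] <= 0) return -1;
        total += (size_t)block_sizes[b];
    }
    if (total != m) return -1;

    for (i = 0; i < m * m; i++) cov[i] = 0.0;

    /* Walk the blocks from top to bottom along the diagonal. */
    for (b = 0; b < num_blocks; b++)
    {
        size_t end = start + (size_t)block_sizes[b];

        for (i = start; i < end; i++)
        {
            for (j = start; j < end; j++)
            {
                cov[i * m + j] = (i == j) ? sig_sqr + tau_sqr : tau_sqr;
            }
        }
        start = end;
    }

    return 0;
}
```

With M = 3, block sizes {2, 1}, sig_sqr = 1.0 and a ratio of 0.5, the first two
features share covariance 0.5 and the third is independent of both.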
initial_a_vp, initial_mu, initial_sig_sqr and initial_tau_sig_sqr_ratio can
be used to specify the initial values of the parameters for EM.
The model parameters are put into *a_vpp, *mu_ptr, *sig_sqr_ptr and
*tau_sig_sqr_ratio_ptr. Any of a_vpp, mu_ptr, sig_sqr_ptr or
tau_sig_sqr_ratio_ptr may be NULL if that value is not needed.
If P_mpp is not NULL, then the soft clustering (cluster membership) for each
data point is returned. In that case, *P_mpp will be N by K.
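A common follow-up step is to turn the N by K soft clustering into a hard
assignment by picking, for each data point, the cluster with the largest
membership value. The following is a minimal sketch in plain C. The function
name hard_assignments and the row-major array standing in for *P_mpp are
assumptions made for illustration, not part of the library.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the library). Given an n x k row-major
 * array p of cluster memberships, writes into labels[i] the index of the
 * cluster with the largest membership for data point i. Ties go to the
 * lowest cluster index. Assumes k >= 1.
 */
void hard_assignments(const double *p, size_t n, size_t k, int *labels)
{
    size_t i, c;

    for (i = 0; i < n; i++)
    {
        size_t best = 0;

        for (c = 1; c < k; c++)
        {
            if (p[i * k + c] > p[i * k + best]) best = c;
        }
        labels[i] = (int)best;
    }
}
```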
If the routine fails (due to a storage allocation failure), then ERROR is
returned with an error message set. Otherwise NO_ERROR is returned.
The covariance structure assumed in this routine is often referred to
as "block compound-symmetry" structure, especially in the mixed-models ANOVA
literature. It is useful for modeling repeated-measures data in ANOVA
using mixed models. For example see:
This software is not adequately tested. It is recommended that
results are checked independently where appropriate.
Prasad Gabbur, Kobus Barnard.