

Multipart multimodal modeling and mining


A theme underlying much of the activity in Kobus's group is building statistical models for multimodal data. We are particularly interested in the case where documents have multiple components that explain multimodal observations, and where the linking (or correspondence) between the modes must be learned from the data. A canonical example is a large image database with associated text. In this domain, images are loosely associated with words, possibly keywords for searching, such as "water", "tiger", or "grass". Importantly, we generally do not know the correspondence between a particular image region (say, the orange stripy part) and a particular word. Nonetheless, with appropriate examples in the data set, the correspondence ambiguity can be resolved. For example, a second image that pairs the same kind of region with "tiger" but not with "grass" is evidence that the region goes with "tiger".
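To make the idea of resolving correspondence ambiguity concrete, here is a minimal sketch that learns p(word | region type) with expectation-maximization, in the style of a simple word-alignment translation model. It is an illustration under stated assumptions, not the group's actual model: regions are assumed to be already quantized into discrete types, and the function name `learn_word_given_region`, the region labels, and the three-image data set are all hypothetical.

```python
from collections import defaultdict


def learn_word_given_region(dataset, iterations=20):
    """Estimate t[region][word] = p(word | region) by EM.

    dataset: list of (regions, words) pairs, one per image. Only the
    image-level pairing is given; which region goes with which word
    is unknown and is treated as a hidden variable.
    """
    vocab = sorted({w for _, words in dataset for w in words})
    region_types = sorted({r for regions, _ in dataset for r in regions})

    # Start with no preference: every region type is equally likely
    # to produce every word.
    t = {r: {w: 1.0 / len(vocab) for w in vocab} for r in region_types}

    for _ in range(iterations):
        counts = {r: defaultdict(float) for r in region_types}
        for regions, words in dataset:
            for w in words:
                # E-step: posterior over which region in this image
                # is responsible for word w.
                norm = sum(t[r][w] for r in regions)
                for r in regions:
                    counts[r][w] += t[r][w] / norm
        # M-step: renormalize expected counts into probabilities.
        for r in region_types:
            total = sum(counts[r].values())
            if total > 0:
                t[r] = {w: counts[r][w] / total for w in vocab}
    return t


# Hypothetical toy data: each image has two region types and two words,
# with no indication of which word describes which region.
dataset = [
    (["orange_stripes", "green_texture"], ["tiger", "grass"]),
    (["orange_stripes", "blue_smooth"], ["tiger", "water"]),
    (["green_texture", "blue_smooth"], ["grass", "water"]),
]

t = learn_word_given_region(dataset)
best = {r: max(t[r], key=t[r].get) for r in t}
for region, word in sorted(best.items()):
    print(f"{region} -> {word} (p = {t[region][word]:.3f})")
```

No single image in this toy set identifies which region is the tiger, but because "tiger" is the only word shared by the two images that contain orange stripes, the estimate for that region type concentrates on "tiger" over the iterations, and likewise for the other two region types.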

The key point of the above analysis is that we consider data items to consist of multiple parts that jointly contribute to observations in multiple modalities. Approaching the data in this way allows the modes to be linked, thereby providing for (1) translation models which predict one mode from others; and (2) methods for integrating the modes for better data access and visualization.

We have investigated this approach to data modeling in depth for images with associated text. We are currently exploring similar approaches for a number of biological data sets. (Web pages detailing this are under construction.)

Sub-projects: