The UNIV'07 data set

The UNIV data set contains data to accompany the paper "Robust Spatio-temporal Matching of Electronic Slides to Presentation Videos," by Quanfu Fan, Kobus Barnard, Arnon Amir, and Alon Efrat. If you use this data for published research, please cite this paper. [PDF (pre-print)]

This research is part of a larger project, Semantically Linked Instructional Content (SLIC).

Terms of use:

Use of this data is limited to academic research in video processing. The copyright for each of the presentations and the slide images remains with the presenters (specifically, Mark Chaves, Wayne Coates, and Jerzy Rozenblit).

Expected utility:

This data is suited to the problem of matching video frames of presentations to images of the presentation slides used (e.g. PPT), and to applications of such matching: indexing video, improving video quality, and low-bandwidth browsing.

Format:

The data is distributed as a simple gzipped tarball.
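
As an illustration, a gzipped tarball like this one could be unpacked with Python's standard tarfile module. This is only a sketch: the function name extract_dataset and both paths are hypothetical, since the archive's file name is not stated here.

```python
import tarfile
from pathlib import Path


def extract_dataset(archive_path: str, dest_dir: str) -> list:
    """Unpack a gzipped tarball into dest_dir.

    Returns the sorted names found at the top level of dest_dir afterwards.
    Both arguments are hypothetical placeholders chosen by the caller.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters; "data"
            # rejects unsafe members such as absolute paths.
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)
    return sorted(p.name for p in dest.iterdir())
```

If the archive unpacks directly into the per-presenter sub-directories described below, the returned list would contain their names. That layout is an assumption and may differ if the archive wraps everything in a single top-level directory.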

What the data contains:

The data set is divided into three talks from the U. Arizona distinguished faculty seminar series, in sub-directories labeled by the presenter's last name in lower case (chaves, coates, rozenblit). Each sub-directory contains the video, the sampled frames that we used in the above paper (including keyframes), slide images, a list of the frames that are keyframes, and a ground truth file. The ground truth file has four columns: frame number, slide number, slide type, and camera status.

Slide type is encoded as follows:

 -1   NO_SLIDE
  1   FULL_SLIDE
  2   SMALL_SLIDE
-99   MISSING_SLIDE (a slide was used that is not in the slide file)

Camera status is encoded as follows:

  0   ZOOM_IN
  1   STAY_FIXED
  2   ZOOM_OUT
  3   PAN_TILT
  4   SLIDE_CUT1 (from one specific camera to the other)
  5   SLIDE_CUT2 (reverse of SLIDE_CUT1) 
  6   SLIDE_IN
  7   STAY_OUT
  8   SLIDE_OUT
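
The four-column ground truth format and the two code tables above could be read along the following lines. This is a minimal sketch, not code distributed with the data set. It assumes the columns are whitespace-separated integers with no header line, which the description does not confirm. The names SlideType, CameraStatus, GroundTruthRow, and parse_ground_truth are hypothetical.

```python
from enum import IntEnum
from typing import Iterable, List, NamedTuple


class SlideType(IntEnum):
    NO_SLIDE = -1
    FULL_SLIDE = 1
    SMALL_SLIDE = 2
    MISSING_SLIDE = -99  # a slide was used that is not in the slide file


class CameraStatus(IntEnum):
    ZOOM_IN = 0
    STAY_FIXED = 1
    ZOOM_OUT = 2
    PAN_TILT = 3
    SLIDE_CUT1 = 4  # from one specific camera to the other
    SLIDE_CUT2 = 5  # reverse of SLIDE_CUT1
    SLIDE_IN = 6
    STAY_OUT = 7
    SLIDE_OUT = 8


class GroundTruthRow(NamedTuple):
    frame: int
    slide: int
    slide_type: SlideType
    camera_status: CameraStatus


def parse_ground_truth(lines: Iterable[str]) -> List[GroundTruthRow]:
    """Parse ground truth lines.

    ASSUMPTION: each non-blank line holds four whitespace-separated integers
    (frame number, slide number, slide type, camera status).
    """
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got {len(fields)}: {line!r}")
        frame, slide, slide_type, camera = (int(f) for f in fields)
        # Unknown codes raise ValueError from the enum constructors.
        rows.append(
            GroundTruthRow(frame, slide, SlideType(slide_type), CameraStatus(camera))
        )
    return rows
```

An open file object can be passed directly as the lines argument. Using enums means an unexpected code fails loudly instead of being silently carried through, which helps when checking the assumed column layout against the real files.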

Getting the data:

Download by clicking on this link (1456 MB).