Clustering of Compressed Illumination-Invariant Chromaticity Signatures
for Efficient Video Summarization

Mark S. Drew and James Au
School of Computing Science, Simon Fraser University,
Vancouver, B.C. Canada V5A 1S6
{mark, ksau}@cs.sfu.ca


TABLE OF CONTENTS

Abstract
Results
References


ABSTRACT

Motivated by colour constancy work in physics-based vision, we develop a new low-dimensional video frame feature that is effectively insensitive to lighting change, and we apply the feature to keyframe production using hierarchical clustering. The new image feature results from normalising the colour channels of each frame, then treating 2D histograms of chromaticity as images and compressing them. Because we effectively reduce any video to the same lighting conditions, we can precompute a universal basis on which to project video frame feature vectors. The new feature thus has the advantage of capturing essential colour information more expressively, and it is useful for video indexing because it is very low-dimensional: the feature vector has length only 8. We carry out clustering efficiently by adapting the hierarchical clustering data structure to temporally ordered clusters. Using a new multi-stage hierarchical clustering method, we first merge clusters based on the ratio of cluster variance to the variance of the parent node, merging only adjacent clusters, and then follow with a second round of clustering. The second stage merges clusters incorrectly split in the first round by the greedy hierarchical algorithm, and also merges non-adjacent clusters to fuse near-repeat shots. The new summarization method produces a very succinct set of keyframes for videos and, compared with a well-known previous technique, results are excellent.
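
As a rough illustration of the first two steps of the feature described above (channel normalisation followed by a 2D chromaticity histogram), the sketch below uses plain Python. The function name chromaticity_histogram, the 16-bin resolution, and the choice of dividing each channel by its frame mean are assumptions made for illustration only; the abstract does not specify them. The subsequent compression and projection onto the precomputed basis, which yield the length-8 vector, are not shown.

```python
def chromaticity_histogram(pixels, bins=16):
    """Normalised 2D chromaticity histogram of one frame.

    pixels: non-empty list of (R, G, B) tuples with non-negative values.
    Returns a bins x bins list of lists whose entries sum to 1
    (or all zeros if every pixel is black).
    """
    n = len(pixels)
    # Assumed normalisation: divide each channel by its mean over the frame,
    # so a per-channel scaling of the illuminant cancels out.
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    hist = [[0.0] * bins for _ in range(bins)]
    counted = 0
    for p in pixels:
        norm = [p[c] / means[c] if means[c] > 0 else 0.0 for c in range(3)]
        total = sum(norm)
        if total == 0:
            continue  # black pixel: chromaticity is undefined, skip it
        # Chromaticity coordinates r and g both lie in [0, 1].
        r, g = norm[0] / total, norm[1] / total
        i = min(int(r * bins), bins - 1)
        j = min(int(g * bins), bins - 1)
        hist[i][j] += 1.0
        counted += 1
    if counted:
        hist = [[v / counted for v in row] for row in hist]
    return hist


# Toy frame and a "relit" copy with each channel scaled by a constant.
frame = [(10, 20, 30), (200, 100, 50), (0, 0, 0), (90, 90, 90)]
relit = [(2 * r, 0.5 * g, 4 * b) for r, g, b in frame]
same = chromaticity_histogram(frame) == chromaticity_histogram(relit)
```

Under this sketch's assumptions, scaling each channel by a constant (a simple model of a lighting change) leaves the histogram unchanged, which is the property that would allow a single basis to be precomputed for all videos.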


RESULTS

In this document, we present results and compare our method with another algorithm, designed by Ferman and Tekalp [1], that is based on the average color histogram and the intersection histogram.

For the generated keyframes below, "Correct" denotes the correct (human-generated) result, "Signatures" denotes our method, and "HistInt" denotes the method in [1], with k = 3 and Tc = 3000. Perfect (human-performed) transition detection was applied as a preprocessing step for the HistInt method.

Notice that the Signatures method generates a much more succinct summarization while missing very few keyframes. It also requires no transition detection.
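
Since the first stage described in the abstract merges only temporally adjacent clusters, segment boundaries can fall out of the clustering itself. A minimal sketch of such an adjacency-constrained agglomerative pass is given below. It is not the authors' implementation: the merge criterion here is the Ward-style increase in squared error and the stopping rule is a fixed cluster count, both stand-ins for the variance-ratio criterion mentioned in the abstract, and the name merge_adjacent is hypothetical.

```python
def merge_adjacent(features, n_clusters):
    """Greedy agglomerative clustering restricted to temporal neighbours.

    features: list of equal-length feature vectors, one per frame,
              in temporal order.
    Returns a list of (start, end) frame ranges, end exclusive.
    """
    def sse(start, end):
        # Sum of squared distances of frames start..end-1 to their mean.
        seg = features[start:end]
        dim = len(seg[0])
        mean = [sum(v[d] for v in seg) / len(seg) for d in range(dim)]
        return sum((v[d] - mean[d]) ** 2 for v in seg for d in range(dim))

    # Start with one cluster per frame.
    clusters = [(i, i + 1) for i in range(len(features))]
    while len(clusters) > n_clusters:
        best, best_cost = None, None
        # Only neighbouring clusters are candidates for merging.
        for k in range(len(clusters) - 1):
            a, b = clusters[k], clusters[k + 1]
            # Assumed criterion: increase in squared error caused by the merge.
            cost = sse(a[0], b[1]) - sse(*a) - sse(*b)
            if best_cost is None or cost < best_cost:
                best, best_cost = k, cost
        merged = (clusters[best][0], clusters[best + 1][1])
        clusters[best:best + 2] = [merged]
    return clusters


# Toy 1-D features: an A-B-A pattern of shots.
toy = [[0.0], [0.1], [0.0], [5.0], [5.1], [0.0], [0.1]]
segments = merge_adjacent(toy, 3)
```

In the toy input, the first and last segments have similar features but stay separate because they are not adjacent; fusing such near-repeat shots is the job the abstract assigns to the second clustering stage.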

15 Videos (over 8,000 frames)

Method      Correct  #Keyframes  Redundant  Missed
Signatures  81       78          13         16
HistInt     81       347         288        2
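
To make the comparison in the summary table concrete, the short sketch below derives two ratios from its figures: the fraction of reference keyframes recovered and the fraction of generated keyframes that are redundant. Reading "Correct" as the number of human-generated reference keyframes and "#Keyframes" as the number each method generated is an assumption about the column meanings, and the function name rates is hypothetical.

```python
# Figures copied from the summary table above.
summary = {
    "Signatures": {"correct": 81, "generated": 78, "redundant": 13, "missed": 16},
    "HistInt": {"correct": 81, "generated": 347, "redundant": 288, "missed": 2},
}


def rates(row):
    """Return (fraction of reference keyframes recovered,
    fraction of generated keyframes that are redundant)."""
    recovered = (row["correct"] - row["missed"]) / row["correct"]
    redundancy = row["redundant"] / row["generated"]
    return recovered, redundancy


for method, row in summary.items():
    recovered, redundancy = rates(row)
    print(f"{method}: recovered {recovered:.3f}, redundant {redundancy:.3f}")
```

On these figures, Signatures recovers about 80% of the reference keyframes with about 17% of its output redundant, while HistInt recovers about 98% with about 83% of its output redundant.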

Basketball video (basketball.mpg) - 897 frames, 0:29 [2]

Method      #Keyframes
Correct     10
Signatures  8
HistInt     132

Note: The HistInt method fails very badly because of sudden lighting changes and a changing background.

Football video (football.mpg) - 560 frames, 0:30 [3]

Method      #Keyframes
Correct     5
Signatures  5
HistInt     18

Child with Shadow (child.mpg) - 30 frames, 0:01

Method      #Keyframes
Correct     1
Signatures  1
HistInt     4

Aquarium (aqmov.mpg) - 801 frames, 0:53

Method      #Keyframes
Correct     6
Signatures  4
HistInt     36

Beach (beachmov.mpg) - 463 frames, 0:31

Method      #Keyframes
Correct     4
Signatures  5
HistInt     24

Canada Day (canmov.mpg) - 480 frames, 0:32

Method      #Keyframes
Correct     4
Signatures  3
HistInt     23

Capilano (capmov.mpg) - 487 frames, 0:32

Method      #Keyframes
Correct     5
Signatures  6
HistInt     12

Dragon Boat (dbmov.mpg) - 441 frames, 0:29

Method      #Keyframes
Correct     8
Signatures  7
HistInt     24

Jazz (jazzmov.mpg) - 361 frames, 0:24

Method      #Keyframes
Correct     6
Signatures  4
HistInt     16

Professor (prof.mpg) - 82 frames, 0:05

Method      #Keyframes
Correct     1
Signatures  1
HistInt     6

Steam Clock (steam.mpg) - 405 frames, 0:27

Method      #Keyframes
Correct     1
Signatures  1
HistInt     6

Walk with the Dragon (walkmov.mpg) - 497 frames, 0:33

Method      #Keyframes
Correct     4
Signatures  4
HistInt     18

ABA repeated shot (aba.mpg) - 179 frames, 0:12

Method      #Keyframes
Correct     2
Signatures  2
HistInt     10

Nitobmov video (nitobmov.mpg) - 580 frames

Method      #Keyframes
Correct     4
Signatures  6
HistInt     18

Simpsons video (simpson.mpg) - 2004 frames

Method      #Keyframes
Correct     20
Signatures  20
HistInt     139

REFERENCES

  1. A.M. Ferman and A.M. Tekalp.  Efficient filtering and clustering methods for temporal video segmentation and visual summarization. J. Vis. Commun. & Image Rep., 9:336-351, 1998.
  2. Video obtained from University of Kansas. http://busboy.sped.ukans.edu/campus/movies/basketball/basketball.mpg
  3. Video obtained from University of Kansas. http://busboy.sped.ukans.edu/campus/movies/football/football.mpg