Video Keyframe Production by Efficient Clustering of Compressed Chromaticity Signatures

Mark S. Drew and James Au
School of Computing Science, Simon Fraser University,
Vancouver, B.C. Canada V5A 1S6
{mark, ksau}@cs.sfu.ca

Table of Contents


    Full text (longer version) [.ps.gz]

Abstract
Results
References


Abstract

We develop a new low-dimensional video frame feature that is less sensitive to lighting change, motivated by color constancy work in physics-based vision, and apply the feature to keyframe production using hierarchical clustering. The new feature has the further advantage of capturing image information more expressively, and as a result produces a very succinct set of keyframes for any video. Because we effectively reduce any video to the same lighting conditions, we can produce a universal basis on which to project video frame features. We carry out clustering efficiently by adapting the hierarchical clustering data structure to temporally ordered clusters. Using a new multi-stage hierarchical clustering method, we first merge clusters based on the ratio of cluster variance to the variance of the parent node, merging only adjacent clusters, and then follow with a second round of clustering. The second stage merges clusters that the greedy hierarchical algorithm incorrectly split in the first round, and also merges non-adjacent clusters to fuse near-repeat shots. The resulting keyframe sets are very succinct and agree closely with human-generated keyframes.
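The two-stage idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each frame is already reduced to a feature vector, it uses the variance of the whole video as a stand-in for the parent-node variance, and the function names and the thresholds `max_ratio` and `max_dist` are hypothetical.

```python
import numpy as np

def merge_adjacent_clusters(features, max_ratio=0.5):
    """Stage 1 (sketch): greedily merge temporally adjacent clusters.

    features: (n_frames, dim) array in temporal order.
    max_ratio: hypothetical threshold on merged-cluster variance divided by
               the variance of the whole video (a stand-in for the parent node).
    Returns a list of (start, end) frame ranges, end exclusive.
    """
    features = np.asarray(features, dtype=float)
    total_var = features.var(axis=0).sum()
    if total_var == 0.0:
        return [(0, len(features))]
    clusters = [(i, i + 1) for i in range(len(features))]

    def variance(start, end):
        return features[start:end].var(axis=0).sum()

    while len(clusters) > 1:
        # Only neighbours in time are candidates: len(clusters) - 1 pairs.
        ratios = [variance(clusters[i][0], clusters[i + 1][1]) / total_var
                  for i in range(len(clusters) - 1)]
        best = int(np.argmin(ratios))
        if ratios[best] > max_ratio:
            break
        clusters[best:best + 2] = [(clusters[best][0], clusters[best + 1][1])]
    return clusters

def fuse_near_repeats(features, clusters, max_dist=1.0):
    """Stage 2 (sketch): group clusters, adjacent or not, with close means.

    max_dist is a hypothetical Euclidean threshold on cluster means.
    Returns a list of groups, each a list of (start, end) ranges.
    """
    features = np.asarray(features, dtype=float)
    groups = []
    for start, end in clusters:
        mean = features[start:end].mean(axis=0)
        for group in groups:
            group_frames = np.concatenate([features[s:e] for s, e in group])
            if np.linalg.norm(mean - group_frames.mean(axis=0)) <= max_dist:
                group.append((start, end))
                break
        else:
            groups.append([(start, end)])
    return groups

# Toy A-B-A sequence with one-dimensional features (made-up values).
toy = np.array([[0.0], [0.1], [0.0], [10.0], [10.1], [10.0], [0.0], [0.1], [0.0]])
shots = merge_adjacent_clusters(toy)
fused = fuse_near_repeats(toy, shots)
```

On the toy sequence, the first stage recovers three temporal segments and the second stage places the two A segments in one group, so a single keyframe could represent both.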

Results

In this document, we present results and compare our method with another algorithm, designed by Ferman and Tekalp [1], that is based on the average color histogram and the intersection histogram.

For the generated keyframes below, "Correct" indicates the correct (human-generated) results, "Signatures" indicates our method, and "HistInt" indicates the method of [1], with k = 3 and Tc = 3000. Perfect (human-performed) transition detection was applied as a preprocessing step for the HistInt method.

Notice that the Signatures method generates a much more succinct summarization while missing very few keyframes. It also requires no transition detection.

14 Videos (over 10,000 frames)
 
Method      Correct  #Keyframes  Redundant  Missed
Signatures  61       60          9          10
HistInt     61       347         288        2
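The summary rows can be turned into rates with a small calculation. The sketch below assumes that the generated keyframes split into hits plus redundant ones, so that hits = #Keyframes - Redundant = Correct - Missed; the numbers in the table are consistent with that reading. The terms "precision" and "recall" and the function name `summary_scores` are introduced here for illustration and do not appear in the original results.

```python
def summary_scores(correct, generated, redundant, missed):
    """Derive hit count, precision and recall from one summary-table row."""
    # Assumption: every generated keyframe is either a hit or redundant.
    hits = generated - redundant
    if hits != correct - missed:
        raise ValueError("row is inconsistent with the assumed column meaning")
    return {
        "hits": hits,
        "precision": hits / generated,  # share of generated keyframes that are hits
        "recall": hits / correct,       # share of correct keyframes that were found
    }

# Rows copied from the summary table: (Correct, #Keyframes, Redundant, Missed).
rows = {"Signatures": (61, 60, 9, 10), "HistInt": (61, 347, 288, 2)}
for name, row in rows.items():
    s = summary_scores(*row)
    print(f"{name}: hits={s['hits']} "
          f"precision={s['precision']:.3f} recall={s['recall']:.3f}")
# Signatures: hits=51 precision=0.850 recall=0.836
# HistInt: hits=59 precision=0.170 recall=0.967
```

Read this way, HistInt finds slightly more of the correct keyframes, but only about one in six of its generated keyframes is a hit, against 51 of 60 for Signatures.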

Basketball video  (basketball.mov) - 897 frames, 0:29 [2]
 
Method #Keyframes Generated Keyframes
Correct 10
Signatures 10
HistInt 132

Note: The HistInt method fails very badly because of sudden lighting changes and a changing background.

Football video  (football.mpg) - 560 frames, 0:30 [3]
 
Method #Keyframes Generated Keyframes
Correct 5
Signatures 5
HistInt 18

Child with Shadow  (child.mpg) - 30 frames, 0:01
 
Method #Keyframes Generated Keyframes
Correct 1
Signatures 1
HistInt 4

Aquarium (aqmov.mpg) - 801 frames, 0:53
 
Method #Keyframes Generated Keyframes
Correct 6
Signatures 4
HistInt 36

Beach  (beachmov.mpg) - 463 frames, 0:31
 
Method #Keyframes Generated Keyframes
Correct 4
Signatures 3
HistInt 24

Canada Day (canmov.mpg) - 480 frames, 0:32
 
Method #Keyframes Generated Keyframes
Correct 4
Signatures 3
HistInt 23

Capilano  (capmov.mpg) - 487 frames, 0:32
 
Method #Keyframes Generated Keyframes
Correct 5
Signatures 5
HistInt 12

Dragon Boat (dbmov.mpg) - 441 frames, 0:29
 
Method #Keyframes Generated Keyframes
Correct 8
Signatures 8
HistInt 24

Jazz  (jazzmov.mpg) - 361 frames, 0:24
 
Method #Keyframes Generated Keyframes
Correct 6
Signatures 5
HistInt 16

Professor  (prof.mpg) - 82 frames, 0:05
 
Method #Keyframes Generated Keyframes
Correct 1
Signatures 1
HistInt 6

Steam Clock (steam.mpg) - 405 frames, 0:27
 
Method #Keyframes Generated Keyframes
Correct 1
Signatures 1
HistInt 6

Walk with the Dragon (walkmov.mpg) - 497 frames, 0:33
 
Method #Keyframes Generated Keyframes
Correct 4
Signatures 5
HistInt 18

ABA repeated shot (aba.mpg) - 179 frames, 0:12
 
Method #Keyframes Generated Keyframes
Correct 2
Signatures 2
HistInt 10

Nitobmov video (nitobmov.mpg) - 580 frames
 
Method #Keyframes Generated Keyframes
Correct 4
Signatures 7
HistInt 18


References

  1. A.M. Ferman and A.M. Tekalp.  Efficient filtering and clustering methods for temporal video segmentation and visual summarization. J. Vis. Commun. & Image Rep., 9:336-351, 1998.
  2. Video obtained from the University of Kansas. http://busboy.sped.ukans.edu/campus/movies/basketball/basketball.mov
    (Local copy)
  3. Video obtained from the University of Kansas. http://busboy.sped.ukans.edu/campus/movies/football/football.mpg
    (Local copy)