A probabilistic framework for spatio-temporal video representation & indexing

Hayit Greenspan, Jacob Goldberger, Arnaldo Mayer

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

34 Scopus citations

Abstract

In this work we describe a novel statistical video representation and modeling scheme. Video representation schemes are needed to enable segmenting a video stream into meaningful video-objects, useful for later indexing and retrieval applications. In the proposed methodology, unsupervised clustering via Gaussian mixture modeling extracts coherent space-time regions in feature space, and corresponding coherent segments (video-regions) in the video content. A key feature of the system is the analysis of video input as a single entity, as opposed to a sequence of separate frames, so that space and time are treated uniformly. The extracted space-time regions allow for the detection and recognition of video events. Results of segmenting video content into static vs. dynamic video regions and of video content editing are presented.
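The clustering step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a grayscale synthetic clip, a hypothetical per-pixel feature vector (intensity, x, y, t), a two-component mixture initialized by a simple intensity threshold, and a plain NumPy EM loop. The helper names `video_to_features`, `fit_gmm` and `xt_correlation` are made up for this example. Using the x-t correlation of each fitted Gaussian to separate static from dynamic regions is likewise one plausible criterion, stated here as an assumption.

```python
import numpy as np


def video_to_features(video):
    """One (intensity, x, y, t) row per pixel; coordinates scaled to [0, 1]."""
    T, H, W = video.shape
    t, y, x = np.meshgrid(np.arange(T), np.arange(H), np.arange(W), indexing="ij")
    return np.column_stack([
        video.ravel(),
        x.ravel() / (W - 1),
        y.ravel() / (H - 1),
        t.ravel() / (T - 1),
    ])


def fit_gmm(X, init_labels, n_iter=20, reg=1e-6):
    """Fit a full-covariance Gaussian mixture with EM, starting from hard labels."""
    n, d = X.shape
    k = int(init_labels.max()) + 1
    resp = np.eye(k)[init_labels]  # hard initial responsibilities, shape (n, k)
    for _ in range(n_iter):
        # M-step: weights, means and covariances from the current responsibilities.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        covs = np.empty((k, d, d))
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
        # E-step: posterior responsibilities, computed in the log domain.
        log_p = np.empty((n, k))
        for j in range(k):
            diff = X - means[j]
            maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(covs[j]), diff)
            _, logdet = np.linalg.slogdet(covs[j])
            log_p[:, j] = np.log(weights[j]) - 0.5 * (maha + logdet + d * np.log(2 * np.pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
    return weights, means, covs, resp


def xt_correlation(cov, ix=1, it=3):
    """Correlation between the x and t coordinates of one Gaussian component."""
    return cov[ix, it] / np.sqrt(cov[ix, ix] * cov[it, it])


# Synthetic clip (an assumption): dark static background, bright square moving right.
rng = np.random.default_rng(0)
T, H, W = 8, 16, 16
video = 0.2 + 0.02 * rng.standard_normal((T, H, W))
moving = np.zeros((T, H, W), dtype=bool)
for frame in range(T):
    moving[frame, 6:10, 2 + frame:6 + frame] = True
video[moving] = 0.9 + 0.02 * rng.standard_normal(moving.sum())

X = video_to_features(video)
init_labels = (X[:, 0] > 0.5).astype(int)  # toy initialization by intensity threshold
weights, means, covs, resp = fit_gmm(X, init_labels)

# Each pixel of the whole space-time volume gets the label of its most likely component.
labels = resp.argmax(axis=1).reshape(T, H, W)
corr = [xt_correlation(c) for c in covs]
```

In this toy setting the whole clip is clustered at once rather than frame by frame, so each component describes a region of the space-time volume. A component whose x and t coordinates are strongly correlated drifts horizontally over time, while a near-zero correlation suggests a static region. A real system would need richer features and a principled way to choose the number of components.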

Original language: English
Title of host publication: Computer Vision - ECCV 2002 - 7th European Conference on Computer Vision, Proceedings
Editors: Anders Heyden, Gunnar Sparr, Mads Nielsen, Peter Johansen
Publisher: Springer Verlag
Pages: 461-475
Number of pages: 15
ISBN (Electronic): 9783540437482
State: Published - 2002
Externally published: Yes
Event: 7th European Conference on Computer Vision, ECCV 2002 - Copenhagen, Denmark
Duration: 28 May 2002 to 31 May 2002

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2353
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th European Conference on Computer Vision, ECCV 2002
Country/Territory: Denmark
City: Copenhagen
Period: 28/05/02 to 31/05/02

Bibliographical note

Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 2002
