Speaker indexing in audio archives using test utterance Gaussian mixture modeling

Hagai Aronowitz, David Burshtein, Amihood Amir

Research output: Contribution to conference, Paper, peer-review

15 Scopus citations

Abstract

Speaker indexing has recently emerged as an important task due to the rapidly growing volume of audio archives. Current filtration techniques still suffer from problems in both accuracy and efficiency. The major reason for the drawbacks of existing solutions is the use of inaccurate anchor models. The contribution of this paper is twofold. On the theoretical side, a new method is developed for simulating GMM scoring. This makes it possible to fit a GMM not only to every target speaker but also to every test utterance, and then to compute the likelihood of the test utterance using these GMMs instead of the original data. The second contribution is in harnessing this GMM simulation to achieve very efficient speaker indexing in terms of both search time and index size. Results on the SPIDRE corpus show that our approach maintains the accuracy of the conventional GMM algorithm.
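
The idea of scoring a test utterance through its own GMM rather than through its raw frames can be illustrated with a minimal sketch. The abstract does not say how the simulation is carried out, so the Monte Carlo estimate below is a stand-in chosen for illustration and is not the paper's method. All function names, the diagonal-covariance assumption, and the toy parameters are hypothetical.

```python
import numpy as np

def gmm_logpdf(x, weights, means, variances):
    """Log-density of a diagonal-covariance GMM at each row of x."""
    x = np.atleast_2d(x)
    diff = x[:, None, :] - means[None, :, :]                 # shape (n, k, d)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    a = log_comp + np.log(weights)                           # shape (n, k)
    m = a.max(axis=1, keepdims=True)                         # log-sum-exp trick
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True))).ravel()

def gmm_sample(n, weights, means, variances, rng):
    """Draw n points from a diagonal-covariance GMM."""
    comp = rng.choice(len(weights), size=n, p=weights)
    noise = rng.standard_normal((n, means.shape[1]))
    return means[comp] + noise * np.sqrt(variances[comp])

def simulated_score(test_gmm, target_gmm, n_samples=2000, seed=0):
    """Estimate the average log-likelihood of the test utterance under the
    target GMM using only the test GMM (assumption: Monte Carlo sampling
    replaces the original frames; the paper's simulation may differ)."""
    rng = np.random.default_rng(seed)
    x = gmm_sample(n_samples, *test_gmm, rng)
    return gmm_logpdf(x, *target_gmm).mean()

# Toy 2-D models (hypothetical values): each is (weights, means, variances).
target_a = (np.array([0.5, 0.5]),
            np.array([[0.0, 0.0], [1.0, 1.0]]),
            np.ones((2, 2)))
target_b = (np.array([0.5, 0.5]),
            np.array([[5.0, 5.0], [6.0, 6.0]]),
            np.ones((2, 2)))
test_utt = target_a  # a test utterance whose GMM matches speaker A

score_a = simulated_score(test_utt, target_a)
score_b = simulated_score(test_utt, target_b)
```

In this toy setup `score_a` exceeds `score_b`, so the test utterance is attributed to speaker A. Only the GMM parameters of the utterance are needed at search time, which hints at why such a representation can reduce index size.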

Original language: English
Pages: 609-612
Number of pages: 4
State: Published - 2004
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: 4 Oct 2004 - 8 Oct 2004

Conference

Conference: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country/Territory: Korea, Republic of
City: Jeju, Jeju Island
Period: 4/10/04 - 8/10/04
