Abstract
This paper is motivated by the fact that text-dependent
speaker recognition is inherently more accurate than
text-independent speaker recognition. In this work we assign
models to frequent words spoken by a speaker and spot them
in a test call. In this way, text-dependent speaker recognition
technology can be used for text-independent tasks. Our
approach uses dynamic time warping (DTW) word
spotting to find words in the test call that resemble words in the
training set. Results on the SPIDRE corpus show that combining
a DTW-spotter-based system with a Gaussian mixture model (GMM) system
improves performance significantly. At a very low false
acceptance rate (0.1%), misdetection was reduced from 32.2%
to 23.3% (a 28% relative reduction). At a low false acceptance rate (1%),
misdetection was reduced from 28.9% to 21.1% (a 27% relative
reduction).
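
As an illustration of the DTW word-spotting idea described above, the following is a minimal sketch of subsequence DTW, in which a word template may match any stretch of a longer test utterance. It is not the system evaluated in the paper: the function name `spot_word`, the Euclidean frame distance, the three-way step pattern, and the normalization by template length are all assumptions made for this example.

```python
import math

def spot_word(template, utterance):
    """Subsequence DTW: find where `template` best matches inside `utterance`.

    Both arguments are sequences of equal-dimension feature vectors (one per frame).
    Returns (score, end), where score is the accumulated frame distance divided
    by the template length and end is the 1-based index of the last matched
    utterance frame. All design choices here are illustrative assumptions.
    """
    n, m = len(template), len(utterance)
    # Row 0 is zero everywhere, so the match may start at any utterance frame.
    prev = [0.0] * (m + 1)
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)  # column 0 stays infinite: template frames cannot be skipped
        for j in range(1, m + 1):
            d = math.dist(template[i - 1], utterance[j - 1])  # Euclidean frame distance
            # Allowed steps: vertical, horizontal, diagonal.
            cur[j] = d + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    # Free end point: pick the utterance frame where the full template ends most cheaply.
    end = min(range(1, m + 1), key=lambda j: prev[j])
    return prev[end] / n, end
```

A spotter built on such a function would typically declare a word detected when the returned score falls below a threshold tuned on held-out data; the threshold and the way detections are combined with GMM scores are not specified here.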
Original language | American English |
---|---|
Title of host publication | 8th International Conference on Spoken Language Processing (INTERSPEECH 2004 - ICSLP) |
State | Published - 2004 |