Abstract
Test utterance parameterization (TUP) using Gaussian mixture models (GMMs) has recently been shown to be beneficial for speaker indexing due to its computational efficiency and identical accuracy compared to classic GMM-based recognizers. We show that TUP can also lead to more accurate speaker recognition. On the NIST-2004 evaluation corpus, recognition error rate was reduced by 8% compared to the classic GMM-based algorithm. Furthermore, we introduce a novel generative statistical model for generation of test utterances by speakers. This model is incorporated naturally into the TUP framework and improves speaker recognition accuracy. On the NIST-2004 evaluation corpus, recognition error rate was reduced by 15% compared to the classic GMM-based algorithm.
Original language | American English |
---|---|
Title of host publication | Acoustics, Speech, and Signal Processing, 2005. Proceedings.(ICASSP'05). IEEE International Conference on |
Publisher | IEEE |
State | Published - 2005 |