TY - GEN
T1 - Direct Error Rate Minimization of Hidden Markov Models
AU - Keshet, J.
AU - Cheng, C. C.
AU - Stoehr, M.
AU - McAllester, D. A.
AU - Saul, L. K.
N1 - Place of conference: Italy
PY - 2011
Y1 - 2011
N2 - We explore discriminative training of HMM parameters that directly
minimizes the expected error rate. In discriminative training,
one is interested in training a system to minimize a desired
error function, such as word error rate, phone error rate, or frame
error rate. We review a recent method (McAllester, Hazan and
Keshet, 2010), which introduces an analytic expression for the
gradient of the expected error rate. The analytic expression
leads to a perceptron-like update rule, which is adapted here
for training HMMs in an online fashion. While the proposed
method can work with any type of error function used in
speech recognition, we evaluated it on phoneme recognition on
TIMIT, with frame error rate as the error function used for
training. Except for the case of a GMM with a single
mixture per state, the proposed update rule provides lower error
rates, in terms of both frame error rate and phone error rate,
than other approaches, including MCE and large margin.
KW - hidden Markov models
KW - online learning
KW - direct error minimization
KW - discriminative training
KW - automatic speech recognition
KW - minimum phone error
KW - minimum frame error
UR - https://scholar.google.co.il/scholar?q=Direct+Error+Rate+Minimization+of+Hidden+Markov+Models&btnG=&hl=en&as_sdt=0%2C5
M3 - Conference contribution
BT - INTERSPEECH
ER -