TY - JOUR
T1 - Scaled random trajectory segment models
AU - Goldberger, Jacob
AU - Burshtein, David
PY - 1998/1
Y1 - 1998/1
N2 - Speech recognition systems that are based on hidden Markov modelling (HMM) assume that the mean trajectory feature vector within a state is constant over time. In recent years, segment models that attempt to describe the dynamics of the speech signal within a phonetic unit have been proposed. Some of these models describe the mean trajectory over time as a random process. In this paper we present the concept of a scaled random trajectory segment model, which aims to overcome the modelling problem created by the fact that segment realizations of the same phonetic unit differ in length. The new model is supported by direct experimental evidence. It offers the following advantages over the standard (non-scaled) model. First, it shows improved performance compared to the non-scaled model. This is demonstrated using phone classification experiments. Second, it yields closed form expressions for the estimated parameters, unlike the previously suggested, non-scaled model, which requires more complicated iterative estimation procedures.
AB - Speech recognition systems that are based on hidden Markov modelling (HMM) assume that the mean trajectory feature vector within a state is constant over time. In recent years, segment models that attempt to describe the dynamics of the speech signal within a phonetic unit have been proposed. Some of these models describe the mean trajectory over time as a random process. In this paper we present the concept of a scaled random trajectory segment model, which aims to overcome the modelling problem created by the fact that segment realizations of the same phonetic unit differ in length. The new model is supported by direct experimental evidence. It offers the following advantages over the standard (non-scaled) model. First, it shows improved performance compared to the non-scaled model. This is demonstrated using phone classification experiments. Second, it yields closed form expressions for the estimated parameters, unlike the previously suggested, non-scaled model, which requires more complicated iterative estimation procedures.
UR - http://www.scopus.com/inward/record.url?scp=0031707628&partnerID=8YFLogxK
U2 - 10.1006/csla.1997.0035
DO - 10.1006/csla.1997.0035
M3 - Article
AN - SCOPUS:0031707628
SN - 0885-2308
VL - 12
SP - 51
EP - 73
JO - Computer Speech and Language
JF - Computer Speech and Language
IS - 1
ER -