Abstract
We describe a discriminative algorithm for automatic voice onset time (VOT)
measurement, considered as an application of predicting structured
output from speech. In contrast to previous studies that use
customized rules, in our approach a function is trained on manually
labeled examples, using an online algorithm to predict the
burst and voicing onsets (and hence VOT). The feature set used
is customized for detecting the burst and voicing onsets, and the
loss function used in training is the difference between predicted
and actual VOT. Applied to initial voiceless stops from two corpora,
the algorithm compares favorably to previous work, and
the agreement between automatic and manual measurements is
near human inter-judge reliability.
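
As a minimal illustration of the quantities named in the abstract, the sketch below computes VOT from a pair of onsets and a loss equal to the absolute difference between predicted and actual VOT. The function names, the use of milliseconds, and the tuple representation of an onset pair are assumptions made for this example; the abstract does not specify the exact form of the loss or any implementation details.

```python
# Hypothetical helper names; times are assumed to be in milliseconds.

def vot(burst_onset_ms, voicing_onset_ms):
    """VOT is the interval from the burst onset to the voicing onset."""
    return voicing_onset_ms - burst_onset_ms


def vot_loss(predicted, actual):
    """Absolute difference between predicted and actual VOT.

    Each argument is a (burst_onset_ms, voicing_onset_ms) pair.
    This is an assumed reading of "the difference between predicted
    and actual VOT", not necessarily the loss used in the paper.
    """
    return abs(vot(*predicted) - vot(*actual))


if __name__ == "__main__":
    manual = (100, 165)      # manually labeled onsets (made-up values)
    automatic = (102, 160)   # predicted onsets (made-up values)
    print(vot(*manual), vot(*automatic), vot_loss(automatic, manual))
```

Note that under this reading the loss depends only on the VOT values, so a prediction whose two onsets are both shifted by the same amount incurs zero loss.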
Original language | American English |
---|---|
Title of host publication | INTERSPEECH |
State | Published - 2010 |