TY - JOUR
T1 - Single microphone speech separation by diffusion-based HMM estimation
AU - Yeminy, Yochay R.
AU - Keller, Yosi
AU - Gannot, Sharon
N1 - Publisher Copyright: © 2016, The Author(s).
PY - 2016/12/1
Y1 - 2016/12/1
N2 - We present a novel non-iterative and rigorously motivated approach for estimating hidden Markov models (HMMs) and factorial hidden Markov models (FHMMs) of high-dimensional signals. Our approach utilizes the asymptotic properties of a spectral, graph-based approach for dimensionality reduction and manifold learning, namely the diffusion framework. We exemplify our approach by applying it to the problem of single microphone speech separation, where the log-spectra of two unmixed speakers are modeled as HMMs, while their mixture is modeled as an FHMM. We derive two diffusion-based FHMM estimation schemes. The first is experimentally shown to provide separation results comparable to those of contemporary HMM-based speech separation approaches. The second allows a reduced computational burden.
AB - We present a novel non-iterative and rigorously motivated approach for estimating hidden Markov models (HMMs) and factorial hidden Markov models (FHMMs) of high-dimensional signals. Our approach utilizes the asymptotic properties of a spectral, graph-based approach for dimensionality reduction and manifold learning, namely the diffusion framework. We exemplify our approach by applying it to the problem of single microphone speech separation, where the log-spectra of two unmixed speakers are modeled as HMMs, while their mixture is modeled as an FHMM. We derive two diffusion-based FHMM estimation schemes. The first is experimentally shown to provide separation results comparable to those of contemporary HMM-based speech separation approaches. The second allows a reduced computational burden.
KW - Diffusion maps
KW - Factorial hidden Markov models
KW - Manifold learning
KW - Single microphone speech separation
UR - http://www.scopus.com/inward/record.url?scp=84991619628&partnerID=8YFLogxK
U2 - 10.1186/s13636-016-0094-9
DO - 10.1186/s13636-016-0094-9
M3 - Article
AN - SCOPUS:84991619628
SN - 1687-4714
VL - 2016
JO - EURASIP Journal on Audio, Speech, and Music Processing
JF - EURASIP Journal on Audio, Speech, and Music Processing
IS - 1
M1 - 16
ER -