TY - GEN
T1 - A Bayesian hierarchical mixture of Gaussian model for multi-speaker DOA estimation and separation
AU - Laufer, Yaron
AU - Gannot, Sharon
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/9
Y1 - 2020/9
N2 - In this paper, we propose a fully Bayesian hierarchical model for multi-speaker direction of arrival (DoA) estimation and separation in noisy environments, utilizing the W-disjoint orthogonality property of the speech sources. Our probabilistic approach employs a mixture of Gaussians formulation with centroids associated with a grid of candidate speakers' DoAs. The hierarchical Bayesian model is established by attributing priors to the various parameters. We then derive a variational Expectation-Maximization algorithm that estimates the DoAs by selecting the most probable candidates, and separates the speakers using a variant of the multichannel Wiener filter that takes into account the responsibility of each candidate in describing the received data. The proposed algorithm is evaluated using real room impulse responses from a freely available database, in terms of both DoA estimation accuracy and separation scores. It is shown that the proposed method outperforms competing methods.
KW - Audio source separation
KW - DoA estimation
KW - Mixture of Gaussians
KW - Variational EM
KW - W-disjoint orthogonality
UR - http://www.scopus.com/inward/record.url?scp=85096497695&partnerID=8YFLogxK
U2 - 10.1109/mlsp49062.2020.9231852
DO - 10.1109/mlsp49062.2020.9231852
M3 - Conference contribution
AN - SCOPUS:85096497695
T3 - IEEE International Workshop on Machine Learning for Signal Processing, MLSP
BT - Proceedings of the 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing, MLSP 2020
PB - IEEE Computer Society
T2 - 30th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2020
Y2 - 21 September 2020 through 24 September 2020
ER -