A Bayesian hierarchical mixture of Gaussian model for multi-speaker DOA estimation and separation

Yaron Laufer, Sharon Gannot

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

In this paper, we propose a fully Bayesian hierarchical model for multi-speaker direction of arrival (DoA) estimation and separation in noisy environments, utilizing the W-disjoint orthogonality property of the speech sources. Our probabilistic approach employs a mixture of Gaussians formulation with centroids associated with a grid of candidate speakers' DoAs. The hierarchical Bayesian model is established by attributing priors to the various parameters. We then derive a variational Expectation-Maximization algorithm that estimates the DoAs by selecting the most probable candidates, and separates the speakers using a variant of the multichannel Wiener filter that takes into account the responsibility of each candidate in describing the received data. The proposed algorithm is evaluated using real room impulse responses from a freely available database, in terms of both DoA estimation accuracy and separation scores. It is shown that the proposed method outperforms competing methods.
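
The grid-based mixture idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' algorithm: it uses plain (non-variational) EM with no priors, a 1-D scalar "DoA-like" observation per time-frequency bin instead of multichannel data, and a fixed, known variance. The function name `em_grid_mog`, the 10-degree grid, and the synthetic data are all assumptions made for illustration. Each observation is generated by a single speaker, which mimics the W-disjoint orthogonality assumption.

```python
import numpy as np

def em_grid_mog(obs, grid, n_iter=50, var=25.0):
    """Plain EM for a 1-D mixture of Gaussians whose means are fixed to `grid`.

    Only the mixing weights are updated; the variance is assumed known.
    Returns the weights (one per candidate) and the responsibilities.
    """
    weights = np.full(grid.size, 1.0 / grid.size)
    # Squared distance of every observation to every candidate, shape (N, G).
    sq_dist = (obs[:, None] - grid[None, :]) ** 2
    # Log-likelihood up to a constant shared by all components.
    log_lik = -0.5 * sq_dist / var
    for _ in range(n_iter):
        # E-step: responsibility of each candidate for each observation.
        log_post = np.log(weights + 1e-300) + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: mixing weights are the average responsibilities.
        weights = resp.mean(axis=0)
    return weights, resp

# Synthetic data (hypothetical): two speakers at 40 and 110 degrees,
# each observation dominated by exactly one of them.
rng = np.random.default_rng(0)
true_doas = np.array([40.0, 110.0])
labels = rng.integers(0, 2, size=2000)
obs = true_doas[labels] + rng.normal(0.0, 5.0, size=2000)

grid = np.arange(0.0, 181.0, 10.0)  # candidate DoAs, 0..180 degrees
weights, resp = em_grid_mog(obs, grid)

# Select the two most probable candidates as the DoA estimates.
estimated = np.sort(grid[np.argsort(weights)[-2:]])
print(estimated)
```

In this toy setting the responsibilities `resp` play the role of soft assignments of observations to candidates; the paper, by contrast, feeds candidate responsibilities into a multichannel Wiener filter variant, which this sketch does not attempt.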

Original language: English
Title of host publication: Proceedings of the 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing, MLSP 2020
Publisher: IEEE Computer Society
ISBN (Electronic): 9781728166629
DOIs
State: Published - Sep 2020
Event: 30th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2020 - Virtual, Espoo, Finland
Duration: 21 Sep 2020 to 24 Sep 2020

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
Volume: 2020-September
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371

Conference

Conference: 30th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2020
Country/Territory: Finland
City: Virtual, Espoo
Period: 21/09/20 to 24/09/20

Bibliographical note

Publisher Copyright:
© 2020 IEEE.

Funding

Funder: Horizon 2020 Framework Programme
Funder number: 871245

    Keywords

    • Audio source separation
    • DoA estimation
    • Mixture of Gaussians
    • Variational EM
    • W-disjoint orthogonality
