A data-driven approach for multiple-speaker localization in reverberant enclosures is presented. The approach combines semi-supervised learning on multiple manifolds with unsupervised maximum likelihood estimation. The relative transfer functions (RTFs), which are known to be related to source positions, are used as feature vectors in both stages of the proposed algorithm. The microphone positions are not known. In the training stage, a nonlinear, manifold-based mapping between RTFs and source locations is inferred using single-speaker utterances. The inference procedure utilizes two RTF datasets: a small set of RTFs with their associated position labels, and a large set of unlabelled RTFs. This mapping is used to generate a dense grid of localized sources that serve as the centroids of a Mixture of Gaussians (MoG) model, which is used in the test stage of the algorithm to cluster RTFs extracted from multiple-speaker utterances. Clustering is performed with the expectation-maximization (EM) procedure, which relies on the sparsity and intermittency of the speech signals. A preliminary experimental study, with either two or three overlapping speakers at various reverberation levels, demonstrates that the proposed scheme achieves higher localization accuracy than a baseline method that uses a simpler propagation model.
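The test-stage clustering can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several simplifying assumptions: features are real-valued vectors rather than complex RTFs, all Gaussians share one isotropic variance `sigma2`, and the centroids are held fixed so that EM updates only the mixture weights. The function name `em_fixed_centroids`, the synthetic grid, and all parameter values are hypothetical.

```python
import numpy as np

def em_fixed_centroids(features, centroids, sigma2=0.05, n_iter=100):
    """EM for a Mixture of Gaussians with fixed centroids.

    Assumption: shared isotropic variance sigma2; only the mixture
    weights are re-estimated. features is (N, D), centroids is (M, D).
    """
    n_grid = centroids.shape[0]
    weights = np.full(n_grid, 1.0 / n_grid)  # uniform initialization
    # Squared distance from every feature to every centroid, shape (N, M).
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Gaussian log-likelihood up to a constant that cancels in the E-step.
    log_lik = -0.5 * d2 / sigma2
    for _ in range(n_iter):
        # E-step: posterior probability of each centroid for each feature.
        log_post = np.log(weights + 1e-300) + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: each weight is the average posterior of its centroid.
        weights = post.mean(axis=0)
    return weights

# Synthetic stand-in for the dense grid of localized sources (hypothetical).
rng = np.random.default_rng(0)
grid = np.stack([np.linspace(0.0, 2.0, 21), np.zeros(21)], axis=1)

# Two simulated speakers located at grid points 4 and 15.
true_idx = [4, 15]
obs = np.concatenate(
    [grid[i] + 0.03 * rng.standard_normal((200, 2)) for i in true_idx]
)

weights = em_fixed_centroids(obs, grid, sigma2=0.03 ** 2)
# The grid points with the largest weights are taken as speaker locations.
estimated = np.sort(np.argsort(weights)[-2:])
```

In this sketch the number of speakers is assumed known, so the two largest weights are selected; the positions attached to those grid points would be the location estimates.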
|Title of host publication||28th European Signal Processing Conference, EUSIPCO 2020 - Proceedings|
|Publisher||European Signal Processing Conference, EUSIPCO|
|Number of pages||5|
|State||Published - 24 Jan 2021|
|Event||28th European Signal Processing Conference, EUSIPCO 2020 - Amsterdam, Netherlands (24 Aug 2020 → 28 Aug 2020)|
|Name||European Signal Processing Conference|
|Conference||28th European Signal Processing Conference, EUSIPCO 2020|
|Period||24/08/20 → 28/08/20|
Bibliographical note
Funding Information:
This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No. 871245, and from the Israeli Innovation Authority through KAMIN Project No. 61916. Avital Bross is also funded by a grant for the advancement of women in science and technology from the Israeli Ministry of Science and Technology.
© 2021 European Signal Processing Conference, EUSIPCO. All rights reserved.
- Mixture of Gaussians
- Semi-supervised inference