Maximum Likelihood Multi-Speaker Direction of Arrival Estimation Utilizing a Weighted Histogram

Elior Hadad, Sharon Gannot

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

In this contribution, a novel maximum likelihood (ML) based direction of arrival (DOA) estimator for concurrent speakers in a noisy reverberant environment is presented. The DOA estimation task is formulated in the short-time Fourier transform (STFT) domain in two stages. In the first stage, a single local DOA per time-frequency (TF) bin is selected, using the W-disjoint orthogonality property of the speech signal in the STFT domain. The local DOA is obtained as the maximum of the narrow-band likelihood localization spectrum at each TF bin. In addition, for each local DOA, a confidence measure is calculated, quantifying the confidence in the local estimate. In the second stage, the wide-band localization spectrum is calculated using a weighted histogram of the local DOA estimates with the confidence measures as weights. Finally, the wide-band DOA estimates are obtained by selecting the peaks in the wide-band localization spectrum. The results of our experimental study demonstrate the benefit of the proposed algorithm in a reverberant environment as compared with the classical steered response power phase transform (SRP-PHAT) algorithm.
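
The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the narrow-band localization spectra are already available as an array with one row per TF bin, and it uses a placeholder confidence (peak height above the mean of that bin's spectrum), because the abstract does not specify the paper's confidence measure. The function name, the candidate grid, and the toy data are all hypothetical.

```python
import numpy as np


def wideband_doa_from_local_estimates(narrowband_spectrum, candidate_doas_deg, num_speakers):
    """Two-stage sketch: per-bin argmax, then a confidence-weighted histogram.

    narrowband_spectrum: array (num_tf_bins, num_candidates), one localization
        spectrum per TF bin (how it is computed is outside this sketch).
    candidate_doas_deg: array (num_candidates,), the candidate DOA grid.
    """
    spectrum = np.asarray(narrowband_spectrum, dtype=float)
    grid = np.asarray(candidate_doas_deg, dtype=float)

    # Stage 1: one local DOA per TF bin, the argmax of that bin's spectrum.
    local_idx = np.argmax(spectrum, axis=1)

    # Placeholder confidence (an assumption, not the paper's measure):
    # how far the peak stands above the mean of that bin's spectrum.
    peak = spectrum[np.arange(spectrum.shape[0]), local_idx]
    confidence = np.maximum(peak - spectrum.mean(axis=1), 0.0)

    # Stage 2: weighted histogram of local DOAs over the candidate grid.
    wideband = np.bincount(local_idx, weights=confidence, minlength=grid.size)

    # Peak picking: keep local maxima, then take the num_speakers largest.
    left = np.r_[-np.inf, wideband[:-1]]
    right = np.r_[wideband[1:], -np.inf]
    is_peak = (wideband > left) & (wideband >= right) & (wideband > 0)
    peak_idx = np.flatnonzero(is_peak)
    top = peak_idx[np.argsort(wideband[peak_idx])[::-1][:num_speakers]]
    return np.sort(grid[top]), wideband


# Toy usage with synthetic spectra (made-up data, not from the paper):
# 60 TF bins dominated by a source at 40 degrees, 40 bins by one at 120 degrees.
grid = np.arange(0.0, 181.0, 10.0)
true_doas = np.r_[np.full(60, 40.0), np.full(40, 120.0)]
toy_spectrum = np.exp(-0.5 * ((grid[None, :] - true_doas[:, None]) / 15.0) ** 2)
estimates, histogram = wideband_doa_from_local_estimates(toy_spectrum, grid, num_speakers=2)
```

Weighting each vote by a confidence, rather than counting all TF bins equally, is what lets unreliable bins (for example, those dominated by reverberation or noise) contribute less to the wide-band spectrum. Any non-negative reliability score could be substituted for the placeholder used here.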

Original language: English
Title of host publication: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 586-590
Number of pages: 5
ISBN (Electronic): 9781509066315
State: Published - May 2020
Event: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Barcelona, Spain
Duration: 4 May 2020 to 8 May 2020

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2020-May
ISSN (Print): 1520-6149

Conference

Conference: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020
Country/Territory: Spain
City: Barcelona
Period: 4/05/20 to 8/05/20

Bibliographical note

Publisher Copyright:
© 2020 IEEE.
