Abstract
In this contribution, a novel maximum likelihood (ML) based direction of arrival (DOA) estimator for concurrent speakers in a noisy reverberant environment is presented. The DOA estimation task is formulated in the short-time Fourier transform (STFT) domain in two stages. In the first stage, a single local DOA per time-frequency (TF) bin is selected, using the W-disjoint orthogonality property of the speech signal in the STFT domain. The local DOA is obtained as the maximum of the narrow-band likelihood localization spectrum at each TF bin. In addition, a confidence measure is calculated for each local DOA, quantifying the reliability of the local estimate. In the second stage, the wide-band localization spectrum is calculated as a weighted histogram of the local DOA estimates, with the confidence measures as weights. Finally, the wide-band DOA estimates are obtained by selecting the peaks of the wide-band localization spectrum. The results of our experimental study demonstrate the benefit of the proposed algorithm in a reverberant environment as compared with the classical steered response power phase transform (SRP-PHAT) algorithm.
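The second stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 5-degree bin width, the [0, 180] degree DOA range, the assumption that the number of speakers is known, and the simple local-maximum peak rule are all hypothetical choices made for illustration. The narrow-band likelihood and the confidence measure from the paper are not reproduced here; the sketch takes the per-bin local DOAs and their confidences as given inputs.

```python
import numpy as np


def wideband_spectrum(local_doas_deg, confidences, bin_width_deg=5.0):
    """Weighted histogram of local DOA estimates over [0, 180] degrees.

    Each TF bin contributes its confidence (not a unit count) to the
    histogram bin containing its local DOA. Bin width and range are
    illustrative assumptions.
    """
    edges = np.arange(0.0, 180.0 + bin_width_deg, bin_width_deg)
    spectrum, _ = np.histogram(local_doas_deg, bins=edges, weights=confidences)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, spectrum


def pick_peaks(centers, spectrum, num_speakers):
    """Return the DOAs (sorted ascending) of the largest local maxima.

    Assumes the number of speakers is known; a plateau counts once.
    """
    padded = np.concatenate(([-np.inf], spectrum, [-np.inf]))
    is_peak = (
        (padded[1:-1] > padded[:-2])
        & (padded[1:-1] >= padded[2:])
        & (spectrum > 0)
    )
    peak_idx = np.flatnonzero(is_peak)
    # Keep the num_speakers highest peaks.
    order = np.argsort(spectrum[peak_idx])[::-1][:num_speakers]
    return np.sort(centers[peak_idx[order]])


# Synthetic example: two speakers plus low-confidence outliers,
# standing in for unreliable (e.g. reverberation-dominated) TF bins.
rng = np.random.default_rng(0)
doas = np.concatenate([
    rng.normal(60.0, 3.0, 500),
    rng.normal(120.0, 3.0, 500),
    rng.uniform(0.0, 180.0, 200),
])
conf = np.concatenate([
    rng.uniform(0.5, 1.0, 1000),
    rng.uniform(0.0, 0.1, 200),
])
centers, spectrum = wideband_spectrum(doas, conf)
print(pick_peaks(centers, spectrum, num_speakers=2))
```

Weighting by confidence means that unreliable TF bins contribute little to the wide-band spectrum, so they are less likely to create spurious peaks than in an unweighted histogram.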
| Original language | English |
|---|---|
| Title of host publication | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 586-590 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781509066315 |
| State | Published - May 2020 |
| Event | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Barcelona, Spain. Duration: 4 May 2020 → 8 May 2020 |
Publication series
| Name | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
|---|---|
| Volume | 2020-May |
| ISSN (Print) | 1520-6149 |
Conference
| Conference | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 |
|---|---|
| Country/Territory | Spain |
| City | Barcelona |
| Period | 4/05/20 → 8/05/20 |
Bibliographical note
Publisher Copyright: © 2020 IEEE.
Title
Maximum Likelihood Multi-Speaker Direction of Arrival Estimation Utilizing a Weighted Histogram