Array Configuration Mismatch in Deep DOA Estimation: Towards Robust Training

Ayal Schwartz, Elior Hadad, Sharon Gannot, Shlomo E. Chazan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Deep direction of arrival (DOA) models commonly require a perfect match between the array configurations in the training and test stages and consequently cannot be applied to unfamiliar microphone array constellations. In this paper, we present a deep DOA estimation method that circumvents this requirement. In our approach, we first cast the DOA estimation as a classification problem in each time-frequency (TF) bin, thus facilitating the localization of multiple concurrent speakers. We utilize a high-resolution spatial image, based on a narrow-band variant of the steered response power phase transform (SRP-PHAT) processor, as an input feature. The model is trained with simulated data using a single microphone array configuration in various acoustic conditions. In the test stage, the algorithm is applied with unfamiliar microphone array constellations, namely with a different number of microphones and different inter-microphone distances. An extensive experimental study with real-life room impulse response (RIR) recordings demonstrates the effectiveness of the proposed input feature and the training scheme. Our approach achieves comparable results in familiar microphone array constellations and, more importantly, can accurately estimate the DOA of multiple concurrent speakers even with unfamiliar microphone arrays.
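
The idea of a narrow-band SRP-PHAT spatial image can be illustrated with a minimal sketch. It is not the authors' implementation: the far-field planar model, the speed of sound, the 5-degree azimuth grid, and the names `steering_vectors` and `narrowband_srp_phat` are all assumptions made for illustration. For one TF bin, the sketch normalizes each microphone's STFT coefficient to unit magnitude (the PHAT weighting) and measures how well the remaining phases align with each candidate direction.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def steering_vectors(mic_xy, azimuths_rad, freq_hz, c=SPEED_OF_SOUND):
    """Far-field steering vectors, shape (num_angles, num_mics).

    mic_xy: (num_mics, 2) microphone positions in metres (hypothetical layout).
    """
    # Unit vectors pointing from the array origin towards each candidate DOA.
    directions = np.stack([np.cos(azimuths_rad), np.sin(azimuths_rad)], axis=1)
    # Time (seconds) by which each microphone leads the origin for each DOA.
    advance = directions @ mic_xy.T / c
    return np.exp(2j * np.pi * freq_hz * advance)


def narrowband_srp_phat(x_bin, mic_xy, azimuths_rad, freq_hz):
    """Steered response power with PHAT weighting for a single TF bin.

    x_bin: (num_mics,) complex STFT coefficients at one time-frequency bin.
    Returns a real vector with one power value per candidate azimuth.
    """
    # PHAT: keep only the phase of each channel (guard against division by 0).
    phat = x_bin / np.maximum(np.abs(x_bin), 1e-12)
    steer = steering_vectors(mic_xy, azimuths_rad, freq_hz)
    # Phase-align the channels towards every candidate DOA and sum coherently.
    return np.abs(steer.conj() @ phat) ** 2


# Usage: two different linear arrays produce maps on the same azimuth grid.
grid = np.deg2rad(np.arange(0, 181, 5))                     # assumed DOA grid
array_a = np.array([[i * 0.08, 0.0] for i in range(4)])     # 4 mics, 8 cm apart
array_b = np.array([[i * 0.05, 0.0] for i in range(6)])     # 6 mics, 5 cm apart
true_doa = np.deg2rad(np.array([60.0]))
for mics in (array_a, array_b):
    # Noise-free single-source bin: a complex gain times the true steering vector.
    x = (0.7 - 0.2j) * steering_vectors(mics, true_doa, 1000.0)[0]
    spatial_map = narrowband_srp_phat(x, mics, grid, 1000.0)
    print(len(mics), spatial_map.shape, np.rad2deg(grid[np.argmax(spatial_map)]))
```

In this sketch the length of the map depends only on the candidate-direction grid, not on the number of microphones or their spacing, which is one plausible reason such a feature lends itself to testing on array configurations that differ from the training array.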

Original language: English
Title of host publication: Proceedings of the 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350323726
DOIs
State: Published - 2023
Event: 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023 - New Paltz, United States
Duration: 22 Oct 2023 to 25 Oct 2023

Publication series

Name: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Volume: 2023-October
ISSN (Print): 1931-1168
ISSN (Electronic): 1947-1629

Conference

Conference: 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023
Country/Territory: United States
City: New Paltz
Period: 22/10/23 to 25/10/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Funding

The project has received funding from the European Union's Horizon 2020 Research and Innovation Programme, Grant Agreement No. 871245; and was also supported by the Israeli Ministry of Science & Technology.

Funders (funder number, where given)
Horizon 2020 Framework Programme
Ministry of Science and Technology, Israel
Horizon 2020: 871245

Keywords

• Deep DOA
• SRP-PHAT
