A Hybrid Approach for Speaker Tracking Based on TDOA and Data-Driven Models

Bracha Laufer-Goldshtein, Ronen Talmon, Sharon Gannot

Research output: Contribution to journal › Article › peer-review

27 Scopus citations

Abstract

The problem of speaker tracking in noisy and reverberant enclosures is addressed in this paper. We present a hybrid algorithm, combining traditional tracking schemes with a new learning-based approach. A state-space representation, consisting of a propagation model and an observation model, is learned from signals measured by several distributed microphone pairs. The proposed representation is based on two data modalities corresponding to high-dimensional acoustic features representing the full reverberant acoustic channels as well as low-dimensional time difference of arrival (TDOA) estimates. The state-space representation is accompanied by a statistical model based on a Gaussian process used to relate the variations of the acoustic channels to the physical variations of the associated source positions, thereby forming a data-driven propagation model for the source movement. In the observation model, the source positions are nonlinearly mapped to the associated TDOA readings. The obtained propagation and observation models establish the basis for employing an extended Kalman filter. The simulation results demonstrate the robustness of the proposed method in noisy and reverberant conditions.
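To illustrate how a nonlinear TDOA observation model can feed an extended Kalman filter, the following is a minimal sketch and not the authors' implementation. It assumes free-field geometry, a speed of sound of 343 m/s, and a simple random-walk propagation model standing in for the paper's Gaussian-process, data-driven propagation model. The names `tdoa`, `tdoa_jacobian`, and `ekf_step`, and the array layout of `mics`, are hypothetical choices made for this example.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def tdoa(p, mics):
    """Free-field TDOA (seconds) of source position p for each microphone pair.

    mics has shape (n_pairs, 2, dim): two microphone positions per pair.
    """
    d1 = np.linalg.norm(p - mics[:, 0], axis=1)
    d2 = np.linalg.norm(p - mics[:, 1], axis=1)
    return (d1 - d2) / C


def tdoa_jacobian(p, mics):
    """Jacobian of tdoa() with respect to p, shape (n_pairs, dim)."""
    v1 = p - mics[:, 0]
    v2 = p - mics[:, 1]
    u1 = v1 / np.linalg.norm(v1, axis=1, keepdims=True)
    u2 = v2 / np.linalg.norm(v2, axis=1, keepdims=True)
    return (u1 - u2) / C


def ekf_step(x, P, z, mics, Q, R):
    """One EKF predict/update cycle for a position state x with covariance P.

    Assumption: random-walk propagation (x_k = x_{k-1} + noise with covariance Q).
    z holds the measured TDOAs, R is their noise covariance.
    """
    # Predict: the state is carried over, uncertainty grows by Q.
    x_pred = x
    P_pred = P + Q

    # Linearize the nonlinear observation model around the prediction.
    H = tdoa_jacobian(x_pred, mics)
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = np.linalg.solve(S, H @ P_pred).T     # Kalman gain P_pred H^T S^{-1}

    # Update with the TDOA innovation.
    x_new = x_pred + K @ (z - tdoa(x_pred, mics))
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

In this sketch the observation model is the only nonlinear part, so the Jacobian is evaluated once per step at the predicted position. A learned propagation model, as described in the abstract, would replace the two lines of the predict step.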

Original language: English
Article number: 8248766
Pages (from-to): 725-735
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio Speech and Language Processing
Volume: 26
Issue number: 4
State: Published - Apr 2018

Bibliographical note

Publisher Copyright:
© 2014 IEEE.

Funding

Manuscript received May 20, 2017; revised December 5, 2017 and December 25, 2017; accepted December 28, 2017. Date of publication January 8, 2018; date of current version February 1, 2018. This work was supported by a grant from a joint Lower Saxony-Israeli Project financially supported by the State of Lower Saxony. The work of B. Laufer-Goldshtein was supported by the Adams Foundation of the Israel Academy of Sciences and Humanities. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Augusto Sarti. (Corresponding author: Sharon Gannot.) B. Laufer-Goldshtein and S. Gannot are with the Faculty of Engineering, Bar-Ilan University, Ramat Gan 5290002, Israel (e-mail: Bracha.Laufer@biu.ac.il; [email protected]).

Funders
Adams Foundation of the Israel Academy of Sciences and Humanities
State of Lower Saxony

    Keywords

    • Gaussian process
    • Speaker tracking
    • extended Kalman filter (EKF)
    • relative transfer function (RTF)
    • time difference of arrival (TDOA)
