Time difference of arrival estimation of speech source in a noisy and reverberant environment

Tsvi G. Dvorkind, Sharon Gannot

    Research output: Contribution to journal › Article › peer-review

    116 Scopus citations

    Abstract

    Determining the spatial position of a speaker is of growing interest in video-conference scenarios where automated camera steering and tracking are required. Speaker localization can be achieved with a two-step approach. In the first stage, a microphone array is used to extract the time difference of arrival (TDOA) of the speech signal. These readings are then used by the second stage for the actual localization. In this work we present novel frequency-domain approaches for TDOA calculation in a reverberant and noisy environment. Our methods are based on the quasi-stationarity of speech, the stationarity of the noise, and the fact that the speech and the noise are uncorrelated. The mathematical derivations in this work are followed by an extensive experimental study involving static and tracking scenarios.
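
    To make the notion of a TDOA reading concrete, the sketch below estimates the delay between two microphone signals in the frequency domain. It is not the method of this article: it uses the classical generalized cross-correlation with phase transform (GCC-PHAT) as a stand-in baseline, and it does not model noise or reverberation. The function names `dft` and `gcc_phat_tdoa`, the 16 kHz sampling rate, and the synthetic test signals are all assumptions made for illustration.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform.

    Kept dependency-free for illustration; a practical system would use an FFT.
    """
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_tdoa(x1, x2, fs):
    """Estimate the delay of x2 relative to x1, in seconds (GCC-PHAT baseline)."""
    n = len(x1) + len(x2)  # zero-pad so circular correlation equals linear correlation
    X1 = dft(list(x1) + [0.0] * (n - len(x1)))
    X2 = dft(list(x2) + [0.0] * (n - len(x2)))
    # Cross-power spectrum, normalized to unit magnitude (the PHAT weighting).
    cross = []
    for a, b in zip(X1, X2):
        c = b * a.conjugate()
        mag = abs(c)
        cross.append(c / mag if mag > 1e-12 else 0j)
    # Back to the lag domain; the peak location is the delay estimate.
    cc = [v.real for v in dft(cross, inverse=True)]
    peak = max(range(n), key=lambda i: cc[i])
    lag = peak if peak < n // 2 else peak - n  # map upper half to negative lags
    return lag / fs


# Synthetic example (assumed values): microphone 2 hears the same
# noise burst 5 samples later than microphone 1.
random.seed(0)
fs = 16000  # assumed sampling rate in Hz
burst = [random.gauss(0.0, 1.0) for _ in range(59)]
x1 = burst + [0.0] * 5
x2 = [0.0] * 5 + burst
estimated_tdoa = gcc_phat_tdoa(x1, x2, fs)
print(f"Estimated TDOA: {estimated_tdoa * 1e6:.1f} microseconds")
```

    A positive result means the second signal lags the first. In the two-step scheme described in the abstract, such pairwise delays from several microphone pairs would be passed to a separate localization stage.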

    Original language: English
    Pages (from-to): 177-204
    Number of pages: 28
    Journal: Signal Processing
    Volume: 85
    Issue number: 1
    State: Published - Jan 2005

    Keywords

    • Decorrelation
    • Non-stationarity
    • Source localization
    • TDOA
