An Expectation-Maximization Algorithm for Multimicrophone Speech Dereverberation and Noise Reduction with Coherence Matrix Estimation

Ofer Schwartz, Sharon Gannot, Emanuël A.P. Habets

Research output: Contribution to journal › Article › peer-review

34 Scopus citations

Abstract

In speech communication systems, the microphone signals are degraded by reverberation and ambient noise. The reverberant speech can be separated into two components, namely, an early speech component that consists of the direct path and some early reflections and a late reverberant component that consists of all late reflections. In this paper, a novel algorithm to simultaneously suppress early reflections, late reverberation, and ambient noise is presented. The expectation-maximization (EM) algorithm is used to estimate the signals and spatial parameters of the early speech component and the late reverberation components. As a result, a spatially filtered version of the early speech component is estimated in the E-step. The power spectral density (PSD) of the anechoic speech, the relative early transfer functions, and the PSD matrix of the late reverberation are estimated in the M-step of the EM algorithm. The algorithm is evaluated using real room impulse responses recorded in our acoustic lab with reverberation times of 0.36 s and 0.61 s and several signal-to-noise ratio levels. It is shown that significant improvement is obtained and that the proposed algorithm outperforms baseline single-channel and multichannel dereverberation algorithms, as well as a state-of-the-art multichannel dereverberation algorithm.

Original language: English
Pages (from-to): 1495-1510
Number of pages: 16
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 24
Issue number: 9
State: Published - Sep 2016

Bibliographical note

Publisher Copyright:
© 2014 IEEE.

Keywords

  • dereverberation
  • expectation-maximization
  • noise reduction
