Joint maximum likelihood estimation of late reverberant and speech power spectral density in noisy environments

Ofer Schwartz, Sharon Gannot, Emanuel A.P. Habets

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

20 Scopus citations

Abstract

An estimate of the power spectral density (PSD) of the late reverberation is often required by dereverberation algorithms. In this work, we derive a novel multichannel maximum likelihood (ML) estimator for the PSD of the reverberation that can be applied in noisy environments. Since the anechoic speech PSD is usually unknown in advance, it is estimated as well. As a closed-form solution for the maximum likelihood estimator is unavailable, a Newton method for maximizing the ML criterion is derived. Experimental results show that the proposed estimator provides an accurate estimate of the PSD, and outperforms competing estimators. Moreover, when used in a multi-microphone dereverberation and noise reduction algorithm, the best performance in terms of the log-spectral distance is achieved when employing the proposed PSD estimator.
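The abstract does not give the signal model or the update equations, so the following is only a minimal sketch of a Newton iteration for this kind of joint estimate, not the authors' derivation. It assumes, purely for illustration, that the microphone covariance in one frequency bin is phi_S * d d^H + phi_R * Gamma + Phi_v, with a known steering vector d, a known diffuse coherence matrix Gamma, and a known noise covariance Phi_v. The function names, the array geometry, and the step-halving safeguard are all hypothetical additions.

```python
import numpy as np


def model_cov(theta, A, Phi_v):
    """Assumed model: Phi = phi_S * d d^H + phi_R * Gamma + Phi_v."""
    return theta[0] * A[0] + theta[1] * A[1] + Phi_v


def loglik(theta, R, A, Phi_v):
    """Gaussian log-likelihood (up to constants) given sample covariance R."""
    Phi = model_cov(theta, A, Phi_v)
    _, logdet = np.linalg.slogdet(Phi)
    return -logdet - np.trace(np.linalg.solve(Phi, R)).real


def grad_hess(theta, R, A, Phi_v):
    """Gradient and Hessian of loglik with respect to (phi_S, phi_R)."""
    P = np.linalg.inv(model_cov(theta, A, Phi_v))
    PA = [P @ Ai for Ai in A]
    PR = P @ R
    g = np.array([(-np.trace(PAi) + np.trace(PAi @ PR)).real for PAi in PA])
    H = np.empty((2, 2))
    for i in range(2):
        for j in range(2):
            H[i, j] = (np.trace(PA[j] @ PA[i])
                       - np.trace(PA[j] @ PA[i] @ PR)
                       - np.trace(PA[i] @ PA[j] @ PR)).real
    return g, H


def newton_ml_psd(R, d, Gamma, Phi_v, theta0=(1.0, 1.0), n_iter=100, tol=1e-9):
    """Safeguarded Newton ascent for the two PSDs (illustrative only)."""
    A = [np.outer(d, d.conj()), Gamma]
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        g, H = grad_hess(theta, R, A, Phi_v)
        if np.linalg.norm(g) < tol:
            break
        step = -np.linalg.solve(H, g)   # Newton direction
        if g @ step <= 0:               # not an ascent direction:
            step = g                    # fall back to the gradient
        ll = loglik(theta, R, A, Phi_v)
        t, accepted = 1.0, False
        while t > 1e-8 and not accepted:
            cand = theta + t * step
            # Keep both PSDs positive and do not decrease the likelihood.
            if np.all(cand > 0) and loglik(cand, R, A, Phi_v) >= ll - 1e-12:
                theta, accepted = cand, True
            t *= 0.5
        if not accepted:
            break
    return theta


# Hypothetical 4-microphone linear array at a single frequency bin.
M, f, c, spacing = 4, 2000.0, 343.0, 0.06
pos = spacing * np.arange(M)
d = np.exp(-2j * np.pi * f * pos * np.cos(np.pi / 3) / c)
Gamma = np.sinc(2 * f * np.abs(pos[:, None] - pos[None, :]) / c)
Phi_v = 0.1 * np.eye(M)
# Covariance built exactly from the assumed model (phi_S = 2.0, phi_R = 0.5).
R = 2.0 * np.outer(d, d.conj()) + 0.5 * Gamma + Phi_v
print(newton_ml_psd(R, d, Gamma, Phi_v))
```

Because R here is constructed exactly from the assumed model, the iteration should return values close to the PSDs used to build it. With a covariance estimated from real data, the result would depend on the averaging and on how well the assumed model fits.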

Original language: English
Title of host publication: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 151-155
Number of pages: 5
ISBN (Electronic): 9781479999880
State: Published - 18 May 2016
Event: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China
Duration: 20 Mar 2016 to 25 Mar 2016

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2016-May
ISSN (Print): 1520-6149

Conference

Conference: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Country/Territory: China
City: Shanghai
Period: 20/03/16 to 25/03/16

Bibliographical note

Publisher Copyright:
© 2016 IEEE.
