Speech enhancement with mixture of deep experts with clean clustering pre-training

Shlomo E. Chazan, Jacob Goldberger, Sharon Gannot

Research output: Contribution to journal, Conference article, peer-review

4 Scopus citations

Abstract

In this study we present a mixture of deep experts (MoDE) neural-network architecture for single-microphone speech enhancement. Our architecture comprises a set of deep neural networks (DNNs), each of which is an ‘expert’ in a different speech spectral pattern, such as a phoneme. A gating DNN is responsible for the latent variables, which are the weights assigned to each expert’s output given a speech segment. The experts estimate a mask from the noisy input, and the final mask is then obtained as a weighted average of the experts’ estimates, with the weights determined by the gating DNN. A soft spectral attenuation, based on the estimated mask, is then applied to enhance the noisy speech signal. As a byproduct, we gain a reduction in complexity at test time. We show that the experts’ specialization allows better robustness to unfamiliar noise types.
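The mask combination described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: each expert and the gating network are replaced here by a single hypothetical linear layer (`expert_weights`, `gate_weights`), and the attenuation rule in `soft_attenuate`, including its `floor` parameter, is an assumed generic form rather than the one used in the paper. All function and parameter names are illustrative.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(z):
    """Logistic function, mapping scores to mask values in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def mode_mask(noisy_feats, expert_weights, gate_weights):
    """Weighted average of per-expert masks (illustrative stand-in).

    noisy_feats:    (frames, bins) noisy spectral features
    expert_weights: (experts, bins, bins) one linear layer per expert,
                    standing in for each expert DNN (assumption)
    gate_weights:   (bins, experts) one linear layer standing in for
                    the gating DNN (assumption)
    """
    # Gating weights: one probability per expert for every frame.
    gate = softmax(noisy_feats @ gate_weights, axis=-1)
    # Each expert produces its own mask estimate for every frame.
    expert_masks = sigmoid(
        np.einsum("fb,ebk->fek", noisy_feats, expert_weights)
    )
    # Final mask: gate-weighted average of the experts' estimates.
    mask = np.einsum("fe,fek->fk", gate, expert_masks)
    return mask, gate


def soft_attenuate(noisy_mag, mask, floor=0.1):
    """Apply a soft gain between `floor` and 1 (assumed form)."""
    gain = floor + (1.0 - floor) * mask
    return noisy_mag * gain
```

Because the gating weights are non-negative and sum to one for each frame, the combined mask stays within the range of the individual expert masks, so the resulting gain never amplifies the noisy magnitude in this sketch.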

Original language: English
Pages (from-to): 716-720
Number of pages: 5
Journal: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Volume: 2021-June
DOIs
State: Published - 2021
Event: 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada
Duration: 6 Jun 2021 to 11 Jun 2021

Bibliographical note

Publisher Copyright:
©2021 IEEE

Funding

This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No. 871245 and was supported by the Ministry of Science & Technology, Israel.

Funders (funder number)
Horizon 2020 Framework Programme (871245)
Ministry of Science and Technology, Israel

    Keywords

    • Clustering
    • Mixture of experts
