Abstract
Deep neural networks (DNNs) have recently become a viable methodology for single-microphone speech enhancement. The most common approach is to feed the noisy speech features into a fully connected DNN that either directly enhances the speech signal or infers a mask that can be used for the enhancement. In this case, a single network has to deal with the large variability of the speech signal. Most approaches also disregard the temporal continuity of speech. In this paper, we propose a deep recurrent mixture of experts (DRMoE) architecture that addresses these two issues. To reduce the large speech variability, we split the network into a gating network and a mixture of networks (denoted experts), each of which specializes in a specific, simpler task. The time continuity of the speech signal is taken into account by implementing the experts and the gating network as recurrent neural networks (RNNs). An experimental study shows that the proposed algorithm achieves higher objective measurement scores than both a single RNN and a deep mixture of experts (DMoE) architecture.
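The gating idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the expert outputs are combined, so the gate-weighted sum of per-expert masks used here is an assumption borrowed from standard mixture-of-experts practice. The plain tanh recurrent layer stands in for the LSTM units suggested by the keywords, and all names (`RecurrentCell`, `drmoe_masks`), layer sizes, and the random, untrained weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

class RecurrentCell:
    """Plain tanh recurrent layer with a linear read-out (assumed stand-in for an LSTM)."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.W_rec = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        self.W_out = rng.normal(0.0, 0.1, (n_out, n_hidden))
        self.h = np.zeros(n_hidden)  # hidden state carried from frame to frame

    def step(self, x):
        self.h = np.tanh(self.W_in @ x + self.W_rec @ self.h)
        return self.W_out @ self.h

def drmoe_masks(noisy, n_experts=3, n_hidden=16, seed=0):
    """Return a (frames, bins) mask from untrained, randomly initialised networks."""
    rng = np.random.default_rng(seed)
    n_frames, n_bins = noisy.shape
    experts = [RecurrentCell(n_bins, n_hidden, n_bins, rng) for _ in range(n_experts)]
    gate = RecurrentCell(n_bins, n_hidden, n_experts, rng)
    masks = np.empty((n_frames, n_bins))
    for t in range(n_frames):
        # Gating network: one weight per expert, summing to 1.
        weights = softmax(gate.step(noisy[t]))
        # Each expert proposes its own mask for the current frame.
        expert_masks = np.stack([sigmoid(e.step(noisy[t])) for e in experts])
        # Assumed combination rule: gate-weighted sum of the expert masks.
        masks[t] = weights @ expert_masks
    return masks

# Toy input: 10 frames of 8 non-negative "spectral" features.
noisy = np.abs(np.random.default_rng(1).normal(size=(10, 8)))
mask = drmoe_masks(noisy)
enhanced = mask * noisy  # apply the mask to the noisy features
```

Because the gate weights are a convex combination and each expert mask passes through a sigmoid, the combined mask stays strictly between 0 and 1. The recurrent state in both the experts and the gate is what lets each frame's estimate depend on preceding frames.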
Original language | English |
---|---|
Title of host publication | 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2017 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 359-363 |
Number of pages | 5 |
ISBN (Electronic) | 9781538616321 |
DOIs | |
State | Published - 7 Dec 2017 |
Event | 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2017 - New Paltz, United States Duration: 15 Oct 2017 → 18 Oct 2017 |
Publication series
Name | IEEE Workshop on Applications of Signal Processing to Audio and Acoustics |
---|---|
Volume | 2017-October |
Conference
Conference | 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2017 |
---|---|
Country/Territory | United States |
City | New Paltz |
Period | 15/10/17 → 18/10/17 |
Bibliographical note
Publisher Copyright: © 2017 IEEE.
Keywords
- long short-term memory
- recurrent neural network
- speech presence probability