Abstract
We propose a nonlinear acoustic echo cancellation system that models the echo path from the far-end signal to the near-end microphone in two parts. Inspired by the physical behavior of modern hands-free devices, we first introduce a novel neural network architecture specifically designed to model the nonlinear distortions these devices induce between receiving and playing the far-end signal. To account for variations between devices, we construct this network with a trainable memory length and nonlinear activation functions that are not parameterized in advance but are instead optimized from the training data during the training stage. Second, the network is followed by a standard adaptive linear filter that continuously tracks the echo path between the loudspeaker output and the microphone. During training, the network and filter are jointly optimized to learn the network parameters. The system has 17 thousand parameters and requires 500 million floating-point operations per second and 40 kilobytes of memory. It also satisfies hands-free communication timing requirements on a standard neural processor, which makes it suitable for embedding on hands-free communication devices. Using 280 hours of real and synthetic data, experiments show advantageous performance compared to competing methods.
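The two-part structure described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under loud assumptions, not the authors' implementation: a fixed `tanh` soft clipper (`soft_clip`, hypothetical) stands in for the learned network, the four-tap `room` response is invented, and a plain NLMS filter plays the role of the adaptive linear filter. It compares a linear-only canceller, fed the raw far-end signal, with a two-stage canceller whose linear filter is fed the output of the nonlinear model.

```python
import math
import random


def soft_clip(x, gain=2.0):
    """Hypothetical memoryless loudspeaker distortion (stand-in for the learned network)."""
    return math.tanh(gain * x) / gain


def nlms_cancel(reference, mic, taps=8, mu=0.5, eps=1e-8):
    """Adapt a linear filter with NLMS and return the residual after echo removal."""
    w = [0.0] * taps      # adaptive filter weights
    buf = [0.0] * taps    # most recent reference samples, newest first
    residual = []
    for x, d in zip(reference, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # echo estimate
        e = d - y                                    # residual signal
        norm = sum(b * b for b in buf) + eps         # input power for normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return residual


def power(signal):
    """Mean squared value of a signal."""
    return sum(s * s for s in signal) / len(signal)


def simulate(n=4000, seed=0):
    """Return residual echo power for (linear-only, two-stage) cancellation."""
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    room = [0.6, -0.3, 0.15, 0.05]                   # assumed loudspeaker-to-mic path
    played = [soft_clip(x) for x in far_end]         # distorted loudspeaker output
    mic = [
        sum(h * played[i - k] for k, h in enumerate(room) if i - k >= 0)
        for i in range(n)
    ]
    res_linear = nlms_cancel(far_end, mic)           # filter sees the raw far-end signal
    res_two_stage = nlms_cancel(played, mic)         # filter sees the nonlinear model output
    tail = n // 2                                    # skip the initial convergence period
    return power(res_linear[tail:]), power(res_two_stage[tail:])


if __name__ == "__main__":
    p_linear, p_two_stage = simulate()
    print("linear-only residual power:", p_linear)
    print("two-stage residual power:  ", p_two_stage)
```

In this toy setting the nonlinearity is known exactly, so the two-stage residual decays toward numerical precision while the linear-only residual stays bounded away from zero. In the system described above, the nonlinear stage is instead learned from data, jointly with the adaptive filter.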
Original language | English |
---|---|
Title of host publication | 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021 |
Publisher | International Speech Communication Association |
Pages | 766-770 |
Number of pages | 5 |
ISBN (Electronic) | 9781713836902 |
State | Published - 2021 |
Externally published | Yes |
Event | 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021 - Brno, Czech Republic; Duration: 30 Aug 2021 → 3 Sep 2021 |
Publication series
Name | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
---|---|
Volume | 2 |
ISSN (Print) | 2308-457X |
ISSN (Electronic) | 1990-9772 |
Conference
Conference | 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021 |
---|---|
Country/Territory | Czech Republic |
City | Brno |
Period | 30/08/21 → 3/09/21 |
Bibliographical note
Publisher Copyright: Copyright © 2021 ISCA.
Funding
This research was supported by the Pazy Research Foundation and the ISF-NSFC joint research program (grant 2514/17). The authors thank Stem Audio for providing equipment and guidance.
Funders | Funder number |
---|---|
ISF-NSFC | 2514/17 |
Pazy Research Foundation | |
Keywords
- Deep learning
- Hands-free communication
- Nonlinear acoustic echo cancellation
- On-device implementation