Acoustic Echo Cancellation by Combining Adaptive Digital Filter and
Recurrent Neural Network
- URL: http://arxiv.org/abs/2005.09237v1
- Date: Tue, 19 May 2020 06:25:52 GMT
- Title: Acoustic Echo Cancellation by Combining Adaptive Digital Filter and
Recurrent Neural Network
- Authors: Lu Ma, Hua Huang, Pei Zhao, Tengrong Su
- Abstract summary: A fusion scheme combining an adaptive filter and a neural network is proposed for Acoustic Echo Cancellation.
Adaptive filtering reduces the echo substantially, leaving only a small residual echo.
The neural network is elaborately designed and trained to suppress this residual echo.
- Score: 11.335343110341354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Acoustic Echo Cancellation (AEC) plays a key role in voice
interaction. Owing to their explicit mathematical principle and their inherent
ability to adapt to changing conditions, adaptive filters in various
implementations are widely used for AEC and give considerable performance.
However, the results still contain residual echo: a linear residue introduced
by the mismatch between the estimated and the true echo path, and a non-linear
residue caused mostly by non-linear components in the audio devices. The linear
residue can be reduced with elaborate structures and methods, but the
non-linear residue remains intractable to suppress. Although some non-linear
processing methods have been proposed, they are complicated, inefficient at
suppression, and tend to damage the speech signal. In this paper, a fusion
scheme combining an adaptive filter and a neural network is proposed for AEC.
Adaptive filtering reduces the echo substantially, leaving only a small
residual echo. Although this residue is much weaker than the speech, it can
still be perceived by the human ear and makes communication annoying. The
neural network is elaborately designed and trained to suppress this residual
echo. Experiments comparing the scheme with prevailing methods validate the
effectiveness and superiority of the proposed combination.
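The adaptive-filtering stage of such a fusion scheme is typically realized with an algorithm like NLMS. The sketch below is illustrative only: the paper does not specify its filter, so the NLMS variant, tap count, and step size here are assumptions.

```python
import numpy as np

def nlms_aec(far_end, mic, num_taps=64, mu=0.5, eps=1e-8):
    """Normalized LMS echo canceller (illustrative; not the paper's exact
    filter). Estimates the echo path from the far-end signal and subtracts
    the estimated echo from the microphone signal."""
    w = np.zeros(num_taps)                # echo-path estimate (filter taps)
    err = np.zeros(len(mic))              # error = echo-cancelled output
    for n in range(num_taps - 1, len(mic)):
        x = far_end[n - num_taps + 1:n + 1][::-1]   # recent far-end samples
        err[n] = mic[n] - w @ x                     # subtract estimated echo
        w += mu * err[n] * x / (x @ x + eps)        # normalized update
    return err

# Toy demo: the microphone picks up the far-end signal through a short echo
# path (no near-end speech), so the residual should shrink toward zero.
rng = np.random.default_rng(0)
far = rng.standard_normal(8000)
mic_sig = np.convolve(far, [0.5, 0.3, 0.1])[:8000]
residual = nlms_aec(far, mic_sig)
```

In the paper's scheme, the small residual left by such a filter is then handed to the trained network for suppression.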
Related papers
- UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit
Normalization [60.43992089087448]
Dysarthric speech reconstruction systems aim to automatically convert dysarthric speech into normal-sounding speech.
We propose a Unit-DSR system, which harnesses the powerful domain-adaptation capacity of HuBERT for training efficiency improvement.
Compared with NED approaches, the Unit-DSR system only consists of a speech unit normalizer and a Unit HiFi-GAN vocoder, which is considerably simpler without cascaded sub-modules or auxiliary tasks.
arXiv Detail & Related papers (2024-01-26T06:08:47Z)
- Diffusion Conditional Expectation Model for Efficient and Robust Target
Speech Extraction [73.43534824551236]
We propose an efficient generative approach named Diffusion Conditional Expectation Model (DCEM) for Target Speech Extraction (TSE).
It can handle multi- and single-speaker scenarios in both noisy and clean conditions.
Our method outperforms conventional methods in terms of both intrusive and non-intrusive metrics.
arXiv Detail & Related papers (2023-09-25T04:58:38Z)
- Simple Pooling Front-ends For Efficient Audio Classification [56.59107110017436]
We show that eliminating the temporal redundancy in the input audio features could be an effective approach for efficient audio classification.
We propose a family of simple pooling front-ends (SimPFs) which use simple non-parametric pooling operations to reduce the redundant information.
SimPFs can reduce the number of floating-point operations of off-the-shelf audio neural networks by more than half.
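The pooling idea can be sketched in a few lines; this is a hedged illustration of one SimPF variant, where mean pooling and the `stride` value are assumptions:

```python
import numpy as np

def temporal_avg_pool(spec, stride=2):
    """Non-parametric temporal pooling front-end in the spirit of SimPFs:
    average adjacent frames of a (freq, time) spectrogram, shrinking the
    time axis by `stride` and hence the cost of every downstream layer."""
    f, t = spec.shape
    t_trim = t - (t % stride)       # drop frames that don't fill a window
    return spec[:, :t_trim].reshape(f, t_trim // stride, stride).mean(axis=2)

spec = np.random.default_rng(1).random((64, 100))   # toy mel-spectrogram
pooled = temporal_avg_pool(spec, stride=2)          # shape (64, 50)
```

Halving the frame count roughly halves the floating-point operations of any network consuming the pooled features, consistent with the reduction the paper reports.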
arXiv Detail & Related papers (2022-10-03T14:00:41Z)
- Neural Network-augmented Kalman Filtering for Robust Online Speech
Dereverberation in Noisy Reverberant Environments [13.49645012479288]
A neural network-augmented algorithm for noise-robust online dereverberation is proposed.
The presented framework allows for robust dereverberation on a single-channel noisy reverberant dataset.
arXiv Detail & Related papers (2022-04-06T11:38:04Z)
- A Passive Similarity based CNN Filter Pruning for Efficient Acoustic
Scene Classification [23.661189257759535]
We present a method to develop low-complexity convolutional neural networks (CNNs) for acoustic scene classification (ASC)
We propose a passive filter pruning framework, where a few convolutional filters from the CNNs are eliminated to yield compressed CNNs.
The proposed method is simple: it reduces computation per inference by 27% with 25% fewer parameters, at less than a 1% drop in accuracy.
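A passive (data-free) similarity criterion of this kind can be sketched as follows; the cosine-similarity measure and the 0.95 threshold are illustrative assumptions, not the paper's exact rule:

```python
import numpy as np

def redundant_filter_indices(filters, threshold=0.95):
    """Mark convolutional filters that nearly duplicate an earlier filter
    (pairwise cosine similarity above `threshold`) as prunable. Purely
    passive: no training data or fine-tuning is involved."""
    flat = filters.reshape(filters.shape[0], -1)
    unit = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    sim = unit @ unit.T
    prune = set()
    for i in range(len(filters)):
        if i in prune:
            continue
        for j in range(i + 1, len(filters)):
            if j not in prune and sim[i, j] > threshold:
                prune.add(j)        # keep filter i, drop near-duplicate j
    return sorted(prune)

# Toy 3x3 filters: filter 2 is a scaled copy of filter 0, so it is redundant.
base = np.arange(9, dtype=float).reshape(3, 3)
w = np.stack([base, -base + 5.0, 1.01 * base, np.eye(3)])
marked = redundant_filter_indices(w)   # → [2]
```

Eliminating the marked filters (and the corresponding channels in the next layer) yields the compressed CNN.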
arXiv Detail & Related papers (2022-03-29T17:00:06Z)
- Exploiting Cross Domain Acoustic-to-articulatory Inverted Features For
Disordered Speech Recognition [57.15942628305797]
Articulatory features are invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition systems for normal speech.
This paper presents a cross-domain acoustic-to-articulatory (A2A) inversion approach that utilizes the parallel acoustic-articulatory data of the 15-hour TORGO corpus in model training.
The model is then cross-domain adapted to the 102.7-hour UASpeech corpus to produce articulatory features.
arXiv Detail & Related papers (2022-03-19T08:47:18Z)
- Discretization and Re-synthesis: an alternative method to solve the
Cocktail Party Problem [65.25725367771075]
This study demonstrates, for the first time, that the synthesis-based approach can also perform well on this problem.
Specifically, we propose a novel speech separation/enhancement model based on the recognition of discrete symbols.
By feeding the predicted discrete symbol sequence to the synthesis model, each target speech signal can be re-synthesized.
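The discretization step can be illustrated with a nearest-codebook lookup. This is a hedged sketch: the codebook here is hand-made, whereas the paper learns its discrete symbols.

```python
import numpy as np

def discretize(frames, codebook):
    """Map each feature frame to the index of its nearest codebook entry;
    the resulting indices are the 'discrete symbols' a synthesis model
    would consume to re-synthesize each target speech signal."""
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])  # toy 3-symbol codebook
frames = np.array([[0.1, -0.1], [0.9, 1.2], [0.1, 0.8]])   # toy feature frames
symbols = discretize(frames, codebook)    # one symbol per frame: [0, 1, 2]
```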
arXiv Detail & Related papers (2021-12-17T08:35:40Z)
- End-to-End Complex-Valued Multidilated Convolutional Neural Network for
Joint Acoustic Echo Cancellation and Noise Suppression [25.04740291728234]
In this paper, we exploit the offset-compensating ability of complex time-frequency masks and propose an end-to-end complex neural network architecture.
We also propose a dual-mask technique for joint echo and noise suppression with simultaneous speech enhancement.
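The offset-compensating ability of complex masks can be seen in a toy numpy sketch: per time-frequency bin, an ideal complex mask rotates phase as well as scaling magnitude, so it can recover the target exactly. The network-predicted masks in the paper only approximate this ideal.

```python
import numpy as np

def apply_complex_mask(stft_mix, mask):
    """Apply a complex time-frequency mask to a mixture STFT. Unlike a
    real-valued magnitude mask, a complex mask adjusts phase too, which
    is what lets it compensate offsets between signals."""
    return mask * stft_mix

rng = np.random.default_rng(3)
shape = (257, 10)                                   # (freq bins, frames)
target = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
echo = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mix = target + echo
ideal_mask = target / mix                           # ideal complex ratio mask
recovered = apply_complex_mask(mix, ideal_mask)     # equals target exactly
```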
arXiv Detail & Related papers (2021-10-02T07:41:41Z)
- Residual acoustic echo suppression based on efficient multi-task
convolutional neural network [0.0]
We propose a real-time residual acoustic echo suppression (RAES) method using an efficient convolutional neural network.
The training criterion is based on a novel loss function, which we call the suppression loss, that balances suppression of the residual echo against distortion of the near-end signals.
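The balancing idea behind such a loss can be sketched as follows; this two-term form is an assumption for illustration, not the exact suppression loss defined in the paper:

```python
import numpy as np

def suppression_loss(est, near_end, echo_only_mask, alpha=0.5):
    """Illustrative two-term loss: on echo-only frames, push the output
    toward zero (suppression term); on near-end frames, push it toward
    the near-end speech (distortion term). `alpha` trades the two."""
    sup = np.mean((est * echo_only_mask) ** 2)
    dist = np.mean(((est - near_end) * (1.0 - echo_only_mask)) ** 2)
    return alpha * sup + (1.0 - alpha) * dist

n = 1000
mask = np.zeros(n)
mask[:500] = 1.0                            # first half: far-end-only frames
near = np.zeros(n)
near[500:] = np.sin(np.linspace(0.0, 20.0, 500))
ideal_out = near * (1.0 - mask)             # suppresses echo, keeps speech
loss = suppression_loss(ideal_out, near, mask)   # → 0.0
```

An output that leaks energy on echo-only frames, or distorts the near-end speech, raises the corresponding term.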
arXiv Detail & Related papers (2020-09-29T11:26:25Z)
- Nonlinear Residual Echo Suppression Based on Multi-stream Conv-TasNet [22.56178941790508]
We propose a residual echo suppression method based on the modification of fully convolutional time-domain audio separation network (Conv-TasNet)
Both the residual signal of the linear acoustic echo cancellation system and the output of the adaptive filter are adopted to form multiple streams for the Conv-TasNet.
arXiv Detail & Related papers (2020-05-15T16:41:16Z)
- Temporal-Spatial Neural Filter: Direction Informed End-to-End
Multi-channel Target Speech Separation [66.46123655365113]
Target speech separation refers to extracting the target speaker's speech from mixed signals.
Two main challenges are the complex acoustic environment and the real-time processing requirement.
We propose a temporal-spatial neural filter, which directly estimates the target speech waveform from multi-speaker mixture.
arXiv Detail & Related papers (2020-01-02T11:12:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.