Acoustic Echo Cancellation by Combining Adaptive Digital Filter and
Recurrent Neural Network
- URL: http://arxiv.org/abs/2005.09237v1
- Date: Tue, 19 May 2020 06:25:52 GMT
- Title: Acoustic Echo Cancellation by Combining Adaptive Digital Filter and
Recurrent Neural Network
- Authors: Lu Ma, Hua Huang, Pei Zhao, Tengrong Su
- Abstract summary: A fusion scheme combining an adaptive filter and a neural network is proposed for Acoustic Echo Cancellation.
Adaptive filtering reduces the echo substantially, leaving only a small residual echo.
The neural network is elaborately designed and trained to suppress this residual echo.
- Score: 11.335343110341354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Acoustic Echo Cancellation (AEC) plays a key role in voice
interaction. Owing to their explicit mathematical principle and their inherent
ability to adapt to changing conditions, adaptive filters in various
implementations are widely used for AEC and give considerable performance.
However, the results still contain residual echo: a linear residue introduced
by the mismatch between the estimated and the true echo path, and a non-linear
residue caused mostly by non-linear components in the audio devices. The linear
residue can be reduced with elaborate structures and methods, but the
non-linear residue remains intractable to suppress. Although some non-linear
processing methods have been proposed, they are complicated, inefficient at
suppression, and tend to damage the speech signal. In this paper, a fusion
scheme combining an adaptive filter and a neural network is proposed for AEC.
Adaptive filtering reduces the echo substantially, leaving only a small
residual echo. Although this residue is much weaker than the speech, it can
still be perceived by the human ear and makes communication annoying. The
neural network is elaborately designed and trained to suppress this residual
echo. Experiments comparing the scheme with prevailing methods validate the
effectiveness and superiority of the proposed combination.
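The adaptive-filtering stage of such a fusion scheme is typically realized with an algorithm like NLMS. The sketch below is illustrative only: the paper does not specify its filter, so the NLMS variant, tap count, and step size here are assumptions.

```python
import numpy as np

def nlms_aec(far_end, mic, num_taps=64, mu=0.5, eps=1e-8):
    """Normalized LMS echo canceller (illustrative; not the paper's exact
    filter). Estimates the echo path from the far-end signal and subtracts
    the estimated echo from the microphone signal."""
    w = np.zeros(num_taps)                # echo-path estimate (filter taps)
    err = np.zeros(len(mic))              # error = echo-cancelled output
    for n in range(num_taps - 1, len(mic)):
        x = far_end[n - num_taps + 1:n + 1][::-1]   # recent far-end samples
        err[n] = mic[n] - w @ x                     # subtract estimated echo
        w += mu * err[n] * x / (x @ x + eps)        # normalized update
    return err

# Toy demo: the microphone picks up the far-end signal through a short echo
# path (no near-end speech), so the residual should shrink toward zero.
rng = np.random.default_rng(0)
far = rng.standard_normal(8000)
mic_sig = np.convolve(far, [0.5, 0.3, 0.1])[:8000]
residual = nlms_aec(far, mic_sig)
```

In the paper's scheme, the small residual left by such a filter is then handed to the trained network for suppression.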
Related papers
- UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit
Normalization [60.43992089087448]
Dysarthric speech reconstruction systems aim to automatically convert dysarthric speech into normal-sounding speech.
We propose a Unit-DSR system, which harnesses the powerful domain-adaptation capacity of HuBERT for training efficiency improvement.
Compared with NED approaches, the Unit-DSR system only consists of a speech unit normalizer and a Unit HiFi-GAN vocoder, which is considerably simpler without cascaded sub-modules or auxiliary tasks.
arXiv Detail & Related papers (2024-01-26T06:08:47Z)
- Diffusion Conditional Expectation Model for Efficient and Robust Target
Speech Extraction [73.43534824551236]
We propose an efficient generative approach named Diffusion Conditional Expectation Model (DCEM) for Target Speech Extraction (TSE).
It can handle multi- and single-speaker scenarios in both noisy and clean conditions.
Our method outperforms conventional methods in terms of both intrusive and non-intrusive metrics.
arXiv Detail & Related papers (2023-09-25T04:58:38Z)
- Simple Pooling Front-ends For Efficient Audio Classification [56.59107110017436]
We show that eliminating the temporal redundancy in the input audio features could be an effective approach for efficient audio classification.
We propose a family of simple pooling front-ends (SimPFs) which use simple non-parametric pooling operations to reduce the redundant information.
SimPFs can reduce the number of floating-point operations of off-the-shelf audio neural networks by more than half.
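The pooling idea can be sketched in a few lines; this is a hedged illustration of one SimPF variant, where mean pooling and the `stride` value are assumptions:

```python
import numpy as np

def temporal_avg_pool(spec, stride=2):
    """Non-parametric temporal pooling front-end in the spirit of SimPFs:
    average adjacent frames of a (freq, time) spectrogram, shrinking the
    time axis by `stride` and hence the cost of every downstream layer."""
    f, t = spec.shape
    t_trim = t - (t % stride)       # drop frames that don't fill a window
    return spec[:, :t_trim].reshape(f, t_trim // stride, stride).mean(axis=2)

spec = np.random.default_rng(1).random((64, 100))   # toy mel-spectrogram
pooled = temporal_avg_pool(spec, stride=2)          # shape (64, 50)
```

Halving the frame count roughly halves the floating-point operations of any network consuming the pooled features, consistent with the reduction the paper reports.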
arXiv Detail & Related papers (2022-10-03T14:00:41Z)
- Neural Network-augmented Kalman Filtering for Robust Online Speech
Dereverberation in Noisy Reverberant Environments [13.49645012479288]
A neural network-augmented algorithm for noise-robust online dereverberation is proposed.
The presented framework allows for robust dereverberation on a single-channel noisy reverberant dataset.
arXiv Detail & Related papers (2022-04-06T11:38:04Z)
- A Passive Similarity based CNN Filter Pruning for Efficient Acoustic
Scene Classification [23.661189257759535]
We present a method to develop low-complexity convolutional neural networks (CNNs) for acoustic scene classification (ASC)
We propose a passive filter pruning framework, where a few convolutional filters from the CNNs are eliminated to yield compressed CNNs.
The proposed method is simple: it reduces computation per inference by 27% with 25% fewer parameters, at less than a 1% drop in accuracy.
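A passive (data-free) similarity criterion of this kind can be sketched as follows; the cosine-similarity measure and the 0.95 threshold are illustrative assumptions, not the paper's exact rule:

```python
import numpy as np

def redundant_filter_indices(filters, threshold=0.95):
    """Mark convolutional filters that nearly duplicate an earlier filter
    (pairwise cosine similarity above `threshold`) as prunable. Purely
    passive: no training data or fine-tuning is involved."""
    flat = filters.reshape(filters.shape[0], -1)
    unit = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    sim = unit @ unit.T
    prune = set()
    for i in range(len(filters)):
        if i in prune:
            continue
        for j in range(i + 1, len(filters)):
            if j not in prune and sim[i, j] > threshold:
                prune.add(j)        # keep filter i, drop near-duplicate j
    return sorted(prune)

# Toy 3x3 filters: filter 2 is a scaled copy of filter 0, so it is redundant.
base = np.arange(9, dtype=float).reshape(3, 3)
w = np.stack([base, -base + 5.0, 1.01 * base, np.eye(3)])
marked = redundant_filter_indices(w)   # → [2]
```

Eliminating the marked filters (and the corresponding channels in the next layer) yields the compressed CNN.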
arXiv Detail & Related papers (2022-03-29T17:00:06Z)
- Exploiting Cross Domain Acoustic-to-articulatory Inverted Features For
Disordered Speech Recognition [57.15942628305797]
Articulatory features are invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition systems for normal speech.
This paper presents a cross-domain acoustic-to-articulatory (A2A) inversion approach that utilizes the parallel acoustic-articulatory data of the 15-hour TORGO corpus in model training.
The model is then cross-domain adapted to the 102.7-hour UASpeech corpus to produce articulatory features.
arXiv Detail & Related papers (2022-03-19T08:47:18Z)
- Discretization and Re-synthesis: an alternative method to solve the
Cocktail Party Problem [65.25725367771075]
This study demonstrates, for the first time, that the synthesis-based approach can also perform well on this problem.
Specifically, we propose a novel speech separation/enhancement model based on the recognition of discrete symbols.
By feeding the predicted discrete symbol sequence to the synthesis model, each target speech signal can be re-synthesized.
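The discretization step can be illustrated with a nearest-codebook lookup. This is a hedged sketch: the codebook here is hand-made, whereas the paper learns its discrete symbols.

```python
import numpy as np

def discretize(frames, codebook):
    """Map each feature frame to the index of its nearest codebook entry;
    the resulting indices are the 'discrete symbols' a synthesis model
    would consume to re-synthesize each target speech signal."""
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])  # toy 3-symbol codebook
frames = np.array([[0.1, -0.1], [0.9, 1.2], [0.1, 0.8]])   # toy feature frames
symbols = discretize(frames, codebook)    # one symbol per frame: [0, 1, 2]
```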
arXiv Detail & Related papers (2021-12-17T08:35:40Z)
- End-to-End Complex-Valued Multidilated Convolutional Neural Network for
Joint Acoustic Echo Cancellation and Noise Suppression [25.04740291728234]
In this paper, we exploit the offset-compensating ability of complex time-frequency masks and propose an end-to-end complex neural network architecture.
We also propose a dual-mask technique for joint echo and noise suppression with simultaneous speech enhancement.
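The offset-compensating ability of complex masks can be seen in a toy numpy sketch: per time-frequency bin, an ideal complex mask rotates phase as well as scaling magnitude, so it can recover the target exactly. The network-predicted masks in the paper only approximate this ideal.

```python
import numpy as np

def apply_complex_mask(stft_mix, mask):
    """Apply a complex time-frequency mask to a mixture STFT. Unlike a
    real-valued magnitude mask, a complex mask adjusts phase too, which
    is what lets it compensate offsets between signals."""
    return mask * stft_mix

rng = np.random.default_rng(3)
shape = (257, 10)                                   # (freq bins, frames)
target = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
echo = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mix = target + echo
ideal_mask = target / mix                           # ideal complex ratio mask
recovered = apply_complex_mask(mix, ideal_mask)     # equals target exactly
```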
arXiv Detail & Related papers (2021-10-02T07:41:41Z)
- Residual acoustic echo suppression based on efficient multi-task
convolutional neural network [0.0]
We propose a real-time residual acoustic echo suppression (RAES) method using an efficient convolutional neural network.
The training criterion is based on a novel loss function, which we call the suppression loss, that balances suppression of the residual echo against distortion of the near-end signals.
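The balancing idea behind such a loss can be sketched as follows; this two-term form is an assumption for illustration, not the exact suppression loss defined in the paper:

```python
import numpy as np

def suppression_loss(est, near_end, echo_only_mask, alpha=0.5):
    """Illustrative two-term loss: on echo-only frames, push the output
    toward zero (suppression term); on near-end frames, push it toward
    the near-end speech (distortion term). `alpha` trades the two."""
    sup = np.mean((est * echo_only_mask) ** 2)
    dist = np.mean(((est - near_end) * (1.0 - echo_only_mask)) ** 2)
    return alpha * sup + (1.0 - alpha) * dist

n = 1000
mask = np.zeros(n)
mask[:500] = 1.0                            # first half: far-end-only frames
near = np.zeros(n)
near[500:] = np.sin(np.linspace(0.0, 20.0, 500))
ideal_out = near * (1.0 - mask)             # suppresses echo, keeps speech
loss = suppression_loss(ideal_out, near, mask)   # → 0.0
```

An output that leaks energy on echo-only frames, or distorts the near-end speech, raises the corresponding term.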
arXiv Detail & Related papers (2020-09-29T11:26:25Z)
- Nonlinear Residual Echo Suppression Based on Multi-stream Conv-TasNet [22.56178941790508]
We propose a residual echo suppression method based on the modification of fully convolutional time-domain audio separation network (Conv-TasNet)
Both the residual signal of the linear acoustic echo cancellation system and the output of the adaptive filter are adopted to form multiple streams for the Conv-TasNet.
arXiv Detail & Related papers (2020-05-15T16:41:16Z)
- Temporal-Spatial Neural Filter: Direction Informed End-to-End
Multi-channel Target Speech Separation [66.46123655365113]
Target speech separation refers to extracting the target speaker's speech from mixed signals.
Two main challenges are the complex acoustic environment and the real-time processing requirement.
We propose a temporal-spatial neural filter, which directly estimates the target speech waveform from multi-speaker mixture.
arXiv Detail & Related papers (2020-01-02T11:12:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.