Related papers: Deep Active Speech Cancellation with Mamba-Masking Network

Backscatter Device-aided Integrated Sensing and Communication: A Pareto Optimization Framework [59.30060797118097]
Integrated sensing and communication (ISAC) systems potentially encounter significant performance degradation in densely obstructed urban non-line-of-sight scenarios. This paper proposes a backscatter device (BD)-assisted ISAC system, which leverages passive BDs naturally distributed in the environment for enhancement.
arXiv Detail & Related papers (2025-07-12T17:11:06Z)
Active Speech Enhancement: Active Speech Denoising Decliping and Deveraberation [13.575063025878208]
We introduce a new paradigm for active sound modification: Active Speech Enhancement (ASE). We propose a novel Transformer-Mamba-based architecture, along with a task-specific loss function designed to jointly optimize interference suppression and signal enrichment. Our method outperforms existing baselines across multiple speech processing tasks -- including denoising, dereverberation, and declipping.
arXiv Detail & Related papers (2025-05-22T17:10:18Z)
Non-contact Vital Signs Detection in Dynamic Environments [0.61915796293339]
We propose a novel DC offset calibration method alongside a Hilbert and Differential Cross-Multiply (HADCM) demodulation algorithm. The approach estimates time-varying DC offsets from neighboring signal peaks and valleys, then employs both differential forms and Hilbert transforms of the I/Q channel signals to extract vital sign information.
arXiv Detail & Related papers (2025-05-13T09:11:48Z)
Unsupervised CP-UNet Framework for Denoising DAS Data with Decay Noise [13.466125373185399]
Distributed acoustic sensor (DAS) technology leverages optical fiber cables to detect acoustic signals. DAS exhibits a lower signal-to-noise ratio (S/N) compared to geophones. This reduced S/N can negatively impact data analyses, including inversion and interpretation.
arXiv Detail & Related papers (2025-02-19T03:09:49Z)
DenoMAE: A Multimodal Autoencoder for Denoising Modulation Signals [21.25974800554959]
DenoMAE is a novel framework for denoising modulation signals during pretraining. It incorporates multiple input modalities, including noise, to enhance cross-modal learning. It achieves state-of-the-art accuracy in automatic modulation classification tasks.
arXiv Detail & Related papers (2025-01-20T15:23:16Z)
UAV Virtual Antenna Array Deployment for Uplink Interference Mitigation in Data Collection Networks [71.23793087286703]
Unmanned aerial vehicles (UAVs) have gained considerable attention as a platform for establishing aerial wireless networks and communications. This paper explores a novel uplink interference mitigation approach based on the collaborative beamforming (CB) method in multi-UAV network systems.
arXiv Detail & Related papers (2024-12-09T12:56:50Z)
Robust Federated Learning Over the Air: Combating Heavy-Tailed Noise with Median Anchored Clipping [57.40251549664762]
We propose a novel gradient clipping method, termed Median Anchored Clipping (MAC), to combat the detrimental effects of heavy-tailed noise. We also derive analytical expressions for the convergence rate of model training with analog over-the-air federated learning under MAC.
arXiv Detail & Related papers (2024-09-23T15:11:40Z)
RIMformer: An End-to-End Transformer for FMCW Radar Interference Mitigation [1.8063750621475454]
A novel FMCW radar interference mitigation method, termed RIMformer, is proposed, using an end-to-end Transformer-based structure. The architecture is designed to process time-domain IF signals in an end-to-end manner. The results show that the proposed RIMformer can effectively mitigate interference and restore the target signals.
arXiv Detail & Related papers (2024-07-16T07:51:20Z)
DiffSED: Sound Event Detection with Denoising Diffusion [70.18051526555512]
We reformulate the SED problem by taking a generative learning perspective. Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process. During training, our model learns to reverse the noising process by converting noisy latent queries to the groundtruth versions.
arXiv Detail & Related papers (2023-08-14T17:29:41Z)
AMC-Net: An Effective Network for Automatic Modulation Classification [22.871024969842335]
We propose a novel AMC-Net that improves recognition by denoising the input signal in the frequency domain while performing multi-scale and effective feature extraction. Experiments on two representative datasets demonstrate that our model performs better in efficiency and effectiveness than the most current methods.
arXiv Detail & Related papers (2023-04-02T04:26:30Z)
Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation [23.758202121043805]
We propose a novel network to unify speech enhancement and separation with gradient modulation to improve noise-robustness. Experimental results show that our approach achieves the state-of-the-art on large-scale Libri2Mix- and Libri3Mix-noisy datasets.
arXiv Detail & Related papers (2023-02-22T03:54:50Z)
Simple Pooling Front-ends For Efficient Audio Classification [56.59107110017436]
We show that eliminating the temporal redundancy in the input audio features could be an effective approach for efficient audio classification. We propose a family of simple pooling front-ends (SimPFs) which use simple non-parametric pooling operations to reduce the redundant information. SimPFs can reduce the number of floating point operations by more than half for off-the-shelf audio neural networks.
arXiv Detail & Related papers (2022-10-03T14:00:41Z)
FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement [43.477179521051355]
We propose an extended single-channel real-time speech enhancement framework called FullSubNet+. Experimental results on the DNS Challenge dataset show the superior performance of FullSubNet+.
arXiv Detail & Related papers (2022-03-23T04:33:09Z)
Time-domain Speech Enhancement with Generative Adversarial Learning [53.74228907273269]
This paper proposes a new framework called Time-domain Speech Enhancement Generative Adversarial Network (TSEGAN). TSEGAN is an extension of the generative adversarial network (GAN) in time-domain with metric evaluation to mitigate the scaling problem. In addition, we provide a new method based on objective function mapping for the theoretical analysis of the performance of Metric GAN.
arXiv Detail & Related papers (2021-03-30T08:09:49Z)
Digital Beamforming Robust to Time-Varying Carrier Frequency Offset [21.18926642388997]
We present novel beamforming algorithms that are robust to signal corruptions arising from a time-variant carrier frequency offset. We propose two atomic-norm-minimization (ANM)-based methods to design a weight vector that can be used to cancel interference when there exists unknown time-varying frequency drift in the pilot and interferer signals.
arXiv Detail & Related papers (2021-03-08T18:08:56Z)
CITISEN: A Deep Learning-Based Speech Signal-Processing Mobile Application [63.2243126704342]
This study presents a deep learning-based speech signal-processing mobile application known as CITISEN. CITISEN provides three functions: speech enhancement (SE), model adaptation (MA), and background noise conversion (BNC). Compared with the noisy speech signals, the enhanced speech signals achieved improvements of about 6% and 33%.
arXiv Detail & Related papers (2020-08-21T02:04:12Z)
Improving Stability of LS-GANs for Audio and Speech Signals [70.15099665710336]
We show that encoding departure from normality computed in this vector space into the generator optimization formulation helps to craft more comprehensive spectrograms. We demonstrate the effectiveness of binding this metric for enhancing stability in training with less mode collapse compared to baseline GANs.
arXiv Detail & Related papers (2020-08-12T17:41:25Z)
Simultaneous Denoising and Dereverberation Using Deep Embedding Features [64.58693911070228]
We propose a joint training method for simultaneous speech denoising and dereverberation using deep embedding features. At the denoising stage, the DC network is leveraged to extract noise-free deep embedding features. At the dereverberation stage, instead of using the unsupervised K-means clustering algorithm, another neural network is utilized to estimate the anechoic speech.
arXiv Detail & Related papers (2020-04-06T06:34:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.