DenoMAE2.0: Improving Denoising Masked Autoencoders by Classifying Local Patches
- URL: http://arxiv.org/abs/2502.18202v1
- Date: Tue, 25 Feb 2025 13:41:56 GMT
- Title: DenoMAE2.0: Improving Denoising Masked Autoencoders by Classifying Local Patches
- Authors: Atik Faysal, Mohammad Rostami, Taha Boushine, Reihaneh Gh. Roshan, Huaxia Wang, Nikhil Muralidhar,
- Abstract summary: We introduce DenoMAE2.0, an enhanced denoising masked autoencoder that integrates a local patch classification objective alongside traditional reconstruction loss to improve representation learning and robustness.<n>This dual-objective approach is particularly beneficial in semi-supervised learning for wireless communication, where high noise levels and data scarcity pose significant challenges.
- Score: 18.14388359413206
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce DenoMAE2.0, an enhanced denoising masked autoencoder that integrates a local patch classification objective alongside traditional reconstruction loss to improve representation learning and robustness. Unlike conventional Masked Autoencoders (MAE), which focus solely on reconstructing missing inputs, DenoMAE2.0 introduces position-aware classification of unmasked patches, enabling the model to capture fine-grained local features while maintaining global coherence. This dual-objective approach is particularly beneficial in semi-supervised learning for wireless communication, where high noise levels and data scarcity pose significant challenges. We conduct extensive experiments on modulation signal classification across a wide range of signal-to-noise ratios (SNRs), from extremely low to moderately high conditions and in a low data regime. Our results demonstrate that DenoMAE2.0 surpasses its predecessor, Deno-MAE, and other baselines in both denoising quality and downstream classification accuracy. DenoMAE2.0 achieves a 1.1% improvement over DenoMAE on our dataset and 11.83%, 16.55% significant improved accuracy gains on the RadioML benchmark, over DenoMAE, for constellation diagram classification of modulation signals.
Related papers
- $C^2$AV-TSE: Context and Confidence-aware Audio Visual Target Speaker Extraction [80.57232374640911]
We propose a model-agnostic strategy called the Mask-And-Recover (MAR)
MAR integrates both inter- and intra-modality contextual correlations to enable global inference within extraction modules.
To better target challenging parts within each sample, we introduce a Fine-grained Confidence Score (FCS) model.
arXiv Detail & Related papers (2025-04-01T13:01:30Z) - DenoMAE: A Multimodal Autoencoder for Denoising Modulation Signals [21.25974800554959]
DenoMAE is a novel framework for denoising modulation signals during pretraining.<n>It incorporates multiple input modalities, including noise, to enhance cross-modal learning.<n>It achieves state-of-the-art accuracy in automatic modulation classification tasks.
arXiv Detail & Related papers (2025-01-20T15:23:16Z) - FM2S: Towards Spatially-Correlated Noise Modeling in Zero-Shot Fluorescence Microscopy Image Denoising [33.383511185170214]
Fluorescence Micrograph to Self (FM2S) is a zero-shot denoiser that achieves efficient Fluorescence Micrograph to Self (FM2S) denoising through three key innovations.
Experiments across FMI datasets demonstrate FM2S's superiority: It outperforms CVF-SID by 1.4dB PSNR on average while requiring 0.1% parameters of AP-BSN.
arXiv Detail & Related papers (2024-12-13T10:45:25Z) - DenoMamba: A fused state-space model for low-dose CT denoising [6.468495781611433]
Low-dose computed tomography (LDCT) lower potential risks linked to radiation exposure.
LDCT denoising is based on neural network models that learn data-driven image priors to separate noise evoked by dose reduction from underlying tissue signals.
DenoMamba is a novel denoising method based on state-space modeling (SSM) that efficiently captures short- and long-range context in medical images.
arXiv Detail & Related papers (2024-09-19T21:32:07Z) - Density Adaptive Attention-based Speech Network: Enhancing Feature Understanding for Mental Health Disorders [0.8437187555622164]
We introduce DAAMAudioCNNLSTM and DAAMAudioTransformer, two parameter efficient and explainable models for audio feature extraction and depression detection.
Both models' significant explainability and efficiency in leveraging speech signals for depression detection represent a leap towards more reliable, clinically useful diagnostic tools.
arXiv Detail & Related papers (2024-08-31T08:50:28Z) - Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z) - UDHF2-Net: Uncertainty-diffusion-model-based High-Frequency TransFormer Network for Remotely Sensed Imagery Interpretation [12.24506241611653]
Uncertainty-diffusion-model-based high-Frequency TransFormer network (UDHF2-Net) is the first to be proposed.
UDHF2-Net is a spatially-stationary-and-non-stationary high-frequency connection paradigm (SHCP)
Mask-and-geo-knowledge-based uncertainty diffusion module (MUDM) is a self-supervised learning strategy.
A frequency-wise semi-pseudo-Siamese UDHF2-Net is the first to be proposed to balance accuracy and complexity for change detection.
arXiv Detail & Related papers (2024-06-23T15:03:35Z) - Weakly supervised segmentation of intracranial aneurysms using a novel 3D focal modulation UNet [0.5106162890866905]
We propose FocalSegNet, a novel 3D focal modulation UNet, to detect an aneurysm and offer an initial, coarse segmentation of it from time-of-flight MRA image patches.
We trained and evaluated our model on a public dataset, and in terms of UIA detection, our model showed a low false-positive rate of 0.21 and a high sensitivity of 0.80.
arXiv Detail & Related papers (2023-08-06T03:28:08Z) - CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement [5.766499647507758]
We further develop the conformer-based metric generative adversarial network (CMGAN) model for speech enhancement (SE) in the time-frequency (TF) domain.
Our findings show that CMGAN outperforms existing state-of-the-art methods in the three major speech enhancement tasks: denoising, dereverberation, and super-resolution.
arXiv Detail & Related papers (2022-09-22T15:50:21Z) - Guided Diffusion Model for Adversarial Purification [103.4596751105955]
Adversarial attacks disturb deep neural networks (DNNs) in various algorithms and frameworks.
We propose a novel purification approach, referred to as guided diffusion model for purification (GDMP)
On our comprehensive experiments across various datasets, the proposed GDMP is shown to reduce the perturbations raised by adversarial attacks to a shallow range.
arXiv Detail & Related papers (2022-05-30T10:11:15Z) - Training Classifiers that are Universally Robust to All Label Noise
Levels [91.13870793906968]
Deep neural networks are prone to overfitting in the presence of label noise.
We propose a distillation-based framework that incorporates a new subcategory of Positive-Unlabeled learning.
Our framework generally outperforms at medium to high noise levels.
arXiv Detail & Related papers (2021-05-27T13:49:31Z) - A Self-Refinement Strategy for Noise Reduction in Grammatical Error
Correction [54.569707226277735]
Existing approaches for grammatical error correction (GEC) rely on supervised learning with manually created GEC datasets.
There is a non-negligible amount of "noise" where errors were inappropriately edited or left uncorrected.
We propose a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models.
arXiv Detail & Related papers (2020-10-07T04:45:09Z) - Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant
Disease Diagnosis [64.82680813427054]
Plant diseases serve as one of main threats to food security and crop production.
One popular approach is to transform this problem as a leaf image classification task, which can be addressed by the powerful convolutional neural networks (CNNs)
We propose a novel framework that incorporates rectified meta-learning module into common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.