Related papers: ANROT-HELANet: Adverserially and Naturally Robust Attention-Based Aggregation Network via The Hellinger Distance for Few-Shot Classification

ANROT-HELANet: Adverserially and Naturally Robust Attention-Based Aggregation Network via The Hellinger Distance for Few-Shot Classification

URL: http://arxiv.org/abs/2509.11220v1
Date: Sun, 14 Sep 2025 11:44:43 GMT
Title: ANROT-HELANet: Adverserially and Naturally Robust Attention-Based Aggregation Network via The Hellinger Distance for Few-Shot Classification
Authors: Gao Yu Lee, Tanmoy Dam, Md Meftahul Ferdaus, Daniel Puiu Poenar, Vu N. Duong,
Abstract summary: We introduce ANROT-HELANet, an Adversarially and Naturally RObusT Hellinger Aggregation Network.<n>Our approach implements an adversarially and naturally robust Hellinger distance-based feature class aggregation scheme.<n>Our approach achieves superior image reconstruction quality with a FID score of 2.75, outperforming traditional VAE (3.43) and WAE (3.38) approaches.
Score: 4.283774189998499
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Few-Shot Learning (FSL), which involves learning to generalize using only a few data samples, has demonstrated promising and superior performances to ordinary CNN methods. While Bayesian based estimation approaches using Kullback-Leibler (KL) divergence have shown improvements, they remain vulnerable to adversarial attacks and natural noises. We introduce ANROT-HELANet, an Adversarially and Naturally RObusT Hellinger Aggregation Network that significantly advances the state-of-the-art in FSL robustness and performance. Our approach implements an adversarially and naturally robust Hellinger distance-based feature class aggregation scheme, demonstrating resilience to adversarial perturbations up to $\epsilon=0.30$ and Gaussian noise up to $\sigma=0.30$. The network achieves substantial improvements across benchmark datasets, including gains of 1.20\% and 1.40\% for 1-shot and 5-shot scenarios on miniImageNet respectively. We introduce a novel Hellinger Similarity contrastive loss function that generalizes cosine similarity contrastive loss for variational few-shot inference scenarios. Our approach also achieves superior image reconstruction quality with a FID score of 2.75, outperforming traditional VAE (3.43) and WAE (3.38) approaches. Extensive experiments conducted on four few-shot benchmarked datasets verify that ANROT-HELANet's combination of Hellinger distance-based feature aggregation, attention mechanisms, and our novel loss function establishes new state-of-the-art performance while maintaining robustness against both adversarial and natural perturbations. Our code repository will be available at https://github.com/GreedYLearner1146/ANROT-HELANet/tree/main.

Related papers

Robust Representation Consistency Model via Contrastive Denoising [83.47584074390842]
randomized smoothing provides theoretical guarantees for certifying robustness against adversarial perturbations.<n> diffusion models have been successfully employed for randomized smoothing to purify noise-perturbed samples.<n>We reformulate the generative modeling task along the diffusion trajectories in pixel space as a discriminative task in the latent space.
arXiv Detail & Related papers (2025-01-22T18:52:06Z)
A Unified Contrastive Loss for Self-Training [3.3454373538792552]
Self-training methods have proven to be effective in exploiting abundant unlabeled data in semi-supervised learning. We propose a general framework to enhance self-training methods, which replaces all instances of CE losses with a unique contrastive loss. Our framework results in significant performance improvements across three different datasets with a limited number of labeled data.
arXiv Detail & Related papers (2024-09-11T14:22:41Z)
MeanSparse: Post-Training Robustness Enhancement Through Mean-Centered Feature Sparsification [32.70084821901212]
MeanSparse is a method to improve the robustness of Convolutional and attention-based Neural Networks against adversarial examples. Our experiments show that MeanSparse achieves a new robustness record of 75.28%.
arXiv Detail & Related papers (2024-06-09T22:14:55Z)
Enhancing Robust Representation in Adversarial Training: Alignment and Exclusion Criteria [61.048842737581865]
We show that Adversarial Training (AT) omits to learning robust features, resulting in poor performance of adversarial robustness. We propose a generic framework of AT to gain robust representation, by the asymmetric negative contrast and reverse attention. Empirical evaluations on three benchmark datasets show our methods greatly advance the robustness of AT and achieve state-of-the-art performance.
arXiv Detail & Related papers (2023-10-05T07:29:29Z)
When hard negative sampling meets supervised contrastive learning [17.173114048398947]
We introduce a new supervised contrastive learning objective, SCHaNe, which incorporates hard negative sampling during the fine-tuning phase. SchaNe outperforms the strong baseline BEiT-3 in Top-1 accuracy across various benchmarks. Our proposed objective sets a new state-of-the-art for base models on ImageNet-1k, achieving an 86.14% accuracy.
arXiv Detail & Related papers (2023-08-28T20:30:10Z)
Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimize for the L2 metric without the need of generating pairs. We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks [82.21746840893658]
This paper investigates the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network. We show that while the ResNet-18 model trained on DWT spectrograms achieves a high recognition accuracy, attacking this model is relatively more costly for the adversary.
arXiv Detail & Related papers (2022-04-14T15:14:08Z)
MIO : Mutual Information Optimization using Self-Supervised Binary Contrastive Learning [12.365801596593936]
We model our pre-training task as a binary classification problem to induce an implicit contrastive effect.<n>Unlike existing methods, the proposed loss function optimize the mutual information in positive and negative pairs.<n>The proposed method outperforms SOTA self-supervised contrastive frameworks on benchmark datasets.
arXiv Detail & Related papers (2021-11-24T17:51:29Z)
Learning Expectation of Label Distribution for Facial Age and Attractiveness Estimation [65.5880700862751]
We analyze the essential relationship between two state-of-the-art methods (Ranking-CNN and DLDL) and show that the Ranking method is in fact learning label distribution implicitly. We propose a lightweight network architecture and propose a unified framework which can jointly learn facial attribute distribution and regress attribute value. Our method achieves new state-of-the-art results using the single model with 36$times$ fewer parameters and 3$times$ faster inference speed on facial age/attractiveness estimation.
arXiv Detail & Related papers (2020-07-03T15:46:53Z)
Supervised Contrastive Learning [42.27949000093086]
We extend the self-supervised batch contrastive approach to the fully-supervised setting. We analyze two possible versions of the supervised contrastive (SupCon) loss, identifying the best-performing formulation of the loss. On ResNet-200, we achieve top-1 accuracy of 81.4% on the ImageNet dataset, which is 0.8% above the best number reported for this architecture.
arXiv Detail & Related papers (2020-04-23T17:58:56Z)
Transferable, Controllable, and Inconspicuous Adversarial Attacks on Person Re-identification With Deep Mis-Ranking [83.48804199140758]
We propose a learning-to-mis-rank formulation to perturb the ranking of the system output. We also perform a back-box attack by developing a novel multi-stage network architecture. Our method can control the number of malicious pixels by using differentiable multi-shot sampling.
arXiv Detail & Related papers (2020-04-08T18:48:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.