Adversarial Attention Perturbations for Large Object Detection Transformers
- URL: http://arxiv.org/abs/2508.02987v1
- Date: Tue, 05 Aug 2025 01:31:10 GMT
- Title: Adversarial Attention Perturbations for Large Object Detection Transformers
- Authors: Zachary Yahn, Selim Furkan Tekin, Fatih Ilhan, Sihao Hu, Tiansheng Huang, Yichang Xu, Margaret Loper, Ling Liu
- Abstract summary: Adversarial perturbations are useful tools for exposing vulnerabilities in neural networks. This paper presents an Attention-Focused Offensive Gradient (AFOG) attack against object detection transformers.
- Score: 6.845910470068847
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Adversarial perturbations are useful tools for exposing vulnerabilities in neural networks. Existing adversarial perturbation methods for object detection are either limited to attacking CNN-based detectors or weak against transformer-based detectors. This paper presents an Attention-Focused Offensive Gradient (AFOG) attack against object detection transformers. By design, AFOG is neural-architecture agnostic and effective for attacking both large transformer-based object detectors and conventional CNN-based detectors with a unified adversarial attention framework. This paper makes three original contributions. First, AFOG utilizes a learnable attention mechanism that focuses perturbations on vulnerable image regions in multi-box detection tasks, increasing performance over non-attention baselines by up to 30.6%. Second, AFOG's attack loss is formulated by integrating two types of feature loss through learnable attention updates with iterative injection of adversarial perturbations. Finally, AFOG is an efficient and stealthy adversarial perturbation method. It probes the weak spots of detection transformers by adding strategically generated and visually imperceptible perturbations which can cause well-trained object detection models to fail. Extensive experiments conducted with twelve large detection transformers on COCO demonstrate the efficacy of AFOG. Our empirical results also show that AFOG outperforms existing attacks on transformer-based and CNN-based object detectors by up to 83% with superior speed and imperceptibility. Code is available at https://github.com/zacharyyahn/AFOG.
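The abstract sketches the mechanism: a learnable attention map concentrates the perturbation budget on vulnerable image regions, and adversarial noise is injected iteratively under a feature-based attack loss. The authors' exact formulation is in the linked repository; the following is only a minimal PyTorch-style sketch of that general idea, assuming a `detector(image, targets)` callable that returns a scalar differentiable detection loss. The attention parameterization and the single-loss objective here are illustrative stand-ins, not AFOG's actual components.

```python
import torch

def attention_focused_attack(detector, image, targets,
                             eps=8 / 255, steps=10, lr=0.01):
    """Illustrative sketch: jointly learn a perturbation and a spatial
    attention map that decides where the perturbation budget goes.
    `detector(image, targets)` is assumed to return a scalar,
    differentiable detection loss (an assumption, not AFOG's API)."""
    delta = torch.zeros_like(image, requires_grad=True)              # perturbation
    attn_logits = torch.zeros(1, 1, *image.shape[-2:], requires_grad=True)

    opt = torch.optim.Adam([delta, attn_logits], lr=lr)
    for _ in range(steps):
        attn = torch.sigmoid(attn_logits)                            # soft map in [0, 1]
        adv = (image + attn * delta.clamp(-eps, eps)).clamp(0, 1)
        loss = detector(adv, targets)                                # loss to *maximize*
        opt.zero_grad()
        (-loss).backward()                                           # gradient ascent
        opt.step()

    with torch.no_grad():
        attn = torch.sigmoid(attn_logits)
        return (image + attn * delta.clamp(-eps, eps)).clamp(0, 1)
```

The design point mirrored here is that the perturbation and the attention map are optimized jointly, so the attack learns where to spend its budget rather than spreading it uniformly over the image.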
Related papers
- Evaluating the Adversarial Robustness of Detection Transformers [4.3012765978447565]
Despite the advancements in object detection transformers (DETRs), their robustness against adversarial attacks remains underexplored. This paper presents a comprehensive evaluation of the DETR model and its variants under both white-box and black-box adversarial attacks. Our analysis reveals high intra-network transferability among DETR variants, but limited cross-network transferability to CNN-based models.
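The transferability finding suggests a simple measurement protocol: craft adversarial examples in a white-box fashion on a surrogate detector, then check how much a different victim detector degrades on them. A hedged sketch of that protocol, assuming both detectors expose a differentiable training loss and a hypothetical COCO-style `evaluate_map` helper:

```python
import torch

def fgsm_on_surrogate(surrogate, image, targets, eps=8 / 255):
    """Single-step white-box attack on the surrogate detector
    (assumes the detector returns a differentiable scalar loss)."""
    image = image.clone().requires_grad_(True)
    surrogate(image, targets).backward()
    return (image + eps * image.grad.sign()).clamp(0, 1).detach()

def transfer_gap(surrogate, victim, loader, evaluate_map, eps=8 / 255):
    """Victim mAP on clean vs. surrogate-crafted adversarial inputs.
    `evaluate_map` is a hypothetical COCO-style evaluation helper."""
    clean, adv = [], []
    for image, targets in loader:
        clean.append((image, targets))
        adv.append((fgsm_on_surrogate(surrogate, image, targets, eps), targets))
    return evaluate_map(victim, clean), evaluate_map(victim, adv)
```

A large clean-vs-adversarial gap when surrogate and victim share a DETR-style architecture, but a small one for CNN victims, would reproduce the intra- vs. cross-network pattern the paper reports.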
arXiv Detail & Related papers (2024-12-25T00:31:10Z)
- Take Fake as Real: Realistic-like Robust Black-box Adversarial Attack to Evade AIGC Detection [4.269334070603315]
We propose a realistic-like Robust Black-box Adversarial attack (R$^2$BA) with post-processing fusion optimization. We show that R$^2$BA exhibits impressive anti-detection performance, excellent invisibility, and strong robustness in both GAN-based and diffusion-based cases.
arXiv Detail & Related papers (2024-12-09T18:16:50Z)
- Twin Trigger Generative Networks for Backdoor Attacks against Object Detection [14.578800906364414]
Object detectors, which are widely used in real-world applications, are vulnerable to backdoor attacks.
Most research on backdoor attacks has focused on image classification, with limited investigation into object detection.
We propose novel twin trigger generative networks to generate invisible triggers for implanting backdoors into models during training, and visible triggers for steady activation during inference.
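The poisoning side of such an attack reduces to blending a low-amplitude trigger into a fraction of training images and re-labeling the affected boxes. The sketch below illustrates only that generic pattern, assuming torchvision-style targets (a list of dicts with a "labels" tensor); `invisible_trigger` stands in for the output of the paper's trigger generator, and the twin-network training itself is not shown:

```python
import torch

def poison_batch(images, targets, invisible_trigger, target_label,
                 alpha=0.05, poison_frac=0.1):
    """Blend a faint trigger into a fraction of the batch and relabel
    the affected boxes. `invisible_trigger` is a stand-in tensor for
    the paper's generated trigger; `alpha` keeps it imperceptible."""
    n_poison = max(1, int(poison_frac * images.size(0)))
    idx = torch.randperm(images.size(0))[:n_poison]
    images = images.clone()
    images[idx] = ((1 - alpha) * images[idx] + alpha * invisible_trigger).clamp(0, 1)
    for i in idx.tolist():
        targets[i]["labels"][:] = target_label   # backdoor: force one class
    return images, targets
```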
arXiv Detail & Related papers (2024-11-23T03:46:45Z)
- Detector Collapse: Physical-World Backdooring Object Detection to Catastrophic Overload or Blindness in Autonomous Driving [17.637155085620634]
Detector Collapse (DC) is a brand-new backdoor attack paradigm tailored for object detection.
DC is designed to instantly incapacitate detectors, i.e., to severely impair a detector's performance and culminate in a denial of service.
We introduce a novel poisoning strategy exploiting natural objects, enabling DC to act as a practical backdoor in real-world environments.
arXiv Detail & Related papers (2024-04-17T13:12:14Z)
- Threatening Patch Attacks on Object Detection in Optical Remote Sensing Images [55.09446477517365]
Advanced Patch Attacks (PAs) on object detection in natural images have revealed a serious safety vulnerability in methods based on deep neural networks.
We propose a more Threatening PA, dubbed TPA, that does not sacrifice visual quality.
To the best of our knowledge, this is the first attempt to study PAs on object detection in optical remote sensing images (O-RSIs), and we hope this work can interest readers in studying this topic.
arXiv Detail & Related papers (2023-02-13T02:35:49Z)
- Focused Decoding Enables 3D Anatomical Detection by Transformers [64.36530874341666]
We propose a novel Detection Transformer for 3D anatomical structure detection, dubbed Focused Decoder.
Focused Decoder leverages information from an anatomical region atlas to simultaneously deploy query anchors and restrict the cross-attention's field of view.
We evaluate our proposed approach on two publicly available CT datasets and demonstrate that Focused Decoder not only provides strong detection results, alleviating the need for vast amounts of annotated data, but also exhibits exceptional and highly intuitive explainability of results via its attention weights.
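Restricting a decoder query's cross-attention field of view amounts to building an attention mask over the flattened feature grid around each query anchor. A minimal sketch of that masking step (the atlas-based anchor placement, which is the paper's contribution, is assumed given):

```python
import torch

def focused_attn_mask(anchors_xy, grid_h, grid_w, radius):
    """Additive cross-attention mask: query i may only attend to feature
    cells within `radius` grid cells of its anchor; the rest get -inf.
    anchors_xy: (num_queries, 2) anchor positions in (x, y) grid coords.
    Returns a (num_queries, grid_h * grid_w) mask usable as `attn_mask`
    in e.g. torch.nn.MultiheadAttention."""
    ys, xs = torch.meshgrid(torch.arange(grid_h), torch.arange(grid_w),
                            indexing="ij")
    cells = torch.stack([xs.flatten(), ys.flatten()], dim=-1).float()  # (HW, 2)
    dist = torch.cdist(anchors_xy.float(), cells)                      # (Q, HW)
    mask = torch.zeros_like(dist)
    mask[dist > radius] = float("-inf")   # block cells outside the field of view
    return mask
```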
arXiv Detail & Related papers (2022-07-21T22:17:21Z)
- Efficient Decoder-free Object Detection with Transformers [75.00499377197475]
Vision transformers (ViTs) are changing the landscape of object detection approaches.
We propose a decoder-free fully transformer-based (DFFT) object detector.
DFFT_SMALL achieves high efficiency in both training and inference stages.
arXiv Detail & Related papers (2022-06-14T13:22:19Z)
- An Extendable, Efficient and Effective Transformer-based Object Detector [95.06044204961009]
We integrate Vision and Detection Transformers (ViDT) to construct an effective and efficient object detector.
ViDT introduces a reconfigured attention module to extend the recent Swin Transformer to be a standalone object detector.
We extend it to ViDT+ to support joint-task learning for object detection and instance segmentation.
arXiv Detail & Related papers (2022-04-17T09:27:45Z)
- DA-DETR: Domain Adaptive Detection Transformer with Information Fusion [53.25930448542148]
DA-DETR is a domain adaptive object detection transformer that introduces information fusion for effective transfer from a labeled source domain to an unlabeled target domain.
We introduce a novel CNN-Transformer Blender (CTBlender) that fuses the CNN features and Transformer features ingeniously for effective feature alignment and knowledge transfer across domains.
CTBlender employs the Transformer features to modulate the CNN features across multiple scales where the high-level semantic information and the low-level spatial information are fused for accurate object identification and localization.
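One simple way to read "modulate" is conditioning each CNN feature map on a global transformer signal via a learned channel-wise gate, applied per scale. The sketch below shows only that generic fusion pattern as an assumption; the actual CTBlender architecture is more elaborate:

```python
import torch
import torch.nn as nn

class ChannelGateFusion(nn.Module):
    """Generic stand-in for transformer-modulates-CNN fusion: a global
    transformer signal gates each CNN channel. Not the actual CTBlender."""
    def __init__(self, tf_dim, cnn_channels):
        super().__init__()
        self.gate = nn.Linear(tf_dim, cnn_channels)

    def forward(self, cnn_feat, tf_tokens):
        # cnn_feat: (B, C, H, W); tf_tokens: (B, N, D)
        g = torch.sigmoid(self.gate(tf_tokens.mean(dim=1)))  # (B, C) global gate
        return cnn_feat * g[:, :, None, None]                # broadcast over H, W
```

Applied at each pyramid scale, such a gate lets high-level semantics decide which low-level spatial channels to emphasize, which is the fusion role the summary describes.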
arXiv Detail & Related papers (2021-03-31T13:55:56Z)
- Robust and Accurate Object Detection via Adversarial Learning [111.36192453882195]
This work augments the fine-tuning stage for object detectors by exploring adversarial examples.
Our approach boosts the performance of state-of-the-art EfficientDets by +1.1 mAP on the object detection benchmark.
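The basic shape of adversarially augmented fine-tuning is easy to sketch: craft a white-box adversarial copy of each batch and fine-tune on both. This is a generic sketch under that assumption, not the paper's exact recipe:

```python
import torch

def adv_finetune_step(detector, optimizer, images, targets, eps=4 / 255):
    """One fine-tuning step on a clean batch plus a single-step
    adversarial copy. Generic adversarial-training sketch; the paper's
    actual method is more involved."""
    images_adv = images.clone().requires_grad_(True)
    detector(images_adv, targets).backward()                    # white-box gradient
    images_adv = (images_adv + eps * images_adv.grad.sign()).clamp(0, 1).detach()

    optimizer.zero_grad()
    loss = detector(images, targets) + detector(images_adv, targets)
    loss.backward()
    optimizer.step()
    return float(loss)
```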
arXiv Detail & Related papers (2021-03-23T19:45:26Z)
- Transformer-Encoder Detector Module: Using Context to Improve Robustness to Adversarial Attacks on Object Detection [12.521662223741673]
This article proposes a new context module that can be applied to an object detector to improve the labeling of object instances.
The proposed model achieves improvements in mAP, F1 score, and average AUC of up to 13% over the baseline Faster-RCNN detector.
arXiv Detail & Related papers (2020-11-13T15:52:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.