Adversarial Patch Generation for Visual-Infrared Dense Prediction Tasks via Joint Position-Color Optimization
- URL: http://arxiv.org/abs/2603.00266v1
- Date: Fri, 27 Feb 2026 19:26:17 GMT
- Title: Adversarial Patch Generation for Visual-Infrared Dense Prediction Tasks via Joint Position-Color Optimization
- Authors: He Li, Wenyue He, Weihang Kong, Xingchen Zhang
- Abstract summary: We propose a joint position-color optimization framework (AP-PCO) for generating adversarial patches in visual-infrared settings. We introduce a cross-modal color adaptation strategy that constrains patch appearance according to infrared grayscale characteristics. Experiments on visual-infrared dense prediction tasks demonstrate that the proposed AP-PCO achieves consistently strong attack performance.
- Score: 14.358458317718174
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multimodal adversarial attacks for dense prediction remain largely underexplored. In particular, visual-infrared (VI) perception systems introduce unique challenges due to heterogeneous spectral characteristics and modality-specific intensity distributions. Existing adversarial patch methods are primarily designed for single-modal inputs and fail to account for cross-spectral inconsistencies, leading to reduced attack effectiveness and poor stealthiness when applied to VI dense prediction models. To address these challenges, we propose a joint position-color optimization framework (AP-PCO) for generating adversarial patches in visual-infrared settings. The proposed method optimizes patch placement and color composition simultaneously using a fitness function derived from model outputs, enabling a single patch to perturb both visible and infrared modalities. To further bridge spectral discrepancies, we introduce a cross-modal color adaptation strategy that constrains patch appearance according to infrared grayscale characteristics while maintaining strong perturbations in the visible domain, thereby reducing cross-spectral saliency. The optimization procedure operates without requiring internal model information, supporting flexible black-box attacks. Extensive experiments on visual-infrared dense prediction tasks demonstrate that the proposed AP-PCO achieves consistently strong attack performance across multiple architectures, providing a practical benchmark for robustness evaluation in VI perception systems.
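The pipeline the abstract describes (black-box, fitness-driven joint search over patch position and color, with a grayscale constraint standing in for the infrared modality) can be sketched roughly as follows. All names here (`apply_patch`, `optimize_patch`), the toy fitness, the random-search strategy, and the mid-intensity grayscale band are illustrative assumptions, not the authors' actual AP-PCO implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def apply_patch(image, patch_color, pos, size=8):
    """Paste a uniform-color square patch at (row, col) into an H x W x 3 image."""
    out = image.copy()
    r, c = pos
    out[r:r + size, c:c + size] = patch_color
    return out

def to_infrared(image):
    """Toy infrared proxy: grayscale intensity of the visible image (assumption)."""
    return image.mean(axis=-1)

def fitness(model_score_fn, image, patch_color, pos):
    """Black-box fitness from model outputs: lower confidence on both
    modalities means a better (higher-fitness) patch."""
    attacked = apply_patch(image, patch_color, pos)
    vis_score = model_score_fn(attacked)
    ir = to_infrared(attacked)
    ir_score = model_score_fn(np.repeat(ir[..., None], 3, axis=-1))
    return -(vis_score + ir_score)

def optimize_patch(model_score_fn, image, iters=200, size=8):
    """Joint position-color search by random sampling (a stand-in for the
    paper's optimizer), with a crude cross-modal color constraint."""
    h, w, _ = image.shape
    best = None
    for _ in range(iters):
        pos = (rng.integers(0, h - size), rng.integers(0, w - size))
        color = rng.random(3)
        # cross-modal adaptation sketch: keep the patch's grayscale
        # projection in a mid-intensity band so it stays inconspicuous
        # in the infrared proxy (band chosen arbitrarily here)
        if not (0.3 <= color.mean() <= 0.7):
            continue
        f = fitness(model_score_fn, image, color, pos)
        if best is None or f > best[0]:
            best = (f, pos, color)
    return best  # (fitness, position, color) of the best candidate
```

Because the loop only queries `model_score_fn` on perturbed inputs, it needs no gradients or internal model state, matching the black-box setting the abstract claims; a real attack would replace random sampling with a population-based or evolutionary optimizer over the same fitness.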
Related papers
- Enhancing Infrared Vision: Progressive Prompt Fusion Network and Benchmark [58.61079960074608]
Existing infrared image enhancement methods focus on tackling individual degradations. All-in-one enhancement methods, commonly applied to RGB sensors, often demonstrate limited effectiveness.
arXiv Detail & Related papers (2025-10-10T12:55:54Z) - CDUPatch: Color-Driven Universal Adversarial Patch Attack for Dual-Modal Visible-Infrared Detectors [6.8163437709379835]
Adversarial patches are widely used to evaluate the robustness of object detection systems in real-world scenarios. We propose CDUPatch, a universal cross-modal patch attack against visible-infrared object detectors across scales, views, and scenarios. By learning an optimal color distribution on the adversarial patch, we can manipulate its thermal response and generate an adversarial infrared texture.
arXiv Detail & Related papers (2025-04-15T05:46:00Z) - DifIISR: A Diffusion Model with Gradient Guidance for Infrared Image Super-Resolution [32.53713932204663]
DifIISR is an infrared image super-resolution diffusion model optimized for visual quality and perceptual performance. We introduce an infrared thermal spectrum distribution regulation to preserve visual fidelity. We incorporate various visual foundational models as the perceptual guidance for downstream visual tasks.
arXiv Detail & Related papers (2025-03-03T05:20:57Z) - Adaptive Illumination-Invariant Synergistic Feature Integration in a Stratified Granular Framework for Visible-Infrared Re-Identification [18.221111822542024]
Visible-Infrared Person Re-Identification (VI-ReID) plays a crucial role in applications such as search and rescue, infrastructure protection, and nighttime surveillance. We propose AMINet, an Adaptive Modality Interaction Network. AMINet employs multi-granularity feature extraction to capture comprehensive identity attributes from both full-body and upper-body images.
arXiv Detail & Related papers (2025-02-28T15:42:58Z) - Spectral Enhancement and Pseudo-Anchor Guidance for Infrared-Visible Person Re-Identification [8.054546048450414]
This paper introduces a simple yet effective Spectral Enhancement and Pseudo-anchor Guidance Network, named SEPG-Net. We propose a more homogeneous spectral enhancement scheme based on frequency domain information and greyscale space. Experimental results on two public benchmark datasets demonstrate the superior performance of SEPG-Net against other state-of-the-art methods.
arXiv Detail & Related papers (2024-12-26T08:03:53Z) - Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution [54.293362972473595]
Image super-resolution (SR) aims to reconstruct high-resolution (HR) images from their low-resolution (LR) counterparts.
Current approaches to address SR tasks are either dedicated to extracting RGB image features or assuming similar degradation patterns.
We propose a Contourlet refinement gate framework to restore infrared modal-specific features while preserving spectral distribution fidelity.
arXiv Detail & Related papers (2024-11-19T14:24:03Z) - Cross-Modality Perturbation Synergy Attack for Person Re-identification [66.48494594909123]
Cross-modality person re-identification (ReID) systems are based on RGB images. The main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities. Existing attack methods have primarily focused on the characteristics of the visible image modality. This study proposes a universal perturbation attack specifically designed for cross-modality ReID.
arXiv Detail & Related papers (2024-01-18T15:56:23Z) - Frequency Domain Modality-invariant Feature Learning for Visible-infrared Person Re-Identification [79.9402521412239]
We propose a novel Frequency Domain modality-invariant feature learning framework (FDMNet) to reduce modality discrepancy from the frequency domain perspective.
Our framework introduces two novel modules, namely the Instance-Adaptive Amplitude Filter (IAF) and the Phrase-Preserving Normalization (PPNorm).
arXiv Detail & Related papers (2024-01-03T17:11:27Z) - Two-stage optimized unified adversarial patch for attacking visible-infrared cross-modal detectors in the physical world [0.0]
This work introduces the Two-stage Optimized Unified Adversarial Patch (TOUAP) designed for performing attacks against visible-infrared cross-modal detectors in real-world, black-box settings.
arXiv Detail & Related papers (2023-12-04T10:25:34Z) - PAIF: Perception-Aware Infrared-Visible Image Fusion for Attack-Tolerant Semantic Segmentation [50.556961575275345]
We propose a perception-aware fusion framework to promote segmentation robustness in adversarial scenes.
We show that our scheme substantially enhances robustness, with gains of 15.3% mIoU over advanced competitors.
arXiv Detail & Related papers (2023-08-08T01:55:44Z) - Breaking Modality Disparity: Harmonized Representation for Infrared and Visible Image Registration [66.33746403815283]
We propose a scene-adaptive infrared and visible image registration method.
We employ homography to simulate the deformation between different planes.
We propose the first misaligned infrared and visible image dataset with available ground truth.
arXiv Detail & Related papers (2023-04-12T06:49:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.