Related papers: BiTAA: A Bi-Task Adversarial Attack for Object Detection and Depth Estimation via 3D Gaussian Splatting

BiTAA: A Bi-Task Adversarial Attack for Object Detection and Depth Estimation via 3D Gaussian Splatting

URL: http://arxiv.org/abs/2509.19793v1
Date: Wed, 24 Sep 2025 06:27:15 GMT
Title: BiTAA: A Bi-Task Adversarial Attack for Object Detection and Depth Estimation via 3D Gaussian Splatting
Authors: Yixun Zhang, Feng Zhou, Jianqin Yin,
Abstract summary: We present BiTAA, a bi-task adversarial attack built on 3D Gaussian Splatting.<n>Specifically, we introduce a dual-model attack framework that supports both full-image and patch settings.<n>We also propose a unified evaluation protocol with cross-task transfer metrics and real-world evaluations.
Score: 14.918777539580978
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Camera-based perception is critical to autonomous driving yet remains vulnerable to task-specific adversarial manipulations in object detection and monocular depth estimation. Most existing 2D/3D attacks are developed in task silos, lack mechanisms to induce controllable depth bias, and offer no standardized protocol to quantify cross-task transfer, leaving the interaction between detection and depth underexplored. We present BiTAA, a bi-task adversarial attack built on 3D Gaussian Splatting that yields a single perturbation capable of simultaneously degrading detection and biasing monocular depth. Specifically, we introduce a dual-model attack framework that supports both full-image and patch settings and is compatible with common detectors and depth estimators, with optional expectation-over-transformation (EOT) for physical reality. In addition, we design a composite loss that couples detection suppression with a signed, magnitude-controlled log-depth bias within regions of interest (ROIs) enabling controllable near or far misperception while maintaining stable optimization across tasks. We also propose a unified evaluation protocol with cross-task transfer metrics and real-world evaluations, showing consistent cross-task degradation and a clear asymmetry between Det to Depth and from Depth to Det transfer. The results highlight practical risks for multi-task camera-only perception and motivate cross-task-aware defenses in autonomous driving scenarios.

Related papers

TransBridge: Boost 3D Object Detection by Scene-Level Completion with Transformer Decoder [66.22997415145467]
This paper presents a joint completion and detection framework that improves the detection feature in sparse areas.<n> Specifically, we propose TransBridge, a novel transformer-based up-sampling block that fuses the features from the detection and completion networks.<n>The results show that our framework consistently improves end-to-end 3D object detection, with the mean average precision (mAP) ranging from 0.7 to 1.5 across multiple methods.
arXiv Detail & Related papers (2025-12-12T00:08:03Z)
Robustifying 3D Perception via Least-Squares Graphs for Multi-Agent Object Tracking [43.11267507022928]
This paper proposes a novel mitigation framework on 3D LiDAR scene against adversarial noise.<n>We employ the least-squares graph tool to reduce the induced positional error of each detection's centroid.<n>An extensive evaluation study on the real-world V2V4Real dataset demonstrates that the proposed method significantly outperforms both single and multi-agent tracking frameworks.
arXiv Detail & Related papers (2025-07-07T08:41:08Z)
Seamless Detection: Unifying Salient Object Detection and Camouflaged Object Detection [73.85890512959861]
We propose a task-agnostic framework to unify Salient Object Detection (SOD) and Camouflaged Object Detection (COD)<n>We design a simple yet effective contextual decoder involving the interval-layer and global context, which achieves an inference speed of 67 fps.<n> Experiments on public SOD and COD datasets demonstrate the superiority of our proposed framework in both supervised and unsupervised settings.
arXiv Detail & Related papers (2024-12-22T03:25:43Z)
FuzzRisk: Online Collision Risk Estimation for Autonomous Vehicles based on Depth-Aware Object Detection via Fuzzy Inference [6.856508678236828]
The framework takes two sets of predictions from different algorithms and associates their inconsistencies with the collision risk via fuzzy inference.<n>We experimentally validate that, based on Intersection-over-Union (IoU) and a depth discrepancy measure, the inconsistencies between the two sets of predictions strongly correlate to the error of the 3D object detector.<n>We optimize the fuzzy inference system towards an existing offline metric that matches AV collision rates well.
arXiv Detail & Related papers (2024-11-09T20:20:36Z)
Uncertainty Estimation for 3D Object Detection via Evidential Learning [63.61283174146648]
We introduce a framework for quantifying uncertainty in 3D object detection by leveraging an evidential learning loss on Bird's Eye View representations in the 3D detector. We demonstrate both the efficacy and importance of these uncertainty estimates on identifying out-of-distribution scenes, poorly localized objects, and missing (false negative) detections.
arXiv Detail & Related papers (2024-10-31T13:13:32Z)
SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning [17.99904937160487]
We introduce SCIPaD, a novel approach that incorporates spatial clues for unsupervised depth-pose joint learning. SCIPaD achieves a reduction of 22.2% in average translation error and 34.8% in average angular error for camera pose estimation task on the KITTI Odometry dataset.
arXiv Detail & Related papers (2024-07-07T06:52:51Z)
Object Semantics Give Us the Depth We Need: Multi-task Approach to Aerial Depth Completion [1.2239546747355885]
We propose a novel approach to jointly execute the two tasks in a single pass. The proposed method is based on an encoder-focused multi-task learning model that exposes the two tasks to jointly learned features. Experimental results show that the proposed multi-task network outperforms its single-task counterpart.
arXiv Detail & Related papers (2023-04-25T03:21:32Z)
A Comprehensive Study of the Robustness for LiDAR-based 3D Object Detectors against Adversarial Attacks [84.10546708708554]
3D object detectors are increasingly crucial for security-critical tasks. It is imperative to understand their robustness against adversarial attacks. This paper presents the first comprehensive evaluation and analysis of the robustness of LiDAR-based 3D detectors under adversarial attacks.
arXiv Detail & Related papers (2022-12-20T13:09:58Z)
JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes [75.20435924081585]
JPerceiver can simultaneously estimate scale-aware depth and VO as well as BEV layout from a monocular video sequence. It exploits the cross-view geometric transformation (CGT) to propagate the absolute scale from the road layout to depth and VO. Experiments on Argoverse, Nuscenes and KITTI show the superiority of JPerceiver over existing methods on all the above three tasks.
arXiv Detail & Related papers (2022-07-16T10:33:59Z)
Detecting Adversarial Perturbations in Multi-Task Perception [32.9951531295576]
We propose a novel adversarial perturbation detection scheme based on multi-task perception of complex vision tasks. adversarial perturbations are detected by inconsistencies between extracted edges of the input image, the depth output, and the segmentation output. We show that under an assumption of a 5% false positive rate up to 100% of images are correctly detected as adversarially perturbed, depending on the strength of the perturbation.
arXiv Detail & Related papers (2022-03-02T15:25:17Z)
Exploring Adversarial Robustness of Multi-Sensor Perception Systems in Self Driving [87.3492357041748]
In this paper, we showcase practical susceptibilities of multi-sensor detection by placing an adversarial object on top of a host vehicle. Our experiments demonstrate that successful attacks are primarily caused by easily corrupted image features. Towards more robust multi-modal perception systems, we show that adversarial training with feature denoising can boost robustness to such attacks significantly.
arXiv Detail & Related papers (2021-01-17T21:15:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.