On the Adversarial Robustness of Learning-based Conformal Novelty Detection
- URL: http://arxiv.org/abs/2510.00463v1
- Date: Wed, 01 Oct 2025 03:29:11 GMT
- Title: On the Adversarial Robustness of Learning-based Conformal Novelty Detection
- Authors: Daofu Zhang, Mehrdad Pournaderi, Hanne M. Clifford, Yu Xiang, Pramod K. Varshney,
- Abstract summary: We study the adversarial robustness of conformal novelty detection using AdaDetect.<n>Our results show that adversarial perturbations can significantly increase the FDR while maintaining high detection power.
- Score: 10.58528988397402
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper studies the adversarial robustness of conformal novelty detection. In particular, we focus on AdaDetect, a powerful learning-based framework for novelty detection with finite-sample false discovery rate (FDR) control. While AdaDetect provides rigorous statistical guarantees under benign conditions, its behavior under adversarial perturbations remains unexplored. We first formulate an oracle attack setting that quantifies the worst-case degradation of FDR, deriving an upper bound that characterizes the statistical cost of attacks. This idealized formulation directly motivates a practical and effective attack scheme that only requires query access to AdaDetect's output labels. Coupling these formulations with two popular and complementary black-box adversarial algorithms, we systematically evaluate the vulnerability of AdaDetect on synthetic and real-world datasets. Our results show that adversarial perturbations can significantly increase the FDR while maintaining high detection power, exposing fundamental limitations of current error-controlled novelty detection methods and motivating the development of more robust alternatives.
Related papers
- GradID: Adversarial Detection via Intrinsic Dimensionality of Gradients [0.1019561860229868]
In this paper, we investigate the geometric properties of a model's input loss landscape.<n>We reveal a distinct and consistent difference in the ID for natural and adversarial data, which forms the basis of our proposed detection method.<n>Our detector significantly surpasses existing methods against a wide array of attacks, including CW and AutoAttack, achieving detection rates consistently above 92% on CIFAR-10.
arXiv Detail & Related papers (2025-12-14T20:16:03Z) - T-Detect: Tail-Aware Statistical Normalization for Robust Detection of Adversarial Machine-Generated Text [16.983417848762826]
Large language models (LLMs) have shown the capability to generate fluent and logical content, presenting significant challenges to machine-generated text detection.<n>In this paper, we introduce T-Detect, a novel detection method that fundamentally redesigns the curvature-based detectors.<n>Our contributions are a new, theoretically-justified statistical foundation for text detection, an ablation-validated method that demonstrates superior robustness, and a comprehensive analysis of its performance under adversarial conditions.
arXiv Detail & Related papers (2025-07-31T14:08:04Z) - TrustLoRA: Low-Rank Adaptation for Failure Detection under Out-of-distribution Data [62.22804234013273]
We propose a simple failure detection framework to unify and facilitate classification with rejection under both covariate and semantic shifts.<n>Our key insight is that by separating and consolidating failure-specific reliability knowledge with low-rank adapters, we can enhance the failure detection ability effectively and flexibly.
arXiv Detail & Related papers (2025-04-20T09:20:55Z) - Addressing Key Challenges of Adversarial Attacks and Defenses in the Tabular Domain: A Methodological Framework for Coherence and Consistency [25.830427564563422]
Class-Specific Anomaly Detection (CSAD) is an effective novel anomaly detection approach.<n> CSAD evaluates adversarial samples relative to their predicted class distribution, rather than a broad benign distribution.<n>Our evaluation incorporates both anomaly detection rates with SHAP-based assessments to provide a more comprehensive measure of adversarial sample quality.
arXiv Detail & Related papers (2024-12-10T09:17:09Z) - Hide in Plain Sight: Clean-Label Backdoor for Auditing Membership Inference [16.893873979953593]
We propose a novel clean-label backdoor-based approach for stealthy data auditing.
Our approach employs an optimal trigger generated by a shadow model that mimics target model's behavior.
The proposed method enables robust data auditing through blackbox access, achieving high attack success rates across diverse datasets.
arXiv Detail & Related papers (2024-11-24T20:56:18Z) - ADoPT: LiDAR Spoofing Attack Detection Based on Point-Level Temporal
Consistency [11.160041268858773]
Deep neural networks (DNNs) are increasingly integrated into LiDAR-based perception systems for autonomous vehicles (AVs)
We aim to address the challenge of LiDAR spoofing attacks, where attackers inject fake objects into LiDAR data and fool AVs to misinterpret their environment and make erroneous decisions.
We propose ADoPT (Anomaly Detection based on Point-level Temporal consistency), which quantitatively measures temporal consistency across consecutive frames and identifies abnormal objects based on the coherency of point clusters.
In our evaluation using the nuScenes dataset, our algorithm effectively counters various LiDAR spoofing attacks, achieving a low (
arXiv Detail & Related papers (2023-10-23T02:31:31Z) - Enhancing Infrared Small Target Detection Robustness with Bi-Level
Adversarial Framework [61.34862133870934]
We propose a bi-level adversarial framework to promote the robustness of detection in the presence of distinct corruptions.
Our scheme remarkably improves 21.96% IOU across a wide array of corruptions and notably promotes 4.97% IOU on the general benchmark.
arXiv Detail & Related papers (2023-09-03T06:35:07Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - On the Universal Adversarial Perturbations for Efficient Data-free
Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses between normal and adversarial samples to UAPs.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z) - Balancing detectability and performance of attacks on the control
channel of Markov Decision Processes [77.66954176188426]
We investigate the problem of designing optimal stealthy poisoning attacks on the control channel of Markov decision processes (MDPs)
This research is motivated by the recent interest of the research community for adversarial and poisoning attacks applied to MDPs, and reinforcement learning (RL) methods.
arXiv Detail & Related papers (2021-09-15T09:13:10Z) - Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge that limits the wide-spread adoption of deep learning has been their fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.