Related papers: Position: Certified Robustness Does Not (Yet) Imply Model Security

URL: http://arxiv.org/abs/2506.13024v1
Date: Mon, 16 Jun 2025 01:18:33 GMT
Title: Position: Certified Robustness Does Not (Yet) Imply Model Security
Authors: Andrew C. Cullen, Paul Montague, Sarah M. Erfani, Benjamin I. P. Rubinstein
Abstract summary: Certified robustness is promoted as a solution to adversarial examples in Artificial Intelligence systems. We identify critical gaps in current research, including the paradox of detection without distinction. We propose steps to address these fundamental challenges and advance the field toward practical applicability.
Score: 29.595213559303996
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: While certified robustness is widely promoted as a solution to adversarial examples in Artificial Intelligence systems, significant challenges remain before these techniques can be meaningfully deployed in real-world applications. We identify critical gaps in current research, including the paradox of detection without distinction, the lack of clear criteria for practitioners to evaluate certification schemes, and the potential security risks arising from users' expectations surrounding "guaranteed" robustness claims. This position paper is a call to arms for the certification research community, proposing concrete steps to address these fundamental challenges and advance the field toward practical applicability.

Related papers

Beyond Algorithmic Proofs: Towards Implementation-Level Provable Security [1.338174941551702]
We present Implementation-Level Provable Security, a new paradigm that defines security in terms of structurally verifiable resilience against real-world attack surfaces during deployment. We present SEER (Secure and Efficient Encryption-based Erasure via Ransomware), a file destruction system that repurposes and reinforces the encryption core of Babuk ransomware.
arXiv Detail & Related papers (2025-08-02T01:58:06Z)
Preliminary Investigation into Uncertainty-Aware Attack Stage Classification [81.28215542218724]
This work addresses the problem of attack stage inference under uncertainty. We propose a classification approach based on Evidential Deep Learning (EDL), which models predictive uncertainty by outputting parameters of a Dirichlet distribution over possible stages. Preliminary experiments in a simulated environment demonstrate that the proposed model can accurately infer the stage of an attack with confidence.
arXiv Detail & Related papers (2025-08-01T06:58:00Z)
Towards provable probabilistic safety for scalable embodied AI systems [79.31011047593492]
Embodied AI systems are increasingly prevalent across various applications. Ensuring their safety in complex operating environments remains a major challenge. We introduce provable probabilistic safety, which aims to ensure that the residual risk of large-scale deployment remains below a predefined threshold.
arXiv Detail & Related papers (2025-06-05T15:46:25Z)
Beyond Safe Answers: A Benchmark for Evaluating True Risk Awareness in Large Reasoning Models [29.569220030102986]
We introduce Beyond Safe Answers (BSA) bench, a novel benchmark comprising 2,000 challenging instances organized into three distinct SSA scenario types. Evaluations of 19 state-of-the-art LRMs demonstrate the difficulty of this benchmark, with top-performing models achieving only 38.0% accuracy in correctly identifying risk rationales. Our work provides a comprehensive assessment tool for evaluating and improving safety reasoning fidelity in LRMs, advancing the development of genuinely risk-aware and reliably safe AI systems.
arXiv Detail & Related papers (2025-05-26T08:49:19Z)
Zero Trust Cybersecurity: Procedures and Considerations in Context [9.9303344240134]
This paper explores the Zero Trust cybersecurity framework, which operates on the principle of "never trust, always verify" to mitigate vulnerabilities within organizations. It examines the applicability of Zero Trust principles in environments where large volumes of information are exchanged, such as schools and libraries.
arXiv Detail & Related papers (2025-05-24T21:24:46Z)
Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation [52.83870601473094]
Embodied agents exhibit immense potential across a multitude of domains. Existing research predominantly concentrates on the security of general large language models. This paper introduces a novel input moderation framework, meticulously designed to safeguard embodied agents.
arXiv Detail & Related papers (2025-04-22T08:34:35Z)
Towards Trustworthy GUI Agents: A Survey [64.6445117343499]
This survey examines the trustworthiness of GUI agents in five critical dimensions. We identify major challenges such as vulnerability to adversarial attacks and cascading failure modes in sequential decision-making. As GUI agents become more widespread, establishing robust safety standards and responsible development practices is essential.
arXiv Detail & Related papers (2025-03-30T13:26:00Z)
EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
Confronting the Reproducibility Crisis: A Case Study of Challenges in Cybersecurity AI [0.0]
A key area in AI-based cybersecurity focuses on defending deep neural networks against malicious perturbations. We attempt to validate results from prior work on certified robustness using the VeriGauge toolkit. Our findings underscore the urgent need for standardized methodologies, containerization, and comprehensive documentation.
arXiv Detail & Related papers (2024-05-29T04:37:19Z)
Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science [65.77763092833348]
Intelligent agents powered by large language models (LLMs) have demonstrated substantial promise in autonomously conducting experiments and facilitating scientific discoveries across various disciplines. While their capabilities are promising, these agents also introduce novel vulnerabilities that demand careful consideration for safety. This paper conducts a thorough examination of vulnerabilities in LLM-based agents within scientific domains, shedding light on potential risks associated with their misuse and emphasizing the need for safety measures.
arXiv Detail & Related papers (2024-02-06T18:54:07Z)
A Survey and Comparative Analysis of Security Properties of CAN Authentication Protocols [92.81385447582882]
The Controller Area Network (CAN) bus leaves in-vehicle communications inherently non-secure. This paper reviews and compares the 15 most prominent authentication protocols for the CAN bus. We evaluate protocols based on essential operational criteria that contribute to ease of implementation.
arXiv Detail & Related papers (2024-01-19T14:52:04Z)
What, Indeed, is an Achievable Provable Guarantee for Learning-Enabled Safety Critical Systems [8.930000909500702]
Machine learning has made remarkable advancements, but confidently utilising learning-enabled components in safety-critical domains still poses challenges. We first discuss the engineering and research challenges associated with the design and verification of such systems. Then, based on the observation that existing works cannot actually achieve provable guarantees, we promote a two-step verification method for the ultimate achievement of provable statistical guarantees.
arXiv Detail & Related papers (2023-07-20T12:40:55Z)
Rethinking Certification for Trustworthy Machine Learning-Based Applications [3.886429361348165]
Machine Learning (ML) is increasingly used to implement advanced applications with non-deterministic behavior. Existing certification schemes are not immediately applicable to non-deterministic applications built on ML models. This article analyzes the challenges and deficiencies of current certification schemes, discusses open research issues, and proposes a first certification scheme for ML-based applications.
arXiv Detail & Related papers (2023-05-26T11:06:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.