Quantifying Security Vulnerabilities: A Metric-Driven Security Analysis of Gaps in Current AI Standards
- URL: http://arxiv.org/abs/2502.08610v2
- Date: Sat, 26 Jul 2025 20:13:33 GMT
- Title: Quantifying Security Vulnerabilities: A Metric-Driven Security Analysis of Gaps in Current AI Standards
- Authors: Keerthana Madhavan, Abbas Yazdinejad, Fattane Zarrinkalam, Ali Dehghantanha
- Abstract summary: This paper audits and quantifies security risks in three major AI governance standards: NIST AI RMF 1.0, UK's AI and Data Protection Risk Toolkit, and the EU's ALTAI. Using a novel risk assessment methodology, we develop four key metrics: Risk Severity Index (RSI), Attack Potential Index (AVPI), Compliance-Security Gap Percentage (CSGP), and Root Cause Vulnerability Score (RCVS).
- Score: 5.388550452190688
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: As AI systems integrate into critical infrastructure, security gaps in AI compliance frameworks demand urgent attention. This paper audits and quantifies security risks in three major AI governance standards: NIST AI RMF 1.0, UK's AI and Data Protection Risk Toolkit, and the EU's ALTAI. Using a novel risk assessment methodology, we develop four key metrics: Risk Severity Index (RSI), Attack Potential Index (AVPI), Compliance-Security Gap Percentage (CSGP), and Root Cause Vulnerability Score (RCVS). Our analysis identifies 136 concerns across the frameworks, exposing significant gaps. NIST fails to address 69.23 percent of identified risks, ALTAI has the highest attack vector vulnerability (AVPI = 0.51), and the ICO Toolkit has the largest compliance-security gap, with 80.00 percent of high-risk concerns remaining unresolved. Root cause analysis highlights under-defined processes (ALTAI RCVS = 0.33) and weak implementation guidance (NIST and ICO RCVS = 0.25) as critical weaknesses. These findings emphasize the need for stronger, enforceable security controls in AI compliance. We offer targeted recommendations to enhance security posture and bridge the gap between compliance and real-world AI risks.
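The abstract names four metrics (RSI, AVPI, CSGP, RCVS) but does not reproduce their formulas; those appear only in the full paper. The Python sketch below is a minimal illustration, assuming simple definitions (means, ratios, and percentages computed over per-concern ratings); the field names, the aggregation rules, and the toy data are assumptions made for illustration, not the authors' methodology.

# Illustrative sketch only: the paper defines RSI, AVPI, CSGP, and RCVS, but the
# exact formulas are not given in this abstract. The simple aggregations below
# are assumptions for illustration, not the authors' methodology.
from dataclasses import dataclass
from typing import List

@dataclass
class Concern:
    severity: float          # assumed 0-1 severity rating of the concern
    attack_potential: float  # assumed 0-1 attack-vector potential rating
    high_risk: bool          # whether the concern is rated high risk
    resolved: bool           # whether the framework resolves the concern
    root_cause: str          # e.g. "under-defined process", "weak implementation guidance"

def risk_severity_index(concerns: List[Concern]) -> float:
    """RSI: assumed here to be the mean severity across all concerns."""
    return sum(c.severity for c in concerns) / len(concerns)

def attack_potential_index(concerns: List[Concern]) -> float:
    """AVPI: assumed here to be the mean attack-vector potential across concerns."""
    return sum(c.attack_potential for c in concerns) / len(concerns)

def compliance_security_gap(concerns: List[Concern]) -> float:
    """CSGP: assumed share of high-risk concerns left unresolved, as a percentage."""
    high = [c for c in concerns if c.high_risk]
    unresolved = [c for c in high if not c.resolved]
    return 100.0 * len(unresolved) / len(high) if high else 0.0

def root_cause_vulnerability_score(concerns: List[Concern], cause: str) -> float:
    """RCVS: assumed fraction of concerns attributed to a given root cause."""
    return sum(1 for c in concerns if c.root_cause == cause) / len(concerns)

if __name__ == "__main__":
    # Toy data only; not the paper's catalog of 136 concerns.
    sample = [
        Concern(0.9, 0.6, True, False, "under-defined process"),
        Concern(0.7, 0.4, True, True, "weak implementation guidance"),
        Concern(0.5, 0.5, False, False, "under-defined process"),
        Concern(0.8, 0.3, True, False, "weak implementation guidance"),
    ]
    print(f"RSI  = {risk_severity_index(sample):.2f}")
    print(f"AVPI = {attack_potential_index(sample):.2f}")
    print(f"CSGP = {compliance_security_gap(sample):.2f}%")
    print(f"RCVS(under-defined process) = "
          f"{root_cause_vulnerability_score(sample, 'under-defined process'):.2f}")

On the toy data above, compliance_security_gap reports roughly 66.67% (2 of 3 high-risk concerns unresolved), mirroring the "unresolved high-risk concerns" reading behind the abstract's 80.00 percent CSGP figure for the ICO Toolkit.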
Related papers
- Bench-2-CoP: Can We Trust Benchmarking for EU AI Compliance? [2.010294990327175]
Current AI evaluation practices depend heavily on established benchmarks. These tools were not designed to measure the systemic risks that are the focus of the new regulatory landscape. This research addresses the urgent need to quantify this "benchmark-regulation gap".
arXiv Detail & Related papers (2025-08-07T15:03:39Z) - Beyond Safe Answers: A Benchmark for Evaluating True Risk Awareness in Large Reasoning Models [29.569220030102986]
We introduce Beyond Safe Answers (BSA), a novel benchmark comprising 2,000 challenging instances organized into three distinct SSA scenario types. Evaluations of 19 state-of-the-art LRMs demonstrate the difficulty of this benchmark, with top-performing models achieving only 38.0% accuracy in correctly identifying risk rationales. Our work provides a comprehensive assessment tool for evaluating and improving safety reasoning fidelity in LRMs, advancing the development of genuinely risk-aware and reliably safe AI systems.
arXiv Detail & Related papers (2025-05-26T08:49:19Z) - Safety Co-Option and Compromised National Security: The Self-Fulfilling Prophecy of Weakened AI Risk Thresholds [0.0]
We show how the co-option of "safety" has allowed AI technologists to engage in "safety revisionism."
We explore how the current trajectory for AI risk determination and evaluation for foundation model use within national security is poised for a race to the bottom.
arXiv Detail & Related papers (2025-04-21T13:20:56Z) - strideSEA: A STRIDE-centric Security Evaluation Approach [1.996354642790599]
strideSEA integrates STRIDE as the central classification scheme into the security activities of threat modeling, attack scenario analysis, risk analysis, and countermeasure recommendation.
The application of strideSEA is demonstrated in a real-world online immunization system case study.
arXiv Detail & Related papers (2025-03-24T18:00:17Z) - AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.374792825813394]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability.
The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
arXiv Detail & Related papers (2025-02-19T05:58:52Z) - Position: Mind the Gap-the Growing Disconnect Between Established Vulnerability Disclosure and AI Security [56.219994752894294]
We argue that adapting existing processes for AI security reporting is doomed to fail due to fundamental shortcomings in addressing the distinctive characteristics of AI systems. Based on our proposal to address these shortcomings, we discuss an approach to AI security reporting and how the new AI paradigm, AI agents, will further reinforce the need for specialized AI security incident reporting advancements.
arXiv Detail & Related papers (2024-12-19T13:50:26Z) - Engineering Trustworthy AI: A Developer Guide for Empirical Risk Minimization [53.80919781981027]
Key requirements for trustworthy AI can be translated into design choices for the components of empirical risk minimization.
We hope to provide actionable guidance for building AI systems that meet emerging standards for trustworthiness of AI.
arXiv Detail & Related papers (2024-10-25T07:53:32Z) - Towards Assuring EU AI Act Compliance and Adversarial Robustness of LLMs [1.368472250332885]
Large language models are prone to misuse and vulnerable to security threats.
The European Union's Artificial Intelligence Act seeks to enforce AI robustness in certain contexts.
arXiv Detail & Related papers (2024-10-04T18:38:49Z) - EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - Responsible AI Question Bank: A Comprehensive Tool for AI Risk Assessment [18.966590454042272]
The study introduces our Responsible AI (RAI) Question Bank, a comprehensive framework and tool designed to support diverse AI initiatives. By integrating AI ethics principles such as fairness, transparency, and accountability into a structured question format, the RAI Question Bank aids in identifying potential risks.
arXiv Detail & Related papers (2024-08-02T22:40:20Z) - Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? [59.96471873997733]
We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context.
We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
arXiv Detail & Related papers (2024-07-31T17:59:24Z) - A Relevance Model for Threat-Centric Ranking of Cybersecurity Vulnerabilities [0.29998889086656577]
The relentless process of tracking and remediating vulnerabilities is a top concern for cybersecurity professionals.
We provide a framework for vulnerability management specifically focused on mitigating threats using adversary criteria derived from MITRE ATT&CK.
Our results show an average improvement of 71.5% to 91.3% in identifying vulnerabilities likely to be targeted and exploited by cyber threat actors.
arXiv Detail & Related papers (2024-06-09T23:29:12Z) - Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI.
The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees.
We outline a number of approaches for creating each of these three core components, describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z) - ASSERT: Automated Safety Scenario Red Teaming for Evaluating the Robustness of Large Language Models [65.79770974145983]
ASSERT, Automated Safety Scenario Red Teaming, consists of three methods -- semantically aligned augmentation, target bootstrapping, and adversarial knowledge injection.
We partition our prompts into four safety domains for a fine-grained analysis of how the domain affects model performance.
We find statistically significant performance differences of up to 11% in absolute classification accuracy among semantically related scenarios and error rates of up to 19% (absolute) in zero-shot adversarial settings.
arXiv Detail & Related papers (2023-10-14T17:10:28Z) - QB4AIRA: A Question Bank for AI Risk Assessment [19.783485414942284]
QB4AIRA comprises 293 prioritized questions covering a wide range of AI risk areas.
It serves as a valuable resource for stakeholders in assessing and managing AI risks.
arXiv Detail & Related papers (2023-05-16T09:18:44Z)