Never Compromise to Vulnerabilities: A Comprehensive Survey on AI Governance
- URL: http://arxiv.org/abs/2508.08789v4
- Date: Mon, 18 Aug 2025 09:25:19 GMT
- Title: Never Compromise to Vulnerabilities: A Comprehensive Survey on AI Governance
- Authors: Yuchu Jiang, Jian Zhao, Yuchen Yuan, Tianle Zhang, Yao Huang, Yanghao Zhang, Yan Wang, Yanshu Li, Xizhong Guo, Yusheng Zhao, Jun Zhang, Zhi Zhang, Xiaojian Lin, Yixiu Zou, Haoxuan Ma, Yuhu Shang, Yuzhi Hu, Keshu Cai, Ruochen Zhang, Boyuan Chen, Yilan Gao, Ziheng Jiao, Yi Qin, Shuangjun Du, Xiao Tong, Zhekun Liu, Yu Chen, Xuankun Rong, Rui Wang, Yejie Zheng, Zhaoxin Fan, Murat Sensoy, Hongyuan Zhang, Pan Zhou, Lei Jin, Hao Zhao, Xu Yang, Jiaojiao Zhao, Jianshu Li, Joey Tianyi Zhou, Zhi-Qi Cheng, Longtao Huang, Zhiyi Liu, Zheng Zhu, Jianan Li, Gang Wang, Qi Li, Xu-Yao Zhang, Yaodong Yang, Mang Ye, Wenqi Ren, Zhaofeng He, Hang Su, Rongrong Ni, Liping Jing, Xingxing Wei, Junliang Xing, Massimo Alioto, Shengmei Shen, Petia Radeva, Dacheng Tao, Ya-Qin Zhang, Shuicheng Yan, Chi Zhang, Zhongjiang He, Xuelong Li,
- Abstract summary: We propose a comprehensive framework integrating technical and societal dimensions, structured around three interconnected pillars: Intrinsic Security, Derivative Security, and Social Ethics.<n>We identify three core challenges: (1) the generalization gap, where defenses fail against evolving threats; (2) inadequate evaluation protocols that overlook real-world risks; and (3) fragmented regulations leading to inconsistent oversight.<n>Our framework offers actionable guidance for researchers, engineers, and policymakers to develop AI systems that are not only robust and secure but also ethically aligned and publicly trustworthy.
- Score: 211.5823259429128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid advancement of AI has expanded its capabilities across domains, yet introduced critical technical vulnerabilities, such as algorithmic bias and adversarial sensitivity, that pose significant societal risks, including misinformation, inequity, security breaches, physical harm, and eroded public trust. These challenges highlight the urgent need for robust AI governance. We propose a comprehensive framework integrating technical and societal dimensions, structured around three interconnected pillars: Intrinsic Security (system reliability), Derivative Security (real-world harm mitigation), and Social Ethics (value alignment and accountability). Uniquely, our approach unifies technical methods, emerging evaluation benchmarks, and policy insights to promote transparency, accountability, and trust in AI systems. Through a systematic review of over 300 studies, we identify three core challenges: (1) the generalization gap, where defenses fail against evolving threats; (2) inadequate evaluation protocols that overlook real-world risks; and (3) fragmented regulations leading to inconsistent oversight. These shortcomings stem from treating governance as an afterthought, rather than a foundational design principle, resulting in reactive, siloed efforts that fail to address the interdependence of technical integrity and societal trust. To overcome this, we present an integrated research agenda that bridges technical rigor with social responsibility. Our framework offers actionable guidance for researchers, engineers, and policymakers to develop AI systems that are not only robust and secure but also ethically aligned and publicly trustworthy. The accompanying repository is available at https://github.com/ZTianle/Awesome-AI-SG.
Related papers
- Frontier AI Auditing: Toward Rigorous Third-Party Assessment of Safety and Security Practices at Leading AI Companies [57.521647436515785]
We define frontier AI auditing as rigorous third-party verification of frontier AI developers' safety and security claims.<n>We introduce AI Assurance Levels (AAL-1 to AAL-4), ranging from time-bounded system audits to continuous, deception-resilient verification.
arXiv Detail & Related papers (2026-01-16T18:44:09Z) - AI Deception: Risks, Dynamics, and Controls [153.71048309527225]
This project provides a comprehensive and up-to-date overview of the AI deception field.<n>We identify a formal definition of AI deception, grounded in signaling theory from studies of animal deception.<n>We organize the landscape of AI deception research as a deception cycle, consisting of two key components: deception emergence and deception treatment.
arXiv Detail & Related papers (2025-11-27T16:56:04Z) - CIA+TA Risk Assessment for AI Reasoning Vulnerabilities [0.0]
We present a framework for cognitive cybersecurity, a systematic protection of AI reasoning processes from adversarial manipulation.<n>First, we establish cognitive cybersecurity as a discipline complementing traditional cybersecurity and AI safety.<n>Second, we introduce the CIA+TA, extending traditional Confidentiality, Integrity, and Availability with Trust.<n>Third, we present a quantitative risk assessment methodology with empirically-derived coefficients, enabling organizations to measure cognitive security risks.
arXiv Detail & Related papers (2025-08-19T13:56:09Z) - A Framework for the Assurance of AI-Enabled Systems [0.0]
This paper proposes a claims-based framework for risk management and assurance of AI systems.<n>The paper's contributions are a framework process for AI assurance, a set of relevant definitions, and a discussion of important considerations in AI assurance.
arXiv Detail & Related papers (2025-04-03T13:44:01Z) - Decoding the Black Box: Integrating Moral Imagination with Technical AI Governance [0.0]
We develop a comprehensive framework designed to regulate AI technologies deployed in high-stakes domains such as defense, finance, healthcare, and education.<n>Our approach combines rigorous technical analysis, quantitative risk assessment, and normative evaluation to expose systemic vulnerabilities.
arXiv Detail & Related papers (2025-03-09T03:11:32Z) - AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.374792825813394]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability.<n>The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
arXiv Detail & Related papers (2025-02-19T05:58:52Z) - Responsible Artificial Intelligence Systems: A Roadmap to Society's Trust through Trustworthy AI, Auditability, Accountability, and Governance [37.10526074040908]
This paper explores the concept of a responsible AI system from a holistic perspective.<n>The final goal of the paper is to propose a roadmap in the design of responsible AI systems.
arXiv Detail & Related papers (2025-02-04T14:47:30Z) - Position: Mind the Gap-the Growing Disconnect Between Established Vulnerability Disclosure and AI Security [56.219994752894294]
We argue that adapting existing processes for AI security reporting is doomed to fail due to fundamental shortcomings for the distinctive characteristics of AI systems.<n>Based on our proposal to address these shortcomings, we discuss an approach to AI security reporting and how the new AI paradigm, AI agents, will further reinforce the need for specialized AI security incident reporting advancements.
arXiv Detail & Related papers (2024-12-19T13:50:26Z) - Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI.
The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees.
We outline a number of approaches for creating each of these three core components, describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z) - Quantifying AI Vulnerabilities: A Synthesis of Complexity, Dynamical Systems, and Game Theory [0.0]
We propose a novel approach that introduces three metrics: System Complexity Index (SCI), Lyapunov Exponent for AI Stability (LEAIS), and Nash Equilibrium Robustness (NER)
SCI quantifies the inherent complexity of an AI system, LEAIS captures its stability and sensitivity to perturbations, and NER evaluates its strategic robustness against adversarial manipulation.
arXiv Detail & Related papers (2024-04-07T07:05:59Z) - Trustworthy AI: From Principles to Practices [44.67324097900778]
Many current AI systems were found vulnerable to imperceptible attacks, biased against underrepresented groups, lacking in user privacy protection, etc.
In this review, we strive to provide AI practitioners a comprehensive guide towards building trustworthy AI systems.
To unify the current fragmented approaches towards trustworthy AI, we propose a systematic approach that considers the entire lifecycle of AI systems.
arXiv Detail & Related papers (2021-10-04T03:20:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.