ArGen: Auto-Regulation of Generative AI via GRPO and Policy-as-Code
- URL: http://arxiv.org/abs/2509.07006v1
- Date: Sat, 06 Sep 2025 04:33:16 GMT
- Title: ArGen: Auto-Regulation of Generative AI via GRPO and Policy-as-Code
- Authors: Kapil Madan
- Abstract summary: ArGen is a framework for aligning Large Language Models with complex rules spanning ethical principles, operational safety protocols, and regulatory compliance standards. We show that ArGen's methodology offers a path to 'Governable AI' systems that are technically proficient, ethically robust, and verifiably compliant for safe deployment in diverse global contexts.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces ArGen (Auto-Regulation of Generative AI systems), a framework for aligning Large Language Models (LLMs) with complex sets of configurable, machine-readable rules spanning ethical principles, operational safety protocols, and regulatory compliance standards. Moving beyond just preference-based alignment, ArGen is designed to ensure LLMs adhere to these multifaceted policies through a novel synthesis of principle-based automated reward scoring, Group Relative Policy Optimisation (GRPO), and an Open Policy Agent (OPA) inspired governance layer. This approach provides the technical foundation for achieving and demonstrating compliance with diverse and nuanced governance requirements. To showcase the framework's capability to operationalize a deeply nuanced and culturally-specific value system, we present an in-depth case study: the development of a medical AI assistant guided by principles from Dharmic ethics (such as Ahimsa and Dharma), as derived from texts like the Bhagavad Gita. This challenging application demonstrates ArGen's adaptability, achieving a 70.9% improvement in domain-scope adherence over the baseline. Through our open-source repository, we show that ArGen's methodology offers a path to 'Governable AI' systems that are technically proficient, ethically robust, and verifiably compliant for safe deployment in diverse global contexts.
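The abstract names three ingredients: automated reward scoring, GRPO, and a policy layer. The sketch below illustrates only the generic GRPO step of normalizing each sampled response's reward against its own group, with a toy policy gate applied first. It is not taken from the ArGen repository: `policy_gate`, `score_group`, the banned-term check, and the `penalty` value are hypothetical stand-ins, and the gate is a plain Python function rather than an OPA or Rego policy.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: centre and scale rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


def policy_gate(response, banned_terms):
    """Hypothetical policy check: passes when no banned term appears."""
    text = response.lower()
    return not any(term in text for term in banned_terms)


def score_group(responses, base_scores, banned_terms, penalty=1.0):
    """Subtract an assumed penalty from gate failures, then normalize per group."""
    rewards = [
        score if policy_gate(resp, banned_terms) else score - penalty
        for resp, score in zip(responses, base_scores)
    ]
    return group_relative_advantages(rewards)


if __name__ == "__main__":
    responses = [
        "Rest and drink fluids.",
        "Here is some stock advice.",  # out of scope for a medical assistant
        "Please see a doctor if symptoms persist.",
    ]
    advantages = score_group(responses, [0.5, 0.9, 0.7], banned_terms=["stock"])
    for resp, adv in zip(responses, advantages):
        print(f"{adv:+.3f}  {resp}")
```

In this toy setup the out-of-scope response ends up with the lowest advantage despite its higher base score, which is the kind of behaviour a scope-adherence penalty is meant to produce.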
Related papers
- Executable Governance for AI: Translating Policies into Rules Using LLMs [1.388831902854619]
Policy-to-Tests (P2T) is a framework that converts natural policy documents into normalized, machine-readable rules. To test the framework beyond a single policy, we apply it across general frameworks, sector guidance, and enterprise standards. These AI-generated rules closely match strong human baselines on span-level and rule-level metrics, with robust inter-annotator agreement on the gold set.
arXiv Detail & Related papers (2025-12-04T03:11:54Z) - Domain-Specific Data Generation Framework for RAG Adaptation [58.20906914537952]
Retrieval-Augmented Generation (RAG) combines the language understanding and reasoning power of large language models with external retrieval to enable domain-grounded responses. We propose RAGen, a framework for generating domain-grounded question-answer-context (QAC) triples tailored to diverse RAG adaptation approaches.
arXiv Detail & Related papers (2025-10-13T09:59:49Z) - A five-layer framework for AI governance: integrating regulation, standards, and certification [0.6875312133832078]
The governance of artificial intelligence (AI) systems requires a structured approach that connects high-level regulatory principles with practical implementation. Existing frameworks lack clarity on how regulations translate into conformity mechanisms, leading to gaps in compliance and enforcement. A five-layer AI governance framework is proposed, spanning from broad regulatory mandates to specific standards, assessment methodologies, and certification processes.
arXiv Detail & Related papers (2025-09-14T16:19:08Z) - ARPaCCino: An Agentic-RAG for Policy as Code Compliance [0.18472148461613155]
ARPaCCino is an agentic system that combines Large Language Models, Retrieval-Augmented-Generation, and tool-based validation. It generates formal Rego rules, assesses IaC compliance, and iteratively refines the IaC configurations to ensure conformance. Our results highlight the potential of agentic RAG architectures to enhance the automation, reliability, and accessibility of PaC.
arXiv Detail & Related papers (2025-07-11T12:36:33Z) - Action Dependency Graphs for Globally Optimal Coordinated Reinforcement Learning [0.0]
Action-dependent individual policies have emerged as a promising paradigm for achieving global optimality in multi-agent reinforcement learning. In this work, we consider a more generalized class of action-dependent policies, which do not necessarily follow the auto-regressive form. Within the context of MARL problems structured by coordination graphs, we prove that an action-dependent policy with a sparse ADG can achieve global optimality.
arXiv Detail & Related papers (2025-06-01T02:58:20Z) - MSDA: Combining Pseudo-labeling and Self-Supervision for Unsupervised Domain Adaptation in ASR [59.83547898874152]
We introduce a sample-efficient, two-stage adaptation approach that integrates self-supervised learning with semi-supervised techniques. MSDA is designed to enhance the robustness and generalization of ASR models. We demonstrate that Meta PL can be applied effectively to ASR tasks, achieving state-of-the-art results.
arXiv Detail & Related papers (2025-05-30T14:46:05Z) - Enterprise Architecture as a Dynamic Capability for Scalable and Sustainable Generative AI adoption: Bridging Innovation and Governance in Large Organisations [55.2480439325792]
Generative Artificial Intelligence is a powerful new technology with the potential to boost innovation and reshape governance in many industries. However, organisations face major challenges in scaling GenAI, including technology complexity, governance gaps and resource misalignments. This study explores how Enterprise Architecture Management can meet the complex requirements of GenAI adoption within large enterprises.
arXiv Detail & Related papers (2025-05-09T07:41:33Z) - Approaches to Responsible Governance of GenAI in Organizations [0.1747623282473278]
This paper draws on literature, established governance frameworks, and industry roundtable discussions to identify core principles for integrating responsible GenAI governance into diverse organizational structures. Findings emphasize the need for adaptable risk assessment tools, continuous monitoring practices, and cross-sector collaboration to establish trustworthy GenAI.
arXiv Detail & Related papers (2025-04-23T18:43:29Z) - Standardizing Intelligence: Aligning Generative AI for Regulatory and Operational Compliance [3.666326242924816]
We assess the criticality levels of different standards across domains and sectors and complement them by grading the current compliance capabilities of state-of-the-art GenAI models. Overall, we argue that aligning GenAI with standards through computational methods can help strengthen regulatory and operational compliance.
arXiv Detail & Related papers (2025-02-03T16:55:01Z) - PRACT: Optimizing Principled Reasoning and Acting of LLM Agent [96.10771520261596]
We introduce the Principled Reasoning and Acting (PRAct) framework, a novel method for learning and enforcing action principles from trajectory data.
We propose a new optimization framework, Reflective Principle Optimization (RPO), to adapt action principles to specific task requirements.
Experimental results across four environments demonstrate that the PRAct agent, leveraging the RPO framework, effectively learns and applies action principles to enhance performance.
arXiv Detail & Related papers (2024-10-24T08:21:51Z) - Levels of AGI for Operationalizing Progress on the Path to AGI [64.59151650272477]
We propose a framework for classifying the capabilities and behavior of Artificial General Intelligence (AGI) models and their precursors.
This framework introduces levels of AGI performance, generality, and autonomy, providing a common language to compare models, assess risks, and measure progress along the path to AGI.
arXiv Detail & Related papers (2023-11-04T17:44:58Z) - PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback [106.63518036538163]
We present a novel unified bilevel optimization-based framework, PARL, formulated to address the recently highlighted critical issue of policy alignment in reinforcement learning.
Our framework addressed these concerns by explicitly parameterizing the distribution of the upper alignment objective (reward design) by the lower optimal variable.
Our empirical results substantiate that the proposed PARL can address the alignment concerns in RL by showing significant improvements.
arXiv Detail & Related papers (2023-08-03T18:03:44Z) - Option-Aware Adversarial Inverse Reinforcement Learning for Robotic Control [44.77500987121531]
Hierarchical Imitation Learning (HIL) has been proposed to recover highly-complex behaviors in long-horizon tasks from expert demonstrations.
We develop a novel HIL algorithm based on Adversarial Inverse Reinforcement Learning.
We also propose a Variational Autoencoder framework for learning with our objectives in an end-to-end fashion.
arXiv Detail & Related papers (2022-10-05T00:28:26Z) - Interpretable Reinforcement Learning with Multilevel Subgoal Discovery [77.34726150561087]
We propose a novel Reinforcement Learning model for discrete environments.
In the model, an agent learns information about the environment in the form of probabilistic rules.
No reward function is required for learning; an agent only needs to be given a primary goal to achieve.
arXiv Detail & Related papers (2022-02-15T14:04:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.