Measuring What Matters: The AI Pluralism Index
- URL: http://arxiv.org/abs/2510.08193v1
- Date: Thu, 09 Oct 2025 13:19:34 GMT
- Title: Measuring What Matters: The AI Pluralism Index
- Authors: Rashid Mushkani,
- Abstract summary: We present the AI Pluralism Index (AIPI), a transparent, evidence-based instrument that evaluates producers and system families across four pillars: participatory governance, inclusivity and diversity, transparency, and accountability.<n>The index aims to steer incentives toward pluralistic practice and to equip policymakers, procurers, and the public with comparable evidence.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence systems increasingly mediate knowledge, communication, and decision making. Development and governance remain concentrated within a small set of firms and states, raising concerns that technologies may encode narrow interests and limit public agency. Capability benchmarks for language, vision, and coding are common, yet public, auditable measures of pluralistic governance are rare. We define AI pluralism as the degree to which affected stakeholders can shape objectives, data practices, safeguards, and deployment. We present the AI Pluralism Index (AIPI), a transparent, evidence-based instrument that evaluates producers and system families across four pillars: participatory governance, inclusivity and diversity, transparency, and accountability. AIPI codes verifiable practices from public artifacts and independent evaluations, explicitly handling "Unknown" evidence to report both lower-bound ("evidence") and known-only scores with coverage. We formalize the measurement model; implement a reproducible pipeline that integrates structured web and repository analysis, external assessments, and expert interviews; and assess reliability with inter-rater agreement, coverage reporting, cross-index correlations, and sensitivity analysis. The protocol, codebook, scoring scripts, and evidence graph are maintained openly with versioned releases and a public adjudication process. We report pilot provider results and situate AIPI relative to adjacent transparency, safety, and governance frameworks. The index aims to steer incentives toward pluralistic practice and to equip policymakers, procurers, and the public with comparable evidence.
Related papers
- Frontier AI Auditing: Toward Rigorous Third-Party Assessment of Safety and Security Practices at Leading AI Companies [57.521647436515785]
We define frontier AI auditing as rigorous third-party verification of frontier AI developers' safety and security claims.<n>We introduce AI Assurance Levels (AAL-1 to AAL-4), ranging from time-bounded system audits to continuous, deception-resilient verification.
arXiv Detail & Related papers (2026-01-16T18:44:09Z) - Computable Gap Assessment of Artificial Intelligence Governance in Children's Centres: Evidence-Mechanism-Governance-Indicator Modelling of UNICEF's Guidance on AI and Children 3.0 Based on the Graph-GAP Framework [5.260137087369841]
We propose a methodology that decomposes requirements from authoritative policy texts into a four layer graph of evidence, mechanism, governance, and indicator.<n>Using the UNICEF Innocenti Guidance on AI and Children 3.0 as primary material, we define reproducible extraction units, coding manuals, graph patterns, scoring scales, and consistency checks.<n>Results suggest that compared with privacy and data protection, requirements related to child well being and development, explainability and accountability, and cross agency implementation and resource allocation are more prone to indicator gaps and mechanism gaps.
arXiv Detail & Related papers (2025-12-20T17:03:17Z) - CoCoNUTS: Concentrating on Content while Neglecting Uninformative Textual Styles for AI-Generated Peer Review Detection [60.52240468810558]
We introduce CoCoNUTS, a content-oriented benchmark built upon a fine-grained dataset of AI-generated peer reviews.<n>We also develop CoCoDet, an AI review detector via a multi-task learning framework, to achieve more accurate and robust detection of AI involvement in review content.
arXiv Detail & Related papers (2025-08-28T06:03:11Z) - Never Compromise to Vulnerabilities: A Comprehensive Survey on AI Governance [211.5823259429128]
We propose a comprehensive framework integrating technical and societal dimensions, structured around three interconnected pillars: Intrinsic Security, Derivative Security, and Social Ethics.<n>We identify three core challenges: (1) the generalization gap, where defenses fail against evolving threats; (2) inadequate evaluation protocols that overlook real-world risks; and (3) fragmented regulations leading to inconsistent oversight.<n>Our framework offers actionable guidance for researchers, engineers, and policymakers to develop AI systems that are not only robust and secure but also ethically aligned and publicly trustworthy.
arXiv Detail & Related papers (2025-08-12T09:42:56Z) - OpenReview Should be Protected and Leveraged as a Community Asset for Research in the Era of Large Language Models [55.21589313404023]
OpenReview is a continually evolving repository of research papers, peer reviews, author rebuttals, meta-reviews, and decision outcomes.<n>We highlight three promising areas in which OpenReview can uniquely contribute: enhancing the quality, scalability, and accountability of peer review processes; enabling meaningful, open-ended benchmarks rooted in genuine expert deliberation; and supporting alignment research through real-world interactions reflecting expert assessment, intentions, and scientific values.<n>We suggest the community collaboratively explore standardized benchmarks and usage guidelines around OpenReview, inviting broader dialogue on responsible data use, ethical considerations, and collective stewardship.
arXiv Detail & Related papers (2025-05-24T09:07:13Z) - Toward a Public and Secure Generative AI: A Comparative Analysis of Open and Closed LLMs [0.0]
This study aims to critically evaluate and compare the characteristics, opportunities, and challenges of open and closed generative AI models.<n>The proposed framework outlines key dimensions, openness, public governance, and security, as essential pillars for shaping the future of trustworthy and inclusive Gen AI.
arXiv Detail & Related papers (2025-05-15T15:21:09Z) - AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.374792825813394]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability.<n>The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
arXiv Detail & Related papers (2025-02-19T05:58:52Z) - Bridging the Communication Gap: Evaluating AI Labeling Practices for Trustworthy AI Development [41.64451715899638]
High-level AI labels, inspired by frameworks like EU energy labels, have been proposed to make the properties of AI models more transparent.<n>This study evaluates AI labeling through qualitative interviews along four key research questions.
arXiv Detail & Related papers (2025-01-21T06:00:14Z) - Who is Responsible? The Data, Models, Users or Regulations? A Comprehensive Survey on Responsible Generative AI for a Sustainable Future [17.980029412706106]
Generative AI is moving rapidly from research into real world deployment across sectors.<n>This study synthesizes the landscape of responsible generative AI across methods, benchmarks, and policies.
arXiv Detail & Related papers (2025-01-15T20:59:42Z) - Towards Trustworthy AI: A Review of Ethical and Robust Large Language Models [1.7466076090043157]
Large Language Models (LLMs) could transform many fields, but their fast development creates significant challenges for oversight, ethical creation, and building user trust.
This comprehensive review looks at key trust issues in LLMs, such as unintended harms, lack of transparency, vulnerability to attacks, alignment with human values, and environmental impact.
To tackle these issues, we suggest combining ethical oversight, industry accountability, regulation, and public involvement.
arXiv Detail & Related papers (2024-06-01T14:47:58Z) - Multisource AI Scorecard Table for System Evaluation [3.74397577716445]
The paper describes a Multisource AI Scorecard Table (MAST) that provides the developer and user of an artificial intelligence (AI)/machine learning (ML) system with a standard checklist.
The paper explores how the analytic tradecraft standards outlined in Intelligence Community Directive (ICD) 203 can provide a framework for assessing the performance of an AI system.
arXiv Detail & Related papers (2021-02-08T03:37:40Z) - Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in the input data, ability to explain the decisions, address the bias in their training data, are some of the most prominent limitations.
We propose the tutorial on Trustworthy AI to address six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.