Fugu-MT Paper Translation (Abstract): The AI Evaluability Gap: The Missing Layer for Managing Risk and Sustaining Value

Paper summary: The AI Evaluability Gap: The Missing Layer for Managing Risk and Sustaining Value

arxiv url: http://arxiv.org/abs/2606.21015v1
Date: Fri, 19 Jun 2026 00:58:01 GMT
Status: Translation complete
Last updated in system: 2026-06-26 09:00:38.113474
Title: The AI Evaluability Gap: The Missing Layer for Managing Risk and Sustaining Value
Authors: Vishal Srivastava, Tanmay Sah
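Abstract summary: The authors argue that organizations lack sufficient evidence to support high-confidence governance decisions regarding either risk or value. Existing governance approaches focus primarily on properties of systems, such as safety, fairness, reliability, compliance, and value. The paper introduces Evaluability, defined as the capability of a system to generate, maintain, and renew evidence sufficient to support high-confidence governance decisions.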
Reference score (site-computed attention metric): 1.6328866317851185
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Organizations deploying AI face two fundamental governance challenges: managing AI risk and sustaining AI value. Both depend on evidence whose sufficiency cannot be taken for granted. We call the shared underlying challenge the AI Evaluability Gap: the condition in which organizations lack sufficient evidence to support high-confidence governance decisions regarding either risk or value. We argue that this gap reflects a category error in current practice. Existing governance approaches focus primarily on properties of systems, such as safety, fairness, reliability, compliance, and value, while paying comparatively little attention to the evidentiary foundations required to justify decisions about those properties. We further argue that AI governance encompasses both operational decisions regarding whether a system may operate and investment decisions regarding whether it merits continued organizational resources. To address this problem, we introduce Evaluability, defined as the capability of a system to generate, maintain, and renew evidence sufficient to support high-confidence governance decisions over time. We formalize governance decisions as functions of calibrated confidence Conf(D|E) and identify six properties of evaluable evidence: observability, attributability, intervenability, verifiability, calibration, and temporal validity. The framework distinguishes Operational Certification, which relies primarily on structural evidence to justify deployment decisions, from Investment Certification, which relies primarily on causal evidence to justify continued resource allocation. We argue that evidence sufficiency is a missing layer of AI governance and that closing the AI Evaluability Gap is a prerequisite for both managing risk and sustaining value in AI-enabled organizations.
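The abstract describes governance decisions as functions of calibrated confidence Conf(D|E) over evidence with six properties, but it does not give the functional form. The following is a minimal illustrative sketch only, not the authors' formalization: the `Evidence` class, the per-property scores in [0, 1], the weakest-link `confidence` rule, and the 0.8 threshold are all assumptions introduced here to show what a confidence-gated decision could look like.

```python
from dataclasses import dataclass

# The six properties of evaluable evidence named in the abstract.
PROPERTIES = (
    "observability",
    "attributability",
    "intervenability",
    "verifiability",
    "calibration",
    "temporal_validity",
)


@dataclass(frozen=True)
class Evidence:
    # Hypothetical per-property scores in [0, 1]; the abstract defines no scale.
    scores: dict


def confidence(evidence: Evidence) -> float:
    """Toy stand-in for Conf(D|E): the weakest property bounds confidence.

    A missing property is treated as a score of 0.0 (an assumption).
    """
    return min(evidence.scores.get(name, 0.0) for name in PROPERTIES)


def decide(evidence: Evidence, threshold: float = 0.8) -> str:
    """Approve only when confidence clears the assumed threshold."""
    if confidence(evidence) >= threshold:
        return "approve"
    return "gather_more_evidence"
```

Under this sketch, a system that scores well on five properties but has stale evidence (low temporal validity) is not approved, which mirrors the abstract's point that evidence must be renewed over time.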
