Admissibility Alignment
- URL: http://arxiv.org/abs/2601.01816v1
- Date: Mon, 05 Jan 2026 05:58:19 GMT
- Title: Admissibility Alignment
- Authors: Chris Duffey
- Abstract summary: We present MAP-AI, a new control-plane system architecture for aligned decision-making under uncertainty. It enforces alignment through Monte Carlo estimation of outcome distributions and admissibility-controlled policy selection. We show how alignment evaluation can be integrated into decision-making itself, yielding an admissibility-controlled action selection mechanism.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces Admissibility Alignment: a reframing of AI alignment as a property of admissible action and decision selection over distributions of outcomes under uncertainty, evaluated through the behavior of candidate policies. We present MAP-AI (Monte Carlo Alignment for Policy) as a canonical system architecture for operationalizing admissibility alignment, formalizing alignment as a probabilistic, decision-theoretic property rather than a static or binary condition. As a control-plane architecture for aligned decision-making under uncertainty, MAP-AI enforces alignment through Monte Carlo estimation of outcome distributions and admissibility-controlled policy selection rather than through static model-level constraints. The framework evaluates decision policies across ensembles of plausible futures, explicitly modeling uncertainty, intervention effects, value ambiguity, and governance constraints. Alignment is assessed through distributional properties, including expected utility, variance, tail risk, and probability of misalignment, rather than through accuracy or ranking performance. This approach distinguishes probabilistic prediction from decision reasoning under uncertainty and provides an executable methodology for evaluating trust and alignment in enterprise and institutional AI systems. The result is a practical foundation for governing AI systems whose impact is determined not by individual forecasts but by policy behavior across distributions and tail events. Finally, we show how distributional alignment evaluation can be integrated into decision-making itself, yielding an admissibility-controlled action selection mechanism that alters policy behavior under uncertainty without retraining or modifying the underlying models.
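The paper describes MAP-AI at the architecture level rather than as code. Purely as a rough illustration, the following Python sketch shows what admissibility-controlled action selection over Monte Carlo outcome samples could look like; every function name, threshold, and statistic here is an assumption for illustration, not the authors' API.

```python
import numpy as np

def admissible_action_selection(
    candidate_actions,   # actions (or policies) under consideration
    sample_outcomes,     # assumed simulator: (action, n) -> array of n outcome utilities
    is_misaligned,       # assumed predicate marking a single outcome as misaligned
    n_samples=10_000,
    max_variance=4.0,    # illustrative governance thresholds, not from the paper
    min_tail=-1.0,       # floor on CVaR over the worst 5% of outcomes
    max_p_misaligned=0.01,
):
    """Sketch: an action is admissible only if its whole outcome
    distribution passes every constraint; among admissible actions,
    the one with the highest expected utility is selected."""
    best_action, best_eu = None, -np.inf
    for action in candidate_actions:
        u = np.asarray(sample_outcomes(action, n_samples), dtype=float)
        eu = u.mean()                                   # expected utility
        var = u.var()                                   # outcome variance
        k = max(1, int(0.05 * len(u)))
        cvar = np.sort(u)[:k].mean()                    # tail risk (CVaR_5%)
        p_bad = np.mean([is_misaligned(x) for x in u])  # P(misalignment)
        if var <= max_variance and cvar >= min_tail and p_bad <= max_p_misaligned:
            if eu > best_eu:
                best_action, best_eu = action, eu
    return best_action  # None => no admissible action: abstain or escalate
```

Returning no action when the admissible set is empty fits the control-plane framing: the selection layer changes behavior under uncertainty without retraining or modifying the underlying models.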
Related papers
- Learning Deterministic Policies with Policy Gradients in Constrained Markov Decision Processes [59.27926064817273]
We introduce an exploration-agnostic algorithm, called C-PG, which enjoys global last-iterate convergence guarantees under domination assumptions. We empirically validate both the action-based (C-PGAE) and parameter-based (C-PGPE) variants of C-PG on constrained control tasks.
arXiv Detail & Related papers (2025-06-06T10:29:05Z)
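C-PG's actual update rules are in the paper; purely for orientation, a generic Lagrangian primal-dual policy-gradient step for a constrained MDP (maximize return subject to a cost limit) can be sketched as follows. All names and learning rates are illustrative assumptions.

```python
import numpy as np

def primal_dual_step(theta, lam, grad_return, grad_cost, cost_value,
                     cost_limit, lr_theta=1e-2, lr_lam=1e-2):
    """Generic sketch for max_theta J(theta) s.t. C(theta) <= cost_limit
    via the Lagrangian L = J - lam * (C - cost_limit). grad_return and
    grad_cost are policy-gradient estimates of dJ/dtheta and dC/dtheta
    (action- or parameter-based, in the spirit of C-PGAE / C-PGPE)."""
    theta = theta + lr_theta * (np.asarray(grad_return) - lam * np.asarray(grad_cost))
    lam = max(0.0, lam + lr_lam * (cost_value - cost_limit))  # projected dual ascent
    return theta, lam
```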
- Marginal Fairness: Fair Decision-Making under Risk Measures [24.99817090886293]
This paper introduces marginal fairness, a new individual fairness notion for equitable decision-making in the presence of protected attributes. We model business decision-making in highly regulated industries (such as insurance and finance) as a two-step process. A numerical study and an empirical implementation using an auto insurance dataset demonstrate how the framework can be applied in practice.
arXiv Detail & Related papers (2025-05-24T22:44:35Z)
- Conformalized Decision Risk Assessment [5.391713612899277]
We introduce CREDO, a novel framework that quantifies, for any candidate decision, a distribution-free upper bound on the probability that the decision is suboptimal. By combining inverse optimization geometry with conformal prediction and generative modeling, CREDO produces risk certificates that are both statistically rigorous and practically interpretable.
arXiv Detail & Related papers (2025-05-19T15:24:38Z)
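CREDO's suboptimality score comes from inverse-optimization geometry, which is beyond a short sketch; the conformal ingredient alone, a distribution-free p-value computed from a calibration set of nonconformity scores, looks roughly like this (an assumption-laden sketch, not the paper's implementation):

```python
import numpy as np

def conformal_pvalue(calibration_scores, test_score):
    """Split-conformal p-value: if the test score is exchangeable with
    the calibration scores, then P(p <= alpha) <= alpha holds without
    distributional assumptions, which is what makes the resulting risk
    certificate distribution-free."""
    s = np.asarray(calibration_scores)
    return (1 + np.sum(s >= test_score)) / (len(s) + 1)

# Usage idea: score a candidate decision's apparent suboptimality and
# flag it when the p-value drops below, say, 0.05.
```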
- Uncertainty Quantification and Causal Considerations for Off-Policy Decision Making [4.514386953429771]
Off-policy evaluation (OPE) seeks to assess the performance of a new policy using data collected under a different policy. Existing OPE methodologies suffer from several limitations arising from statistical uncertainty as well as causal considerations. We introduce the Marginal Ratio (MR) estimator, a novel OPE method that reduces variance by focusing on the marginal distribution of outcomes. Next, we propose Conformal Off-Policy Prediction (COPP), a principled approach for uncertainty quantification in OPE. Finally, we address causal unidentifiability in off-policy decision-making by developing novel bounds for sequential decision settings.
arXiv Detail & Related papers (2025-02-09T20:05:19Z)
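As a hedged sketch of the marginal-ratio idea (reweighting by outcome densities rather than full trajectory importance ratios, which is where the variance reduction comes from), with all callables assumed rather than taken from the paper:

```python
import numpy as np

def marginal_ratio_estimate(outcomes, rewards, density_target, density_behavior):
    """MR-style OPE sketch: weight each logged reward by the ratio of the
    *marginal outcome* densities under the target and behavior policies,
    instead of the product of per-step action-probability ratios."""
    w = np.array([density_target(o) / density_behavior(o) for o in outcomes])
    return float(np.mean(w * np.asarray(rewards, dtype=float)))
```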
- Know Where You're Uncertain When Planning with Multimodal Foundation Models: A Formal Framework [54.40508478482667]
We present a comprehensive framework to disentangle, quantify, and mitigate uncertainty in perception and plan generation. We propose methods tailored to the unique properties of perception and decision-making. We show that our uncertainty disentanglement framework reduces variability by up to 40% and enhances task success rates by 5% compared to baselines.
arXiv Detail & Related papers (2024-11-03T17:32:00Z)
- Predictive Performance Comparison of Decision Policies Under Confounding [32.21041697921289]
We propose a method to compare the predictive performance of decision policies under a variety of modern identification approaches.
Key to our method is the insight that there are regions of uncertainty that we can safely ignore in the policy comparison.
arXiv Detail & Related papers (2024-04-01T01:27:07Z)
- Off-Policy Evaluation with Policy-Dependent Optimization Response [90.28758112893054]
We develop a new framework for off-policy evaluation with a policy-dependent linear optimization response.
We construct unbiased estimators for the policy-dependent estimand by a perturbation method.
We provide a general algorithm for optimizing causal interventions.
arXiv Detail & Related papers (2022-02-25T20:25:37Z)
- An Offline Risk-aware Policy Selection Method for Bayesian Markov Decision Processes [0.0]
Exploitation vs Caution (EvC) is a paradigm that elegantly incorporates model uncertainty within the Bayesian formalism.
We validate EvC against state-of-the-art approaches in different discrete, yet simple, environments offering a fair variety of MDP classes.
In the tested scenarios, EvC manages to select robust policies and hence stands out as a useful tool for practitioners.
arXiv Detail & Related papers (2021-05-27T20:12:20Z)
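In the spirit of EvC (the paper's own algorithm differs in its details), a Bayesian risk-aware selection loop can be sketched as: sample MDPs from the posterior, evaluate every candidate policy on every sample, and prefer tail performance over mean performance. `evaluate` here is an assumed value-computation routine, not the paper's code.

```python
import numpy as np

def risk_aware_selection(policies, posterior_mdp_samples, evaluate, alpha=0.1):
    """Pick the policy with the best CVaR_alpha of value across MDPs
    drawn from the Bayesian posterior: caution (worst-case tail)
    tempers exploitation (best mean)."""
    best, best_cvar = None, -np.inf
    for pi in policies:
        vals = np.sort([evaluate(pi, mdp) for mdp in posterior_mdp_samples])
        k = max(1, int(alpha * len(vals)))
        cvar = vals[:k].mean()  # mean value over the worst alpha-fraction
        if cvar > best_cvar:
            best, best_cvar = pi, cvar
    return best
```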
- Identification of Unexpected Decisions in Partially Observable Monte-Carlo Planning: a Rule-Based Approach [78.05638156687343]
We propose a methodology for analyzing POMCP policies by inspecting their traces.
The proposed method explores local properties of policy behavior to identify unexpected decisions.
We evaluate our approach on Tiger, a standard benchmark for POMDPs, and a real-world problem related to mobile robot navigation.
arXiv Detail & Related papers (2020-12-23T15:09:28Z)
- Offline Policy Selection under Uncertainty [113.57441913299868]
We consider offline policy selection as learning preferences over a set of policy prospects given a fixed experience dataset.
Access to the full distribution over one's belief of the policy value enables more flexible selection algorithms under a wider range of downstream evaluation metrics.
We show how BayesDICE may be used to rank policies with respect to any downstream policy selection metric.
arXiv Detail & Related papers (2020-12-12T23:09:21Z)
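A minimal sketch of what "ranking with respect to any downstream metric" can mean once a posterior over each policy's value is available (BayesDICE produces such posteriors from logged data; the dictionary-of-samples interface here is an assumption):

```python
import numpy as np

def rank_policies(value_posterior_samples, metric=np.mean):
    """Rank policies by applying an arbitrary selection metric to
    posterior samples of their values: the mean, a low quantile,
    or the probability of clearing a safety threshold."""
    scores = {pi: float(metric(np.asarray(s)))
              for pi, s in value_posterior_samples.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Example: rank by the 10th-percentile value rather than the mean.
# rank_policies(samples, metric=lambda v: np.quantile(v, 0.10))
```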
- Reliable Off-policy Evaluation for Reinforcement Learning [53.486680020852724]
In a sequential decision-making problem, off-policy evaluation estimates the expected cumulative reward of a target policy.
We propose a novel framework that provides robust and optimistic cumulative reward estimates using one or more logged datasets.
arXiv Detail & Related papers (2020-11-08T23:16:19Z)
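For background only (the paper's estimator is its own), the standard per-trajectory importance-sampling OPE estimate with a bootstrap interval, whose lower and upper endpoints play the roles of a robust and an optimistic value estimate, can be sketched as:

```python
import numpy as np

def is_ope_interval(weights, returns, n_boot=2000, seed=0):
    """weights[i]: product over steps of pi_target/pi_behavior action
    probabilities for logged trajectory i; returns[i]: its cumulative
    reward. Returns the point estimate plus a 90% bootstrap interval."""
    rng = np.random.default_rng(seed)
    wg = np.asarray(weights, dtype=float) * np.asarray(returns, dtype=float)
    point = wg.mean()
    boots = [wg[rng.integers(0, len(wg), len(wg))].mean() for _ in range(n_boot)]
    lo, hi = np.quantile(boots, [0.05, 0.95])  # robust / optimistic endpoints
    return point, float(lo), float(hi)
```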
This list is automatically generated from the titles and abstracts of the papers on this site.