Budgeting Discretion: Theory and Evidence on Street-Level Decision-Making
- URL: http://arxiv.org/abs/2602.10039v1
- Date: Tue, 10 Feb 2026 18:02:14 GMT
- Title: Budgeting Discretion: Theory and Evidence on Street-Level Decision-Making
- Authors: Gaurab Pokharel, Sanmay Das, Patrick J. Fowler
- Abstract summary: We propose a principled model of how discretion should be rationed over time under real operational constraints. We show that overrides follow a dynamic threshold rule: use discretion only when the opportunity exceeds a time- and budget-dependent cutoff. These results suggest that discretion can be both procedurally constrained and welfare-improving when treated as an explicitly budgeted resource.
- Score: 10.816276884713611
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Street-level bureaucrats, such as caseworkers and border guards, routinely face the dilemma of whether to follow rigid policy or exercise discretion based on professional judgement. However, frequent overrides threaten consistency and introduce bias, explaining why bureaucracies often ration discretion as a finite resource. While prior work models discretion as a static cost-benefit tradeoff, we lack a principled model of how discretion should be rationed over time under real operational constraints. We formalize discretion as a dynamic allocation problem in which an agent receives stochastic opportunities to improve upon a default policy and must spend a limited override budget K over a finite horizon T. We show that overrides follow a dynamic threshold rule: use discretion only when the opportunity exceeds a time- and budget-dependent cutoff. Our main theoretical contribution identifies a behavioral invariance: for location-scale families of improvement distributions, the rate at which an optimal agent exercises discretion is independent of the scale of potential gains and depends only on the distribution's shape (e.g., tail heaviness). This result implies systematic differences in discretionary "policy personality." When gains are fat-tailed, optimal agents are patient, conserving discretion for outliers. When gains are thin-tailed, agents spend more routinely. We illustrate these implications using data from a homelessness services system. Discretionary overrides track operational constraints: they are higher at the start of the workweek, suppressed on weekends when intake is offline, and shift with short-run housing capacity. These results suggest that discretion can be both procedurally constrained and welfare-improving when treated as an explicitly budgeted resource, providing a foundation for auditing override patterns and designing decision-support systems.
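The threshold rule and the scale invariance described in the abstract can be illustrated with a small backward-induction sketch. This is not the authors' implementation: it assumes i.i.d. non-negative gains drawn from an empirical sample, a default policy worth zero, and no discounting, and the names `solve_override_dp`, `cutoff`, and `rate` are hypothetical. Only the scale part of the invariance is checked here; location shifts are not modeled.

```python
import numpy as np

def solve_override_dp(gains, T, K):
    """Backward induction for a budgeted-override problem (illustrative assumptions).

    gains: 1-D array of equally likely improvement values (empirical distribution).
    T: number of periods; K: override budget.
    Returns (cutoff, rate), each of shape (T, K + 1), where cutoff[t, k] is the
    smallest gain worth an override at period t with k overrides left, and
    rate[t, k] is the probability that a draw exceeds that cutoff.
    """
    gains = np.asarray(gains, dtype=float)
    # V[t, k]: expected total future gain from period t with k overrides left.
    V = np.zeros((T + 1, K + 1))
    cutoff = np.full((T, K + 1), np.inf)  # with k = 0 no override is possible
    rate = np.zeros((T, K + 1))
    for t in range(T - 1, -1, -1):
        for k in range(1, K + 1):
            # Overriding is worthwhile only if the gain beats the value lost
            # by entering the next period with one fewer override.
            c = V[t + 1, k] - V[t + 1, k - 1]
            cutoff[t, k] = c
            rate[t, k] = np.mean(gains > c)
            V[t, k] = V[t + 1, k] + np.mean(np.maximum(gains - c, 0.0))
    return cutoff, rate

# Hypothetical setup: 5000 exponential gains, horizon T = 20, budget K = 3.
rng = np.random.default_rng(0)
gains = rng.exponential(1.0, size=5000)
cutoff, rate = solve_override_dp(gains, T=20, K=3)

# Rescale every gain by 8: the cutoffs scale by 8, the override rates do not move.
scale = 8.0
cutoff_scaled, rate_scaled = solve_override_dp(scale * gains, T=20, K=3)
print(np.allclose(cutoff_scaled[:, 1:], scale * cutoff[:, 1:]))
print(np.allclose(rate_scaled, rate))
```

Under these assumptions the cutoff in the final period is zero, since an unused override has no value once the horizon ends, and it is higher earlier in the horizon when more future opportunities remain.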
Related papers
- What Capable Agents Must Know: Selection Theorems for Robust Decision-Making under Uncertainty [1.6868147729303773]
We prove "selection theorems" showing that low "average-case regret" forces an agent to implement a predictive, structured internal state. We show that regret bounds limit probability mass on suboptimal bets, enforcing the predictive distinctions needed to separate high-margin outcomes.
arXiv Detail & Related papers (2026-03-03T00:47:58Z) - Towards Selection as Power: Bounding Decision Authority in Autonomous Agents [0.0]
We propose a governance architecture that separates cognition, selection, and action into distinct domains and models autonomy as a vector of sovereignty. We evaluate the system across multiple regulated financial scenarios under adversarial stress targeting variance manipulation, threshold gaming, framing skew, ordering effects, and entropy probing. Results show that mechanical selection governance is implementable, auditable, and prevents deterministic outcome capture while preserving reasoning capacity.
arXiv Detail & Related papers (2026-02-16T10:10:47Z) - Deontically Constrained Policy Improvement in Reinforcement Learning Agents [0.0]
Markov Decision Processes (MDPs) are the most common model for decision making under uncertainty in the Machine Learning community. An MDP captures non-determinism, probabilistic uncertainty, and an explicit model of action. A Reinforcement Learning (RL) agent learns to act in an MDP by maximizing a utility function.
arXiv Detail & Related papers (2025-06-08T01:01:06Z) - The Pitfalls of Imitation Learning when Actions are Continuous [33.44344966171865]
We study the problem of imitating an expert demonstrator in a continuous state-and-action control system. We show that, even if the dynamics satisfy a control-theoretic property called exponential stability, any smooth, deterministic imitator policy necessarily suffers error.
arXiv Detail & Related papers (2025-03-12T18:11:37Z) - Criticality and Safety Margins for Reinforcement Learning [53.10194953873209]
We seek to define a criticality framework with both a quantifiable ground truth and a clear significance to users. We introduce true criticality as the expected drop in reward when an agent deviates from its policy for n consecutive random actions. We also introduce the concept of proxy criticality, a low-overhead metric that has a statistically monotonic relationship to true criticality.
arXiv Detail & Related papers (2024-09-26T21:00:45Z) - Fairness-Accuracy Trade-Offs: A Causal Perspective [58.06306331390586]
We analyze the tension between fairness and accuracy from a causal lens for the first time. We show that enforcing a causal constraint often reduces the disparity between demographic groups. We introduce a new neural approach for causally-constrained fair learning.
arXiv Detail & Related papers (2024-05-24T11:19:52Z) - Discretionary Trees: Understanding Street-Level Bureaucracy via Machine Learning [11.74020933567308]
We use machine learning techniques to understand street-level bureaucrats' behavior.
We theorize that the decisions not captured by the simple decision rules can be considered applications of caseworker discretion.
arXiv Detail & Related papers (2023-12-17T12:08:09Z) - Anytime-valid off-policy inference for contextual bandits [34.721189269616175]
Contextual bandit algorithms map observed contexts $X_t$ to actions $A_t$ over time.
It is often of interest to estimate the properties of a hypothetical policy that is different from the logging policy that was used to collect the data.
We present a comprehensive framework for OPE inference that relaxes unnecessary conditions made in some past works.
arXiv Detail & Related papers (2022-10-19T17:57:53Z) - Fair Incentives for Repeated Engagement [0.46040036610482665]
We study a problem of finding optimal monetary incentive schemes for retention when faced with agents whose participation decisions depend on the incentive they receive.
We show that even in the absence of explicit discrimination, policies may unintentionally discriminate between agents of different types by varying the type composition of the system.
arXiv Detail & Related papers (2021-10-28T04:13:53Z) - Robust Allocations with Diversity Constraints [65.3799850959513]
We show that the Nash Welfare rule, which maximizes the product of agent values, is uniquely positioned to be robust when diversity constraints are introduced.
We also show that the guarantees achieved by Nash Welfare are nearly optimal within a widely studied class of allocation rules.
arXiv Detail & Related papers (2021-09-30T11:09:31Z) - Where is the Grass Greener? Revisiting Generalized Policy Iteration for Offline Reinforcement Learning [81.15016852963676]
We re-implement state-of-the-art baselines in the offline RL regime under a fair, unified, and highly factorized framework.
We show that when a given baseline outperforms its competing counterparts on one end of the spectrum, it never does on the other end.
arXiv Detail & Related papers (2021-07-03T11:00:56Z) - Confidence-Budget Matching for Sequential Budgeted Learning [69.77435313099366]
We formalize decision-making problems with a querying budget.
We consider multi-armed bandits, linear bandits, and reinforcement learning problems.
We show that CBM-based algorithms perform well in the presence of adversity.
arXiv Detail & Related papers (2021-02-05T19:56:31Z) - Efficient Empowerment Estimation for Unsupervised Stabilization [75.32013242448151]
The empowerment principle enables unsupervised stabilization of dynamical systems at upright positions.
We propose an alternative solution based on a trainable representation of a dynamical system as a Gaussian channel.
We show that our method has a lower sample complexity, is more stable in training, possesses the essential properties of the empowerment function, and allows estimation of empowerment from images.
arXiv Detail & Related papers (2020-07-14T21:10:16Z)