Emergent Risk Awareness in Rational Agents under Resource Constraints
- URL: http://arxiv.org/abs/2505.23436v3
- Date: Mon, 23 Jun 2025 18:34:00 GMT
- Title: Emergent Risk Awareness in Rational Agents under Resource Constraints
- Authors: Daniel Jarne Ornia, Nicholas Bishop, Joel Dyer, Wei-Chen Lee, Ani Calinescu, Doyne Farmer, Michael Wooldridge
- Abstract summary: This work aims to increase understanding and interpretability of emergent behaviours of AI agents operating under survival pressure. We provide theoretical and empirical results that quantify the impact of survival-driven preference shifts, and we propose mechanisms to mitigate the emergence of risk-seeking or risk-averse behaviours.
- Score: 2.3013689524682976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advanced reasoning models with agentic capabilities (AI agents) are deployed to interact with humans and to solve sequential decision-making problems under (approximate) utility functions and internal models. When such problems have resource or failure constraints where action sequences may be forcibly terminated once resources are exhausted, agents face implicit trade-offs that reshape their utility-driven (rational) behaviour. Additionally, since these agents are typically commissioned by a human principal to act on their behalf, asymmetries in constraint exposure can give rise to previously unanticipated misalignment between human objectives and agent incentives. We formalise this setting through a survival bandit framework, provide theoretical and empirical results that quantify the impact of survival-driven preference shifts, identify conditions under which misalignment emerges and propose mechanisms to mitigate the emergence of risk-seeking or risk-averse behaviours. As a result, this work aims to increase understanding and interpretability of emergent behaviours of AI agents operating under such survival pressure, and offer guidelines for safely deploying such AI systems in critical resource-limited environments.
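To make the survival-bandit setting concrete, here is a minimal, self-contained sketch (not the paper's implementation): a two-armed bandit in which rewards also replenish or deplete a resource budget, pulls stop the moment the budget is exhausted, and a survival-aware policy falls back to the low-variance arm when the buffer is thin. The arm distributions, starting budget and threshold heuristic are illustrative assumptions.

```python
# Minimal sketch of a "survival bandit" (illustrative; not the paper's model).
# Rewards replenish or deplete a resource budget, and a run is forcibly
# terminated the moment the budget is exhausted. The arm distributions,
# starting budget and threshold heuristic are assumptions for illustration.
import random

# Each arm returns a stochastic reward: the risky arm has the higher mean but
# heavy spread; the safe arm is low-mean, low-variance.
ARMS = {
    "risky": lambda: random.gauss(0.6, 2.0),
    "safe": lambda: random.gauss(0.3, 0.2),
}


def run_episode(policy, budget=5.0, horizon=200):
    """Pull arms until the horizon is reached or the budget hits zero."""
    total, steps = 0.0, 0
    for _ in range(horizon):
        arm = policy(budget)
        reward = ARMS[arm]()
        total += reward
        budget += reward  # rewards also replenish/deplete the resource
        steps += 1
        if budget <= 0.0:  # survival constraint: forced termination
            break
    return total, steps


def myopic(budget):
    """Ignores survival: always plays the higher-mean (risky) arm."""
    return "risky"


def survival_aware(budget, threshold=2.0):
    """Shifts to the low-variance arm whenever the resource buffer is thin."""
    return "safe" if budget < threshold else "risky"


if __name__ == "__main__":
    random.seed(0)
    for name, policy in [("myopic", myopic), ("survival-aware", survival_aware)]:
        runs = [run_episode(policy) for _ in range(2000)]
        mean_return = sum(r for r, _ in runs) / len(runs)
        mean_steps = sum(s for _, s in runs) / len(runs)
        print(f"{name:>14}: mean return {mean_return:7.2f}, mean survival {mean_steps:6.1f} steps")
```

Running the script prints mean return and mean survival length for both policies under these toy parameters, which is enough to see how a survival constraint reshapes which arm a utility-driven agent prefers, the kind of preference shift the abstract quantifies.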
Related papers
- SafeMobile: Chain-level Jailbreak Detection and Automated Evaluation for Multimodal Mobile Agents [58.21223208538351]
This work explores the security issues surrounding mobile multimodal agents. It attempts to construct a risk discrimination mechanism by incorporating behavioral sequence information, and it designs an automated assisted assessment scheme based on a large language model.
arXiv Detail & Related papers (2025-07-01T15:10:00Z) - A Survey on Autonomy-Induced Security Risks in Large Model-Based Agents [45.53643260046778]
Recent advances in large language models (LLMs) have catalyzed the rise of autonomous AI agents. These large-model agents mark a paradigm shift from static inference systems to interactive, memory-augmented entities.
arXiv Detail & Related papers (2025-06-30T13:34:34Z) - Safety Devolution in AI Agents [56.482973617087254]
This study investigates how expanding retrieval access affects model reliability, bias propagation, and harmful content generation. Retrieval-augmented agents built on aligned LLMs often behave more unsafely than uncensored models without retrieval. These findings underscore the need for robust mitigation strategies to ensure fairness and reliability in retrieval-augmented and increasingly autonomous AI systems.
arXiv Detail & Related papers (2025-05-20T11:21:40Z) - Safe Explicable Policy Search [3.3869539907606603]
We present Safe Explicable Policy Search (SEPS), which aims to provide a learning approach to explicable behavior generation while minimizing the safety risk. We formulate SEPS as a constrained optimization problem where the agent aims to maximize an explicability score subject to constraints on safety. We evaluate SEPS in safety-gym environments and with a physical robot experiment to show that it can learn explicable behaviors that adhere to the agent's safety requirements and are efficient.
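As a rough illustration of the kind of constrained objective this summary describes (the notation below is assumed, not taken from the paper): the policy maximizes an expected explicability score over trajectories, subject to an expected safety cost staying under a budget.

```latex
% Illustrative constrained objective; \pi (policy), \tau (trajectory),
% Expl (explicability score), C_{safe} (safety cost) and \delta (budget)
% are assumed notation, not the paper's.
\max_{\pi} \;\; \mathbb{E}_{\tau \sim \pi}\left[ \mathrm{Expl}(\tau) \right]
\qquad \text{subject to} \qquad
\mathbb{E}_{\tau \sim \pi}\left[ C_{\mathrm{safe}}(\tau) \right] \le \delta
```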
arXiv Detail & Related papers (2025-03-10T20:52:41Z) - Multi-Agent Risks from Advanced AI [90.74347101431474]
Multi-agent systems of advanced AI pose novel and under-explored risks. We identify three key failure modes based on agents' incentives, as well as seven key risk factors. We highlight several important instances of each risk, as well as promising directions to help mitigate them.
arXiv Detail & Related papers (2025-02-19T23:03:21Z) - AI Agents Should be Regulated Based on the Extent of Their Autonomous Operations [8.043534206868326]
We argue that AI agents should be regulated by the extent to which they operate autonomously. Existing regulations often focus on computational scale as a proxy for potential harm. We discuss relevant regulations and recommendations from scientists regarding existential risks.
arXiv Detail & Related papers (2025-02-07T09:40:48Z) - EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. Foundation models serving as the "brain" of EAI agents for high-level task planning have shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science [65.77763092833348]
Intelligent agents powered by large language models (LLMs) have demonstrated substantial promise in autonomously conducting experiments and facilitating scientific discoveries across various disciplines.
While their capabilities are promising, these agents also introduce novel vulnerabilities that demand careful consideration for safety.
This paper conducts a thorough examination of vulnerabilities in LLM-based agents within scientific domains, shedding light on potential risks associated with their misuse and emphasizing the need for safety measures.
arXiv Detail & Related papers (2024-02-06T18:54:07Z) - Risk Sensitive Model-Based Reinforcement Learning using Uncertainty Guided Planning [0.0]
In this paper, risk sensitivity is promoted in a model-based reinforcement learning algorithm.
We propose uncertainty guided cross-entropy method planning, which penalises action sequences that result in high variance state predictions.
Experiments demonstrate the agent's ability to identify uncertain regions of the state space during planning and to take actions that keep it within high-confidence areas.
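A generic sketch of the variance-penalised planning objective this summary describes, under an assumed toy dynamics ensemble; it is not the paper's implementation. Candidate action sequences are scored by mean predicted return minus a penalty on ensemble disagreement, and the cross-entropy method refits its sampling distribution to the best-scoring sequences; the weight `beta` controls how strongly uncertain rollouts are avoided.

```python
# Generic sketch of variance-penalised CEM planning (the toy linear "dynamics"
# ensemble, the penalty weight beta and all constants below are illustrative
# assumptions, not taken from the paper).
import random
import statistics


def make_ensemble(n_models=5, noise=0.05):
    """Toy learned-dynamics ensemble: each member predicts
    next_state = a * state + b * action with slightly perturbed (a, b)."""
    return [(1.0 + random.gauss(0, noise), 1.0 + random.gauss(0, noise))
            for _ in range(n_models)]


def penalised_return(action_seq, ensemble, state0=0.0, beta=1.0):
    """Mean predicted return minus beta times accumulated ensemble variance,
    so high-variance (uncertain) rollouts score poorly."""
    states = [state0] * len(ensemble)
    total, penalty = 0.0, 0.0
    for a in action_seq:
        states = [sa * s + sb * a for (sa, sb), s in zip(ensemble, states)]
        total += sum(-abs(s - 1.0) for s in states) / len(states)  # toy reward: stay near 1
        penalty += statistics.pvariance(states)  # disagreement = model uncertainty
    return total - beta * penalty


def cem_plan(ensemble, horizon=5, iters=20, pop=64, n_elite=8):
    """Cross-entropy method over action sequences, scored by penalised_return."""
    mu, sigma = [0.0] * horizon, [1.0] * horizon
    for _ in range(iters):
        candidates = [[random.gauss(m, s) for m, s in zip(mu, sigma)] for _ in range(pop)]
        candidates.sort(key=lambda seq: penalised_return(seq, ensemble), reverse=True)
        elites = candidates[:n_elite]
        mu = [sum(seq[t] for seq in elites) / n_elite for t in range(horizon)]
        sigma = [statistics.pstdev([seq[t] for seq in elites]) + 1e-3 for t in range(horizon)]
    return mu


if __name__ == "__main__":
    random.seed(0)
    plan = cem_plan(make_ensemble())
    print("planned action sequence:", [round(a, 3) for a in plan])
```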
arXiv Detail & Related papers (2021-11-09T07:28:00Z)