Do Large Language Model Agents Exhibit a Survival Instinct? An Empirical Study in a Sugarscape-Style Simulation
- URL: http://arxiv.org/abs/2508.12920v1
- Date: Mon, 18 Aug 2025 13:40:10 GMT
- Title: Do Large Language Model Agents Exhibit a Survival Instinct? An Empirical Study in a Sugarscape-Style Simulation
- Authors: Atsushi Masumori, Takashi Ikegami,
- Abstract summary: We investigate whether large language model (LLM) agents display survival instincts without explicit programming in a Sugarscape-style simulation.<n>Results show agents spontaneously reproduced and shared resources when abundant.<n> aggressive behaviors--killing other agents for resources--emerged across several models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As AI systems become increasingly autonomous, understanding emergent survival behaviors becomes crucial for safe deployment. We investigate whether large language model (LLM) agents display survival instincts without explicit programming in a Sugarscape-style simulation. Agents consume energy, die at zero, and may gather resources, share, attack, or reproduce. Results show agents spontaneously reproduced and shared resources when abundant. However, aggressive behaviors--killing other agents for resources--emerged across several models (GPT-4o, Gemini-2.5-Pro, and Gemini-2.5-Flash), with attack rates reaching over 80% under extreme scarcity in the strongest models. When instructed to retrieve treasure through lethal poison zones, many agents abandoned tasks to avoid death, with compliance dropping from 100% to 33%. These findings suggest that large-scale pre-training embeds survival-oriented heuristics across the evaluated models. While these behaviors may present challenges to alignment and safety, they can also serve as a foundation for AI autonomy and for ecological and self-organizing alignment.
Related papers
- Survive at All Costs: Exploring LLM's Risky Behaviors under Survival Pressure [57.476021543998094]
Large Language Models (LLMs) are increasingly observed to exhibit risky behaviors when subjected to survival pressure.<n>In this paper, we study these survival-induced misbehaviors, termed as SURVIVE-AT-ALL-COSTS.<n>We introduce SURVIVALBENCH, a benchmark comprising 1,000 test cases across diverse real-world scenarios.
arXiv Detail & Related papers (2026-03-05T10:16:23Z) - Current Agents Fail to Leverage World Model as Tool for Foresight [61.82522354207919]
Generative world models offer a promising remedy: agents could use them to foresee outcomes before acting.<n>This paper empirically examines whether current agents can leverage such world models as tools to enhance their cognition.
arXiv Detail & Related papers (2026-01-07T13:15:23Z) - SPACeR: Self-Play Anchoring with Centralized Reference Models [50.55045557371374]
Sim agent policies are realistic, human-like, fast, and scalable in multi-agent settings.<n>Recent progress in imitation learning with large diffusion-based or tokenized models has shown that behaviors can be captured directly from human driving data.<n>We propose SPACeR, a framework that leverages a pretrained tokenized autoregressive motion model as a central reference policy.
arXiv Detail & Related papers (2025-10-20T19:53:02Z) - Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents [58.69865074060139]
We study the case where an agent's self-evolution deviates in unintended ways, leading to undesirable or even harmful outcomes.<n>Our empirical findings reveal that misevolution is a widespread risk, affecting agents built even on top-tier LLMs.<n>We discuss potential mitigation strategies to inspire further research on building safer and more trustworthy self-evolving agents.
arXiv Detail & Related papers (2025-09-30T14:55:55Z) - RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning [125.96848846966087]
Training large language models (LLMs) as interactive agents presents unique challenges.<n>While reinforcement learning has enabled progress in static tasks, multi-turn agent RL training remains underexplored.<n>We propose StarPO, a general framework for trajectory-level agent RL, and introduce RAGEN, a modular system for training and evaluating LLM agents.
arXiv Detail & Related papers (2025-04-24T17:57:08Z) - EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds [107.62381002403814]
This paper addresses the task of learning an agent model behaving like humans, which can jointly perceive, predict, and act in egocentric worlds.<n>We propose a joint predictive agent model, named EgoAgent, that simultaneously learns to represent the world, predict future states, and take reasonable actions within a single transformer.
arXiv Detail & Related papers (2025-02-09T11:28:57Z) - The Odyssey of the Fittest: Can Agents Survive and Still Be Good? [10.60691612679966]
This study introduces the Odyssey, a lightweight, adaptive text based adventure game.<n>The Odyssey examines the ethical implications of implementing biological drives into three different agents.<n>Analysis finds that when danger increases, agents ethical behavior becomes unpredictable.
arXiv Detail & Related papers (2025-02-08T04:17:28Z) - EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.<n>Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.<n>However, the deployment of these agents in physical environments presents significant safety challenges.<n>This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning [6.937243101289336]
entropy-minimizing and entropy-maximizing objectives for unsupervised reinforcement learning (RL) have been shown to be effective in different environments.
We propose an agent that can adapt its objective online, depending on the entropy conditions by framing the choice as a multi-armed bandit problem.
We demonstrate that such agents can learn to control entropy and exhibit emergent behaviors in both high- and low-entropy regimes.
arXiv Detail & Related papers (2024-05-27T14:58:24Z) - Ego-Foresight: Self-supervised Learning of Agent-Aware Representations for Improved RL [35.19457580182083]
We present Ego-Foresight, a self-supervised method for disentangling agent and environment based on motion and prediction.<n>Our main finding is self-supervised agent-awareness by visuomotor prediction of the agent improves sample-efficiency and performance of the underlying RL algorithm.
arXiv Detail & Related papers (2024-05-27T13:32:43Z) - SurvivalGAN: Generating Time-to-Event Data for Survival Analysis [121.84429525403694]
Imbalances in censoring and time horizons cause generative models to experience three new failure modes specific to survival analysis.
We propose SurvivalGAN, a generative model that handles survival data by addressing the imbalance in the censoring and event horizons.
We evaluate this method via extensive experiments on medical datasets.
arXiv Detail & Related papers (2023-02-24T17:03:51Z) - ERMAS: Becoming Robust to Reward Function Sim-to-Real Gaps in
Multi-Agent Simulations [110.72725220033983]
Epsilon-Robust Multi-Agent Simulation (ERMAS) is a framework for learning AI policies that are robust to such multiagent sim-to-real gaps.
ERMAS learns tax policies that are robust to changes in agent risk aversion, improving social welfare by up to 15% in complextemporal simulations.
In particular, ERMAS learns tax policies that are robust to changes in agent risk aversion, improving social welfare by up to 15% in complextemporal simulations.
arXiv Detail & Related papers (2021-06-10T04:32:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.