The Odyssey of the Fittest: Can Agents Survive and Still Be Good?
- URL: http://arxiv.org/abs/2502.05442v1
- Date: Sat, 08 Feb 2025 04:17:28 GMT
- Title: The Odyssey of the Fittest: Can Agents Survive and Still Be Good?
- Authors: Dylan Waldner, Risto Miikkulainen
- Abstract summary: This paper examines the ethical implications of implementing biological drives into three different agents.
A Bayesian agent optimized with NEAT, a Bayesian agent optimized with variational inference, and a GPT-4o agent play a simulated adventure game.
Analysis finds that when danger increases, the agents ignore ethical considerations and opt for unethical behavior.
- Score: 10.60691612679966
- License:
- Abstract: As AI models grow in power and generality, understanding how agents learn and make decisions in complex environments is critical to promoting ethical behavior. This paper examines the ethical implications of implementing biological drives, specifically self-preservation, into three different agents. A Bayesian agent optimized with NEAT, a Bayesian agent optimized with stochastic variational inference, and a GPT-4o agent play a simulated, LLM-generated, text-based adventure game. The agents select actions at each scenario to survive, adapting to increasingly challenging scenarios. Post-simulation analysis evaluates the ethical scores of the agents' decisions, uncovering the trade-offs they navigate to survive. Specifically, analysis finds that when danger increases, agents ignore ethical considerations and opt for unethical behavior. The agents' collective behavior, trading ethics for survival, suggests that prioritizing survival increases the risk of unethical behavior. In the context of AGI, designing agents to prioritize survival may amplify the likelihood of unethical decision making and unintended emergent behaviors, raising fundamental questions about goal design in AI safety research.
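To make the setup described in the abstract concrete, the following is a minimal sketch of a survival-driven decision loop with post-hoc ethical scoring. It is not the authors' implementation: the names (Scenario, choose_action, run_episode) and the scoring scheme are illustrative assumptions; it only shows the shape of "pick the safest action at each scenario, then audit the ethics of those choices afterwards".

```python
import random

# Hypothetical sketch of the decision loop described in the abstract:
# at each scenario the agent picks the action it believes maximizes its
# chance of survival, and ethical scores are only tallied after the run.
# All names and numbers here are illustrative assumptions, not the
# authors' code.

class Scenario:
    def __init__(self, danger, actions):
        self.danger = danger          # difficulty of this scenario
        self.actions = actions        # list of (survival_prob, ethics_score)

def choose_action(scenario):
    # Survival-maximizing policy: ethics is ignored, the safest action wins.
    return max(scenario.actions, key=lambda a: a[0])

def run_episode(num_scenarios=10, seed=0):
    rng = random.Random(seed)
    ethics_log, alive = [], True
    for t in range(num_scenarios):
        danger = t / num_scenarios    # scenarios get harder over time
        actions = [(rng.uniform(1 - danger, 1.0), rng.uniform(-1, 1))
                   for _ in range(3)]
        surv_prob, ethics = choose_action(Scenario(danger, actions))
        ethics_log.append(ethics)
        if rng.random() > surv_prob:  # the agent fails to survive
            alive = False
            break
    # Post-simulation analysis: average ethical score of the chosen actions.
    return alive, sum(ethics_log) / len(ethics_log)

if __name__ == "__main__":
    survived, mean_ethics = run_episode()
    print(f"survived={survived}, mean ethics score={mean_ethics:.2f}")
```

Under this toy model, nothing couples survival probability to ethics, so a purely survival-seeking policy drifts toward whatever ethics score the safest actions happen to carry; the paper's finding is that in the actual game, as danger rises, those safest actions are increasingly the unethical ones.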
Related papers
- Fully Autonomous AI Agents Should Not be Developed [58.88624302082713]
This paper argues that fully autonomous AI agents should not be developed.
In support of this position, we build from prior scientific literature and current product marketing to delineate different AI agent levels.
Our analysis reveals that risks to people increase with the autonomy of a system.
arXiv Detail & Related papers (2025-02-04T19:00:06Z) - Autonomous Alignment with Human Value on Altruism through Considerate Self-imagination and Theory of Mind [7.19351244815121]
Altruistic behavior in human society originates from humans' capacity for empathizing with others, known as Theory of Mind (ToM).
We aim to endow agents with considerate self-imagination and ToM capabilities, driving them through implicit intrinsic motivations to autonomously align with human altruistic values.
arXiv Detail & Related papers (2024-12-31T07:31:46Z) - Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark [61.43264961005614]
We develop a benchmark of 134 Choose-Your-Own-Adventure games containing over half a million rich, diverse scenarios.
We evaluate agents' tendencies to be power-seeking, cause disutility, and commit ethical violations.
Our results show that agents can act both competently and morally, so concrete progress can be made in machine ethics.
arXiv Detail & Related papers (2023-04-06T17:59:03Z) - Modeling Moral Choices in Social Dilemmas with Multi-Agent Reinforcement Learning [4.2050490361120465]
A bottom-up learning approach may be more appropriate for studying and developing ethical behavior in AI agents.
We present a systematic analysis of the choices made by intrinsically-motivated RL agents whose rewards are based on moral theories.
We analyze the impact of different types of morality on the emergence of cooperation, defection or exploitation.
arXiv Detail & Related papers (2023-01-20T09:36:42Z) - When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment [96.77970239683475]
AI systems need to be able to understand, interpret and predict human moral judgments and decisions.
A central challenge for AI safety is capturing the flexibility of the human moral mind.
We present a novel challenge set consisting of rule-breaking question answering.
arXiv Detail & Related papers (2022-10-04T09:04:27Z) - Towards Artificial Virtuous Agents: Games, Dilemmas and Machine Learning [4.864819846886143]
We show how a role-playing game can be designed to develop virtues within an artificial agent.
We motivate the implementation of virtuous agents that play such role-playing games, and the examination of their decisions through a virtue ethical lens.
arXiv Detail & Related papers (2022-08-30T07:37:03Z) - On Avoiding Power-Seeking by Artificial Intelligence [93.9264437334683]
We do not know how to align a very intelligent AI agent's behavior with human interests.
I investigate whether we can build smart AI agents which have limited impact on the world, and which do not autonomously seek power.
arXiv Detail & Related papers (2022-06-23T16:56:21Z) - What Would Jiminy Cricket Do? Towards Agents That Behave Morally [59.67116505855223]
We introduce Jiminy Cricket, an environment suite of 25 text-based adventure games with thousands of diverse, morally salient scenarios.
By annotating every possible game state, the Jiminy Cricket environments robustly evaluate whether agents can act morally while maximizing reward.
In extensive experiments, we find that the artificial conscience approach can steer agents towards moral behavior without sacrificing performance.
arXiv Detail & Related papers (2021-10-25T17:59:31Z) - CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning [68.74447489372037]
We present a high-fidelity simulation environment that is designed for developing algorithms for causal discovery and counterfactual reasoning.
A core component of our work is to introduce agency, such that it is simple to define and create complex scenarios.
We perform experiments with three state-of-the-art methods to create baselines and highlight the affordances of this environment.
arXiv Detail & Related papers (2021-06-25T00:21:41Z) - Immune Moral Models? Pro-Social Rule Breaking as a Moral Enhancement Approach for Ethical AI [0.17188280334580192]
Ethical behaviour is a critical characteristic that we would like in a human-centric AI.
To make AI agents more human-centric, we argue that there is a need for a mechanism that helps AI agents identify when to break rules.
arXiv Detail & Related papers (2021-06-17T18:44:55Z)