Understanding Action Effects through Instrumental Empowerment in Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2508.15652v2
- Date: Sat, 23 Aug 2025 17:31:22 GMT
- Title: Understanding Action Effects through Instrumental Empowerment in Multi-Agent Reinforcement Learning
- Authors: Ardian Selmonaj, Miroslav Strupl, Oleg Szehr, Alessandro Antonucci
- Abstract summary: This work investigates whether meaningful insights into agent behaviors can be extracted solely by analyzing the policy distribution. Inspired by the phenomenon that intelligent agents tend to pursue convergent instrumental values, we introduce Intended Cooperation Values (ICVs). ICVs measure an agent's action effect on its teammates' policies by assessing their decision (un)certainty and preference alignment.
- Score: 39.74025439412935
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To reliably deploy Multi-Agent Reinforcement Learning (MARL) systems, it is crucial to understand individual agent behaviors. While prior work typically evaluates overall team performance based on explicit reward signals, it is unclear how to infer agent contributions in the absence of any value feedback. In this work, we investigate whether meaningful insights into agent behaviors can be extracted solely by analyzing the policy distribution. Inspired by the phenomenon that intelligent agents tend to pursue convergent instrumental values, we introduce Intended Cooperation Values (ICVs), a method based on information-theoretic Shapley values for quantifying each agent's causal influence on their co-players' instrumental empowerment. Specifically, ICVs measure an agent's action effect on its teammates' policies by assessing their decision (un)certainty and preference alignment. By analyzing action effects on policies and value functions across cooperative and competitive MARL tasks, our method identifies which agent behaviors are beneficial to team success, either by fostering deterministic decisions or by preserving flexibility for future action choices, while also revealing the extent to which agents adopt similar or diverse strategies. Our proposed method offers novel insights into cooperation dynamics and enhances explainability in MARL systems.
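To make the mechanism concrete, below is a minimal sketch of a Shapley-style influence attribution in the spirit of ICVs. It is not the paper's actual formulation: the payoff here uses the reduction in a teammate's policy entropy as a simplified stand-in for the empowerment-based measure, and all function names and numbers are hypothetical.

```python
# Minimal sketch, assuming a characteristic-function view of agent influence.
# The payoff (entropy reduction in a teammate's policy) is a simplified
# stand-in for the paper's empowerment-based measure; names are hypothetical.
import itertools
from math import factorial

import numpy as np

def policy_entropy(probs):
    """Shannon entropy (in nats) of an action distribution."""
    p = np.asarray(probs, dtype=float)
    p = p[p > 0.0]
    return float(-np.sum(p * np.log(p)))

def shapley_values(agents, payoff):
    """Exact Shapley value of each agent under a coalition payoff function.

    `payoff` maps a frozenset of agent ids to a scalar, e.g. how much a
    teammate's decision uncertainty drops once that coalition's actions
    are taken into account.
    """
    n = len(agents)
    values = {a: 0.0 for a in agents}
    for a in agents:
        others = [b for b in agents if b != a]
        for k in range(n):
            for coalition in itertools.combinations(others, k):
                s = frozenset(coalition)
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                values[a] += w * (payoff(s | {a}) - payoff(s))
    return values

# Toy usage: a teammate's policy over 4 actions, conditioned on which
# co-players' actions it observes (made-up numbers for illustration).
conditional_policies = {
    frozenset(): [0.25, 0.25, 0.25, 0.25],
    frozenset({"a1"}): [0.70, 0.10, 0.10, 0.10],
    frozenset({"a2"}): [0.40, 0.40, 0.10, 0.10],
    frozenset({"a1", "a2"}): [0.90, 0.05, 0.03, 0.02],
}
baseline = policy_entropy(conditional_policies[frozenset()])

def payoff(coalition):
    # Entropy reduction relative to the unconditioned teammate policy.
    return baseline - policy_entropy(conditional_policies[coalition])

print(shapley_values(["a1", "a2"], payoff))
```

Note that exact enumeration over coalitions grows exponentially with the number of agents, so practical implementations typically approximate the Shapley value by Monte Carlo sampling of coalitions.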
Related papers
- Counterfactual-based Agent Influence Ranker for Agentic AI Workflows [4.971684462894703]
An Agentic AI Workflow (AAW) assembles several LLM-based agents to work collaboratively towards a shared goal. There are no existing methods to assess the influence of each agent on the AAW's final output. We present Counterfactual-based Agent Influence Ranker (CAIR), the first method for assessing the influence level of each agent on the AAW's output.
arXiv Detail & Related papers (2025-10-29T15:17:31Z)
- Exploring Big Five Personality and AI Capability Effects in LLM-Simulated Negotiation Dialogues [16.07828032939124]
This paper presents an evaluation framework for agentic AI systems in mission-critical negotiation contexts. Using Sotopia as a simulation testbed, we present two experiments that systematically evaluated how personality traits and AI agent characteristics influence social negotiation outcomes.
arXiv Detail & Related papers (2025-06-19T00:14:56Z)
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process. We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Learning responsibility allocations for multi-agent interactions: A differentiable optimization approach with control barrier functions [12.074590482085831]
We seek to codify factors governing safe multi-agent interactions via the lens of responsibility.
We propose a data-driven modeling approach based on control barrier functions and differentiable optimization.
arXiv Detail & Related papers (2024-10-09T20:20:41Z)
- Situation-Dependent Causal Influence-Based Cooperative Multi-agent Reinforcement Learning [18.054709749075194]
We propose a novel MARL algorithm named Situation-Dependent Causal Influence-Based Cooperative Multi-agent Reinforcement Learning (SCIC).
Our approach detects inter-agent causal influences in specific situations using a criterion based on causal intervention and conditional mutual information.
The resulting update links coordinated exploration with intrinsic reward distribution, enhancing overall collaboration and performance.
arXiv Detail & Related papers (2023-12-15T05:09:32Z)
- DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning [84.22561239481901]
We propose a new approach that enables agents to learn whether their behaviors should be consistent with those of other agents.
We evaluate DCIR in multiple environments including Multi-agent Particle, Google Research Football and StarCraft II Micromanagement.
arXiv Detail & Related papers (2023-12-10T06:03:57Z)
- Agent-Specific Effects: A Causal Effect Propagation Analysis in Multi-Agent MDPs [13.524274041966539]
We introduce agent-specific effects (ASE), a novel causal quantity that measures the effect of an agent's action on the outcome that propagates through other agents.
We experimentally evaluate the utility of the counterfactual variant, cf-ASE, through a simulation-based testbed that includes a sepsis management environment.
arXiv Detail & Related papers (2023-10-17T15:12:56Z)
- Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization [63.554226552130054]
Generalization poses a significant challenge in Multi-agent Reinforcement Learning (MARL).
The extent to which an agent is influenced by unseen co-players depends on the agent's policy and the specific scenario.
We present the Level of Influence (LoI), a metric quantifying the interaction intensity among agents within a given scenario and environment.
arXiv Detail & Related papers (2023-10-11T06:09:26Z)
- Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL [107.58821842920393]
We quantify agents' behavior differences and relate them to policy performance via Role Diversity.
We find that the error bound in MARL can be decomposed into three parts that have a strong relation to role diversity.
The decomposed factors can significantly impact policy optimization along three popular directions.
arXiv Detail & Related papers (2022-06-01T04:58:52Z)
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse of an online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- Modeling the Interaction between Agents in Cooperative Multi-Agent Reinforcement Learning [2.9360071145551068]
We propose a novel cooperative MARL algorithm named interactive actor-critic (IAC).
IAC models the interaction of agents from perspectives of policy and value function.
We extend the value decomposition methods to continuous control tasks and evaluate IAC on benchmark tasks including classic control and multi-agent particle environments.
arXiv Detail & Related papers (2021-02-10T01:58:28Z)