Concept Learning for Interpretable Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2302.12232v1
- Date: Thu, 23 Feb 2023 18:53:09 GMT
- Title: Concept Learning for Interpretable Multi-Agent Reinforcement Learning
- Authors: Renos Zabounidis, Joseph Campbell, Simon Stepputtis, Dana Hughes,
Katia Sycara
- Abstract summary: We introduce a method for incorporating interpretable concepts from a domain expert into models trained through multi-agent reinforcement learning.
This allows an expert both to reason about the resulting concept policy models in terms of these high-level concepts at run time and to intervene and correct mispredictions to improve performance.
We show that this yields improved interpretability and training stability, with benefits to policy performance and sample efficiency in a simulated and real-world cooperative-competitive multi-agent game.
- Score: 5.179808182296037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent robotic systems are increasingly operating in real-world
environments in close proximity to humans, yet are largely controlled by policy
models with inscrutable deep neural network representations. We introduce a
method for incorporating interpretable concepts from a domain expert into
models trained through multi-agent reinforcement learning, by requiring the
model to first predict such concepts and then utilize them for decision making.
This allows an expert both to reason about the resulting concept policy models
in terms of these high-level concepts at run time and to intervene and
correct mispredictions to improve performance. We show that this yields
improved interpretability and training stability, with benefits to policy
performance and sample efficiency in a simulated and real-world
cooperative-competitive multi-agent game.
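The core architectural idea, forcing the policy to route every decision through predicted concepts, can be illustrated with a short sketch. This is a minimal illustration assuming a discrete-action policy; the module sizes, concept list, and intervention interface are our assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ConceptBottleneckPolicy(nn.Module):
    """Sketch of a concept-bottleneck policy: observations are first mapped
    to human-interpretable concept predictions, and the action distribution
    is computed only from those concepts."""

    def __init__(self, obs_dim: int, n_concepts: int, n_actions: int):
        super().__init__()
        # Observation -> concept predictions (e.g. "teammate visible",
        # "near goal"); trained against expert-provided concept labels.
        self.concept_net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_concepts))
        # Concepts -> action logits; the bottleneck forces every decision
        # to be expressible in terms of the named concepts.
        self.policy_head = nn.Sequential(
            nn.Linear(n_concepts, 64), nn.ReLU(), nn.Linear(64, n_actions))

    def forward(self, obs, intervention=None):
        concepts = torch.sigmoid(self.concept_net(obs))
        if intervention is not None:
            # Run-time expert intervention: overwrite mispredicted concepts
            # with corrected values, given as {concept_index: value}.
            concepts = concepts.clone()
            for idx, value in intervention.items():
                concepts[..., idx] = value
        return self.policy_head(concepts), concepts
```

Because the action logits depend on the observation only through `concepts`, an expert can inspect the predicted concepts at run time and, when one is wrong, pass a corrected value via `intervention`; the action is then recomputed from the corrected concepts.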
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
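As a rough illustration of why step-wise rewards help, compare a REINFORCE-style update driven by per-step rewards against one driven by a single terminal reward; the sketch below is a generic policy-gradient fragment, not StepAgent's actual objective.

```python
import torch

def reinforce_loss(log_probs: torch.Tensor, step_rewards: torch.Tensor,
                   gamma: float = 0.99) -> torch.Tensor:
    """log_probs:    (T,) log pi(a_t | s_t) for one episode.
    step_rewards: (T,) a reward per step, rather than one sparse reward
                  at the end of the episode."""
    T = step_rewards.shape[0]
    returns = torch.zeros(T)
    running = torch.tensor(0.0)
    for t in reversed(range(T)):
        running = step_rewards[t] + gamma * running
        returns[t] = running
    # Dense step-wise rewards give each action its own return signal,
    # sharpening credit assignment relative to one terminal reward.
    return -(log_probs * returns).sum()
```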
- Latent-Predictive Empowerment: Measuring Empowerment without a Simulator [56.53777237504011]
We present Latent-Predictive Empowerment (LPE), an algorithm that can compute empowerment in a more practical manner.
LPE learns large skillsets by maximizing an objective that is a principled replacement for the mutual information between skills and states.
arXiv Detail & Related papers (2024-10-15T00:41:18Z)
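Empowerment-style objectives replace the intractable mutual information between skills and reached states with a tractable surrogate. The sketch below shows one standard surrogate, a learned skill discriminator yielding a variational lower bound on I(Z; S); it illustrates the family of objectives the abstract refers to, not the LPE algorithm itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkillDiscriminator(nn.Module):
    """q(z | s): predicts which skill z produced the reached state s.
    E[log q(z|s) - log p(z)] lower-bounds the mutual information I(Z; S)."""

    def __init__(self, state_dim: int, n_skills: int):
        super().__init__()
        self.n_skills = n_skills
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, n_skills))

    def intrinsic_reward(self, state: torch.Tensor, skill_id: int) -> torch.Tensor:
        log_q = F.log_softmax(self.net(state), dim=-1)[..., skill_id]
        log_p = -torch.log(torch.tensor(float(self.n_skills)))  # uniform prior
        # Higher when skills reach states the discriminator can tell apart.
        return log_q - log_p
```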
- Multi-agent Off-policy Actor-Critic Reinforcement Learning for Partially Observable Environments [30.280532078714455]
This study proposes the use of a social learning method to estimate a global state within a multi-agent off-policy actor-critic algorithm for reinforcement learning.
We show that the difference between final outcomes, obtained when the global state is fully observed versus estimated through the social learning method, is $\varepsilon$-bounded when a sufficient number of social learning iterations is performed.
arXiv Detail & Related papers (2024-07-06T06:51:14Z)
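A minimal sketch of the social-learning idea: each agent repeatedly combines its estimate of the global state with its neighbors' estimates, so local views contract toward a common value, consistent with the $\varepsilon$-bound after enough iterations. The averaging rule and combination weights below are illustrative assumptions.

```python
import numpy as np

def social_learning_estimate(local_obs: np.ndarray, weights: np.ndarray,
                             n_iters: int = 10) -> np.ndarray:
    """local_obs: (n_agents, state_dim) noisy local views of the global state.
    weights:   (n_agents, n_agents) row-stochastic combination matrix,
               nonzero only between graph neighbors.
    Repeated neighborhood averaging drives each agent's estimate toward
    a consensus approximation of the global state."""
    estimates = local_obs.copy()
    for _ in range(n_iters):
        estimates = weights @ estimates
    return estimates

# Example: 3 fully connected agents with equal combination weights.
W = np.full((3, 3), 1.0 / 3.0)
obs = np.array([[1.0], [2.0], [3.0]])
print(social_learning_estimate(obs, W))  # every agent converges to ~2.0
```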
- Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment [69.33930972652594]
We propose a novel structural pruning approach to jointly learn the weights and structurally prune architectures of CNN models.
The core element of our method is a Reinforcement Learning (RL) agent whose actions determine the pruning ratios of the CNN model's layers.
We conduct the joint training and pruning by iteratively training the model's weights and the agent's policy.
arXiv Detail & Related papers (2024-03-28T15:22:29Z)
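The interface between the RL agent and the network can be pictured simply: the agent's action is a pruning ratio per layer, which some pruning criterion then applies. The magnitude-based criterion below is an assumed stand-in, not the paper's alignment mechanism.

```python
import torch
import torch.nn as nn

def apply_pruning_ratios(model: nn.Module, ratios) -> None:
    """Zero out the smallest-magnitude fraction of weights in each Conv/Linear
    layer; ratios[i] in [0, 1] would be chosen by the RL agent's action."""
    prunable = [m for m in model.modules()
                if isinstance(m, (nn.Conv2d, nn.Linear))]
    for layer, ratio in zip(prunable, ratios):
        w = layer.weight.data
        k = int(ratio * w.numel())
        if k == 0:
            continue
        threshold = w.abs().flatten().kthvalue(k).values
        w[w.abs() <= threshold] = 0.0  # mask the k smallest-magnitude weights
```

In the joint scheme the abstract describes, calls like this would alternate with ordinary weight-training steps, and the agent's policy would be updated from the pruned model's reward (e.g., accuracy versus sparsity).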
- Contrastive learning-based agent modeling for deep reinforcement learning [31.293496061727932]
Agent modeling is essential when designing adaptive policies for intelligent machine agents in multi-agent systems.
We devised a Contrastive Learning-based Agent Modeling (CLAM) method that relies only on the local observations from the ego agent during training and execution.
CLAM generates consistent, high-quality policy representations in real time from the beginning of each episode.
arXiv Detail & Related papers (2023-12-30T03:44:12Z)
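The contrastive component can be sketched with a standard InfoNCE loss: embeddings of two observation windows from the same episode (hence generated against the same other-agent policies) are pulled together, while windows from other episodes in the batch act as negatives. The batch layout is an assumption, not CLAM's exact design.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor_emb: torch.Tensor, positive_emb: torch.Tensor,
             temperature: float = 0.1) -> torch.Tensor:
    """anchor_emb, positive_emb: (B, D) embeddings of paired observation
    windows from the same episode; off-diagonal pairs act as negatives."""
    a = F.normalize(anchor_emb, dim=-1)
    p = F.normalize(positive_emb, dim=-1)
    logits = a @ p.t() / temperature       # (B, B) similarity matrix
    labels = torch.arange(a.size(0))       # matching pairs on the diagonal
    return F.cross_entropy(logits, labels)
```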
- Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning [15.12491397254381]
We propose an implicit model-based multi-agent reinforcement learning method based on value decomposition methods.
Under this method, agents can interact with the learned virtual environment and evaluate the current state value according to imagined future states.
arXiv Detail & Related papers (2022-04-20T12:16:27Z)
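The "imagination" step, evaluating the current state through rollouts in a learned virtual environment, can be written generically as below; the model interface, horizon, and bootstrapped terminal value are illustrative assumptions rather than this paper's exact formulation.

```python
import torch

@torch.no_grad()
def imagined_value(state, policy, dynamics_model, value_fn,
                   horizon: int = 5, gamma: float = 0.99):
    """Roll the policy forward inside a learned dynamics model,
    accumulating imagined rewards plus a bootstrapped terminal value."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        state, reward = dynamics_model(state, action)  # imagined transition
        total = total + discount * reward
        discount *= gamma
    return total + discount * value_fn(state)  # bootstrap at the horizon
```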
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copulas, statistical tools for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
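The separation the abstract describes, per-agent marginals plus a copula for the dependence structure, is exactly how a Gaussian copula factors a joint distribution. A minimal sketch under that (assumed) choice of copula:

```python
import numpy as np
from scipy import stats

def sample_gaussian_copula(marginals, corr, n_samples, rng=None):
    """marginals: list of frozen scipy distributions, one per agent
                  (each agent's local behavioral pattern).
    corr:      (n_agents, n_agents) correlation matrix (coordination).
    Correlated normals -> uniforms via the normal CDF -> each agent's
    marginal via its inverse CDF."""
    rng = rng or np.random.default_rng()
    z = rng.multivariate_normal(np.zeros(len(marginals)), corr, size=n_samples)
    u = stats.norm.cdf(z)  # uniform marginals, Gaussian dependence retained
    return np.column_stack([m.ppf(u[:, i]) for i, m in enumerate(marginals)])

# Two coordinated agents with different marginal action distributions:
actions = sample_gaussian_copula(
    [stats.norm(0.0, 1.0), stats.expon(scale=2.0)],
    np.array([[1.0, 0.8], [0.8, 1.0]]), n_samples=1000)
```

Learning the marginals and the copula separately means either side can be swapped out, e.g., reusing the same coordination structure with new individual behaviors.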
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for mean-field control (MFC).
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
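In mean-field formulations, each agent responds to an aggregate of the population rather than to every other agent individually. The sketch below shows the standard device of conditioning a value function on the empirical mean of the others' actions; it illustrates the mean-field setting generally, not this paper's model-based algorithm.

```python
import torch
import torch.nn as nn

class MeanFieldQ(nn.Module):
    """Q(s, a, mean-action): complexity is independent of the number of
    agents because only the empirical mean of others' actions enters."""

    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + 2 * action_dim, 64),
            nn.ReLU(), nn.Linear(64, 1))

    def forward(self, state, own_action, others_actions):
        mean_action = others_actions.mean(dim=1)  # (B, action_dim) aggregate
        return self.net(torch.cat([state, own_action, mean_action], dim=-1))
```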
- Deep Interactive Bayesian Reinforcement Learning via Meta-Learning [63.96201773395921]
The optimal adaptive behaviour under uncertainty over the other agents' strategies can be computed using the Interactive Bayesian Reinforcement Learning framework.
We propose to meta-learn approximate belief inference and Bayes-optimal behaviour for a given prior.
We show empirically that our approach outperforms existing methods that use a model-free approach, sample from the approximate posterior, maintain memory-free models of others, or do not fully utilise the known structure of the environment.
arXiv Detail & Related papers (2021-01-11T13:25:13Z)
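Meta-learned approximate belief inference is commonly realized as a recurrent encoder that compresses the interaction history into a belief embedding on which the policy conditions; the sketch below shows that generic pattern, with all module shapes as assumptions.

```python
import torch
import torch.nn as nn

class BeliefConditionedPolicy(nn.Module):
    """A GRU summarizes past observation-action pairs into an approximate
    belief over the other agents' strategies; the policy conditions on it."""

    def __init__(self, obs_dim: int, act_dim: int, belief_dim: int = 32):
        super().__init__()
        self.belief_rnn = nn.GRU(obs_dim + act_dim, belief_dim, batch_first=True)
        self.policy = nn.Linear(obs_dim + belief_dim, act_dim)

    def forward(self, history, current_obs):
        # history: (B, T, obs_dim + act_dim) interaction so far
        _, h = self.belief_rnn(history)   # h: (1, B, belief_dim)
        belief = h.squeeze(0)             # approximate posterior embedding
        return self.policy(torch.cat([current_obs, belief], dim=-1))
```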
- Improving Robot Dual-System Motor Learning with Intrinsically Motivated Meta-Control and Latent-Space Experience Imagination [17.356402088852423]
We present a novel dual-system motor learning approach where a meta-controller arbitrates online between model-based and model-free decisions.
We evaluate our approach against baseline and state-of-the-art methods on learning vision-based robotic grasping in simulation and real world.
arXiv Detail & Related papers (2020-04-19T12:14:46Z)
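The arbitration step can be pictured as a gate: when the learned model is trustworthy for the current state, plan with it; otherwise fall back to the model-free habit. The uncertainty measure and threshold below are illustrative assumptions, not the paper's intrinsically motivated meta-controller.

```python
def act(state, model_uncertainty, threshold, model_based_plan, model_free_policy):
    """Dual-system control sketch: arbitrate online between a model-based
    planner and a model-free (habitual) policy."""
    # model_uncertainty(state) could be, e.g., ensemble disagreement of the
    # learned dynamics model (an assumption for illustration).
    if model_uncertainty(state) < threshold:
        return model_based_plan(state)   # deliberate: plan in the model
    return model_free_policy(state)      # habitual: fast reactive policy
```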
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.