Emergent Social Learning via Multi-agent Reinforcement Learning
- URL: http://arxiv.org/abs/2010.00581v3
- Date: Tue, 22 Jun 2021 21:18:59 GMT
- Title: Emergent Social Learning via Multi-agent Reinforcement Learning
- Authors: Kamal Ndousse, Douglas Eck, Sergey Levine, Natasha Jaques
- Abstract summary: Social learning is a key component of human and animal intelligence.
This paper investigates whether independent reinforcement learning agents can learn to use social learning to improve their performance.
- Score: 91.57176641192771
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Social learning is a key component of human and animal intelligence. By
taking cues from the behavior of experts in their environment, social learners
can acquire sophisticated behavior and rapidly adapt to new circumstances. This
paper investigates whether independent reinforcement learning (RL) agents in a
multi-agent environment can learn to use social learning to improve their
performance. We find that in most circumstances, vanilla model-free RL agents
do not use social learning. We analyze the reasons for this deficiency, and
show that by imposing constraints on the training environment and introducing a
model-based auxiliary loss we are able to obtain generalized social learning
policies which enable agents to: i) discover complex skills that are not
learned from single-agent training, and ii) adapt online to novel environments
by taking cues from experts present in the new environment. In contrast, agents
trained with model-free RL or imitation learning generalize poorly and do not
succeed in the transfer tasks. By mixing multi-agent and solo training, we can
obtain agents that use social learning to gain skills that they can deploy when
alone, even outperforming agents trained alone from the start.
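The key ingredient the abstract names is the model-based auxiliary loss. As a rough illustration of that idea, here is a minimal PyTorch sketch in which an auxiliary head predicts the next observation alongside the policy; the network shapes and head names are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SocialLearner(nn.Module):
    """Policy network with an auxiliary world-model head that predicts the
    next observation. A sketch of a model-based auxiliary loss, not the
    paper's exact architecture."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)          # trained with RL
        self.aux_head = nn.Linear(hidden + n_actions, obs_dim)   # trained to predict

    def forward(self, obs, action_onehot):
        h = self.encoder(obs)
        logits = self.policy_head(h)
        pred_next_obs = self.aux_head(torch.cat([h, action_onehot], dim=-1))
        return logits, pred_next_obs

def auxiliary_loss(pred_next_obs, next_obs):
    # Added to the usual policy loss with some weighting coefficient; it
    # pushes the encoder to model environment (and other-agent) dynamics.
    return nn.functional.mse_loss(pred_next_obs, next_obs)
```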
Related papers
- SocialGFs: Learning Social Gradient Fields for Multi-Agent Reinforcement Learning [58.84311336011451]
We propose a novel gradient-based state representation for multi-agent reinforcement learning.
We employ denoising score matching to learn the social gradient fields (SocialGFs) from offline samples.
In practice, we integrate SocialGFs into the widely used multi-agent reinforcement learning algorithms, e.g., MAPPO.
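For readers unfamiliar with denoising score matching, a minimal sketch of that objective under Gaussian noising follows; the noise scale and network are illustrative, and the MAPPO integration is only indicated in a comment.

```python
import torch
import torch.nn as nn

def dsm_loss(score_net: nn.Module, states: torch.Tensor, sigma: float = 0.1):
    """Denoising score matching: perturb offline samples with Gaussian noise
    and regress the network onto the score of the perturbation kernel,
    (x - x_noisy) / sigma**2 = -noise / sigma**2."""
    noise = torch.randn_like(states) * sigma
    pred = score_net(states + noise)   # predicted gradient field at noisy states
    target = -noise / sigma**2
    return ((pred - target) ** 2).sum(dim=-1).mean()

# Illustrative usage: the learned score_net(s) could be appended to each
# agent's observation as a "social gradient" feature before running MAPPO.
score_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 4))
loss = dsm_loss(score_net, torch.randn(32, 4))
```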
arXiv Detail & Related papers (2024-05-03T04:12:19Z)
- SOTOPIA-$\pi$: Interactive Learning of Socially Intelligent Language Agents [73.35393511272791]
We propose an interactive learning method, SOTOPIA-$\pi$, that improves the social intelligence of language agents.
This method leverages behavior cloning and self-reinforcement training on filtered social interaction data according to large language model (LLM) ratings.
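A minimal sketch of the filter-then-train pattern described above; `rate_episode` and `fine_tune` are hypothetical placeholders for the LLM judge and the behavior-cloning update, not SOTOPIA-$\pi$'s actual API.

```python
from typing import Callable, Dict, List

def filter_and_train(
    episodes: List[Dict],
    rate_episode: Callable[[Dict], float],    # hypothetical LLM judge
    fine_tune: Callable[[List[Dict]], None],  # hypothetical BC/self-RL update
    threshold: float = 7.0,
) -> None:
    """Keep only the interactions the LLM judge rates highly, then fine-tune
    on them -- the filter-then-train pattern from the abstract, not the
    paper's actual pipeline."""
    kept = [ep for ep in episodes if rate_episode(ep) >= threshold]
    if kept:
        fine_tune(kept)
```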
arXiv Detail & Related papers (2024-03-13T17:17:48Z)
- Empowering Large Language Model Agents through Action Learning [85.39581419680755]
Large Language Model (LLM) agents have recently garnered increasing interest, yet they are limited in their ability to learn from trial and error.
We argue that the capacity to learn new actions from experience is fundamental to the advancement of learning in LLM agents.
We introduce LearnAct, a framework with an iterative learning strategy that creates and improves actions in the form of Python functions.
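Since the actions are plain Python functions that get created and then iteratively refined, a small illustrative registry along those lines might look as follows; the class and the sample action are assumptions, not the paper's code.

```python
import inspect
from typing import Callable, Dict

class ActionLibrary:
    """Stores learnable actions as named Python functions, so a failing
    action can be replaced by an improved implementation between episodes."""

    def __init__(self):
        self._actions: Dict[str, Callable] = {}

    def register(self, fn: Callable) -> None:
        self._actions[fn.__name__] = fn

    def source(self, name: str) -> str:
        # The source string is what an LLM would be asked to critique/rewrite.
        return inspect.getsource(self._actions[name])

    def call(self, name: str, *args, **kwargs):
        return self._actions[name](*args, **kwargs)

library = ActionLibrary()

def pick_up(item: str) -> str:
    """Illustrative first-draft action the agent may later refine."""
    return f"picked up {item}"

library.register(pick_up)
```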
arXiv Detail & Related papers (2024-02-24T13:13:04Z)
- Social learning spontaneously emerges by searching optimal heuristics with deep reinforcement learning [0.0]
We employ a deep reinforcement learning model to optimize the social learning strategies of agents in a cooperative game in a multi-dimensional landscape.
We find that the agent spontaneously learns various concepts of social learning, such as copying, focusing on frequent and well-performing neighbors, self-comparison, and the importance of balancing individual and social learning.
We demonstrate the superior performance of the reinforcement learning agent in various environments, including temporally changing environments and real social networks.
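As a toy illustration of the kind of heuristic described, the following sketch copies a frequent, well-performing neighbor when one outperforms the agent and otherwise explores individually; the tie-breaking rule and the `explore` callback are illustrative assumptions.

```python
import random
from collections import Counter

def social_learning_step(my_solution, my_payoff, neighbors, explore):
    """neighbors: list of (solution, payoff) pairs observed this round.
    Copy a well-performing, frequent neighbor if any beats us (social
    learning); otherwise fall back to individual exploration."""
    better = [(s, p) for s, p in neighbors if p > my_payoff]
    if better:
        counts = Counter(s for s, _ in better)
        # Prefer solutions that are both frequent and high-payoff.
        return max(better, key=lambda sp: (counts[sp[0]], sp[1]))[0]
    return explore(my_solution)  # individual learning, e.g., a random tweak

new_solution = social_learning_step(
    my_solution=(0, 1), my_payoff=0.3,
    neighbors=[((1, 1), 0.8), ((1, 1), 0.7), ((0, 0), 0.9)],
    explore=lambda s: tuple(x ^ random.randint(0, 1) for x in s),
)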
arXiv Detail & Related papers (2022-04-26T15:10:27Z)
- Help Me Explore: Minimal Social Interventions for Graph-Based Autotelic Agents [7.644107117422287]
This paper argues that individual and socially guided exploration could be coupled within the learning of autotelic agents to foster their skill acquisition.
We make two contributions: 1) a novel social interaction protocol called Help Me Explore (HME), where autotelic agents can benefit from both individual and socially guided exploration; and 2) GANGSTR, a graph-based autotelic agent.
We show that when learning within HME, GANGSTR overcomes its individual learning limits by mastering the most complex configurations.
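A hedged sketch of the interaction pattern HME suggests: mostly autonomous goal sampling, occasionally deferring to a social partner's suggestion. The mixing probability, goal encoding, and partner frontier below are illustrative assumptions, not the protocol's specification.

```python
import random

def help_me_explore(agent_goals, mastered, partner_frontier, p_social=0.2):
    """With probability p_social, adopt a goal suggested by the social
    partner (one the agent has not yet mastered); otherwise sample an
    autotelic goal of the agent's own. Purely illustrative of HME's mix
    of individual and socially guided exploration."""
    if random.random() < p_social:
        candidates = [g for g in partner_frontier if g not in mastered]
        if candidates:
            return random.choice(candidates), "social"
    return random.choice(agent_goals), "individual"

goal, source = help_me_explore(
    agent_goals=["stack_2", "pyramid_3"],
    mastered={"stack_2"},
    partner_frontier=["stack_3", "stack_2"],
)
```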
arXiv Detail & Related papers (2022-02-10T16:34:28Z)
- Modeling Bounded Rationality in Multi-Agent Simulations Using Rationally Inattentive Reinforcement Learning [85.86440477005523]
We study more human-like RL agents that incorporate an established model of human irrationality, the Rational Inattention (RI) model.
Rationally Inattentive RL (RIRL) models the cost of cognitive information processing using mutual information.
We show that using RIRL yields a rich spectrum of new equilibrium behaviors that differ from those found under rational assumptions.
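A short sketch of the mutual-information quantity used to price cognitive processing, computed for a discrete joint distribution over true states and perceived signals; the example channel is illustrative.

```python
import numpy as np

def mutual_information(joint: np.ndarray) -> float:
    """I(S;X) in nats for a joint distribution p(s, x); in RI models this
    quantifies (and prices) how much information the agent's perception
    carries about the true state."""
    ps = joint.sum(axis=1, keepdims=True)   # marginal over states
    px = joint.sum(axis=0, keepdims=True)   # marginal over signals
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (ps @ px)[nz])).sum())

# Illustrative: a noisy perception channel over two states. The resulting
# ~0.19 nats would be subtracted from reward, scaled by an attention cost.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])
cost = mutual_information(joint)
```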
arXiv Detail & Related papers (2022-01-18T20:54:00Z)
- Autonomous Reinforcement Learning: Formalism and Benchmarking [106.25788536376007]
Real-world embodied learning, such as that performed by humans and animals, is situated in a continual, non-episodic world.
Common benchmark tasks in RL are episodic, with the environment resetting between trials to provide the agent with multiple attempts.
This discrepancy presents a major challenge when attempting to take RL algorithms developed for episodic simulated environments and run them on real-world platforms.
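The episodic/non-episodic gap is easiest to see side by side; the sketch below assumes a generic Gym-style `env`/`agent` interface and is not tied to the paper's benchmark.

```python
def episodic_training(env, agent, n_episodes):
    # Standard benchmark setup: the environment resets between trials.
    for _ in range(n_episodes):
        obs = env.reset()
        done = False
        while not done:
            obs, reward, done, _ = env.step(agent.act(obs))
            agent.update(obs, reward, done)

def non_episodic_training(env, agent, n_steps):
    # Autonomous/real-world setting: one long stream of experience with a
    # single reset at the very start -- the agent must recover on its own.
    obs = env.reset()
    for _ in range(n_steps):
        obs, reward, done, _ = env.step(agent.act(obs))
        agent.update(obs, reward, done)  # 'done' no longer triggers a reset
```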
arXiv Detail & Related papers (2021-12-17T16:28:06Z)
- Emergent Reciprocity and Team Formation from Randomized Uncertain Social Preferences [8.10414043447031]
We show evidence of emergent direct reciprocity, indirect reciprocity and reputation, and team formation when training agents with randomized uncertain social preferences (RUSP).
RUSP is generic and scalable; it can be applied to any multi-agent environment without changing the original underlying game dynamics or objectives.
In particular, we show that with RUSP these behaviors can emerge and lead to higher social welfare equilibria both in classic abstract social dilemmas, like the Iterated Prisoner's Dilemma, and in more complex intertemporal environments.
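The abstract does not spell out the mechanism, but one plausible instantiation of randomized uncertain social preferences is a random reward-sharing matrix that each agent observes only noisily; the following numpy sketch rests on that assumption.

```python
import numpy as np

def rusp_rewards(base_rewards: np.ndarray, rng: np.random.Generator,
                 noise_scale: float = 0.5):
    """Mix each agent's reward with the others' via a random row-stochastic
    sharing matrix T, and give each agent only a noisy view of T -- one
    plausible reading of 'randomized uncertain social preferences'."""
    n = len(base_rewards)
    T = rng.random((n, n))
    T /= T.sum(axis=1, keepdims=True)                  # rows sum to 1
    shared = T @ base_rewards                          # r_i' = sum_j T_ij * r_j
    noisy_T = T + rng.normal(0, noise_scale, T.shape)  # uncertain observation of T
    return shared, noisy_T

rewards, observed_prefs = rusp_rewards(np.array([1.0, 0.0, -0.5]),
                                       np.random.default_rng(0))
```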
arXiv Detail & Related papers (2020-11-10T20:06:19Z)