SocialJax: An Evaluation Suite for Multi-agent Reinforcement Learning in Sequential Social Dilemmas
- URL: http://arxiv.org/abs/2503.14576v2
- Date: Mon, 19 May 2025 15:28:35 GMT
- Title: SocialJax: An Evaluation Suite for Multi-agent Reinforcement Learning in Sequential Social Dilemmas
- Authors: Zihao Guo, Shuqing Shi, Richard Willis, Tristan Tomilin, Joel Z. Leibo, Yali Du,
- Abstract summary: Sequential social dilemmas pose a significant challenge in the field of multi-agent reinforcement learning.<n>SocialJax is a suite of sequential social dilemma environments and algorithms implemented in JAX.<n>JAX is a high-performance numerical computing library for Python.
- Score: 3.897833712166508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sequential social dilemmas pose a significant challenge in the field of multi-agent reinforcement learning (MARL), requiring environments that accurately reflect the tension between individual and collective interests. Previous benchmarks and environments, such as Melting Pot, provide an evaluation protocol that measures generalization to new social partners in various test scenarios. However, running reinforcement learning algorithms in traditional environments requires substantial computational resources. In this paper, we introduce SocialJax, a suite of sequential social dilemma environments and algorithms implemented in JAX. JAX is a high-performance numerical computing library for Python that enables significant improvements in operational efficiency. Our experiments demonstrate that the SocialJax training pipeline achieves at least 50\texttimes{} speed-up in real-time performance compared to Melting Pot RLlib baselines. Additionally, we validate the effectiveness of baseline algorithms within SocialJax environments. Finally, we use Schelling diagrams to verify the social dilemma properties of these environments, ensuring that they accurately capture the dynamics of social dilemmas.
Related papers
- SIV-Bench: A Video Benchmark for Social Interaction Understanding and Reasoning [53.16179295245888]
We introduce SIV-Bench, a novel video benchmark for evaluating the capabilities of Multimodal Large Language Models (MLLMs) across Social Scene Understanding (SSU), Social State Reasoning (SSR), and Social Dynamics Prediction (SDP)<n>SIV-Bench features 2,792 video clips and 8,792 meticulously generated question-answer pairs derived from a human-LLM collaborative pipeline.<n>It also includes a dedicated setup for analyzing the impact of different textual cues-original on-screen text, added dialogue, or no text.
arXiv Detail & Related papers (2025-06-05T05:51:35Z) - SCRAG: Social Computing-Based Retrieval Augmented Generation for Community Response Forecasting in Social Media Environments [8.743208265682014]
SCRAG is a prediction framework inspired by social computing.
It forecast community responses to real or hypothetical social media posts.
It can be used by public relations specialists to craft messaging in ways that avoid unintended misinterpretations.
arXiv Detail & Related papers (2025-04-18T15:02:31Z) - How Social is It? A Benchmark for LLMs' Capabilities in Multi-user Multi-turn Social Agent Tasks [6.487500253901779]
Large language models (LLMs) play roles in multi-user, multi-turn social agent tasks.<n>We propose a novel benchmark, How Social Is It (we call it HSII below), designed to assess LLM's social capabilities.<n>HSII comprises four stages: format parsing, target selection, target switching conversation, and stable conversation, which collectively evaluate the communication and task completion capabilities of LLMs.
arXiv Detail & Related papers (2025-04-04T08:59:01Z) - SocialED: A Python Library for Social Event Detection [53.928241775629566]
SocialED is a comprehensive, open-source Python library designed to support social event detection (SED) tasks.<n>It provides a unified API with detailed documentation, offering researchers and practitioners a complete solution for event detection in social media.<n>SocialED supports a wide range of preprocessing techniques, such as graph construction and tokenization, and includes standardized interfaces for training models and making predictions.
arXiv Detail & Related papers (2024-12-18T03:37:47Z) - Engagement-Driven Content Generation with Large Language Models [8.049552839071918]
Large Language Models (LLMs) demonstrate significant persuasive capabilities in one-on-one interactions.<n>Their influence within social networks, where interconnected users and complex opinion dynamics pose unique challenges, remains underexplored.<n>This paper addresses the research question: emphCan LLMs generate meaningful content that maximizes user engagement on social networks?
arXiv Detail & Related papers (2024-11-20T10:40:08Z) - GenSim: A General Social Simulation Platform with Large Language Model based Agents [111.00666003559324]
We propose a novel large language model (LLMs)-based simulation platform called textitGenSim.
Our platform supports one hundred thousand agents to better simulate large-scale populations in real-world contexts.
To our knowledge, GenSim represents an initial step toward a general, large-scale, and correctable social simulation platform.
arXiv Detail & Related papers (2024-10-06T05:02:23Z) - SocialGFs: Learning Social Gradient Fields for Multi-Agent Reinforcement Learning [58.84311336011451]
We propose a novel gradient-based state representation for multi-agent reinforcement learning.
We employ denoising score matching to learn the social gradient fields (SocialGFs) from offline samples.
In practice, we integrate SocialGFs into the widely used multi-agent reinforcement learning algorithms, e.g., MAPPO.
arXiv Detail & Related papers (2024-05-03T04:12:19Z) - SOCIALITE-LLAMA: An Instruction-Tuned Model for Social Scientific Tasks [13.152622137022881]
We introduce Socialite-Llama -- an open-source, instruction-tuned Llama.
On a suite of 20 social science tasks, Socialite-Llama improves upon the performance of Llama as well as matches or improves upon the performance of a state-of-the-art, multi-task finetuned model.
arXiv Detail & Related papers (2024-02-03T01:33:16Z) - SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents [107.4138224020773]
We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and humans.
In our environment, agents role-play and interact under a wide variety of scenarios; they coordinate, collaborate, exchange, and compete with each other to achieve complex social goals.
We find that GPT-4 achieves a significantly lower goal completion rate than humans and struggles to exhibit social commonsense reasoning and strategic communication skills.
arXiv Detail & Related papers (2023-10-18T02:27:01Z) - Training Socially Aligned Language Models on Simulated Social
Interactions [99.39979111807388]
Social alignment in AI systems aims to ensure that these models behave according to established societal values.
Current language models (LMs) are trained to rigidly replicate their training corpus in isolation.
This work presents a novel training paradigm that permits LMs to learn from simulated social interactions.
arXiv Detail & Related papers (2023-05-26T14:17:36Z) - SocNavGym: A Reinforcement Learning Gym for Social Navigation [0.0]
SocNavGym is an advanced simulation environment for social navigation.
It can generate different types of social navigation scenarios.
It can also be configured to work with different hand-crafted and data-driven social reward signals.
arXiv Detail & Related papers (2023-04-27T11:29:02Z) - JaxPruner: A concise library for sparsity research [46.153423603424]
JaxPruner is an open-source library for sparse neural network research.
It implements popular pruning and sparse training algorithms with minimal memory and latency overhead.
arXiv Detail & Related papers (2023-04-27T10:45:30Z) - marl-jax: Multi-Agent Reinforcement Leaning Framework [7.064383217512461]
We present marl-jax, a multi-agent reinforcement learning software package for training and evaluating social generalization of the agents.
The package is designed for training a population of agents in multi-agent environments and evaluating their ability to generalize to diverse background agents.
arXiv Detail & Related papers (2023-03-24T05:05:01Z) - Agent-based Simulation for Online Mental Health Matching [16.968538290877444]
We collaborate with one of the world's largest online mental health communities to develop an agent-based simulation framework.
Our findings include that usage of the deferred-acceptance algorithm can significantly better the experiences of support-seekers in one-on-one chats.
arXiv Detail & Related papers (2023-03-20T17:04:59Z) - Efficient Model-based Multi-agent Reinforcement Learning via Optimistic
Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z) - Learning from Heterogeneous Data Based on Social Interactions over
Graphs [58.34060409467834]
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions.
We show that the.
strategy enables the agents to learn consistently under this highly-heterogeneous setting.
We show that the.
strategy enables the agents to learn consistently under this highly-heterogeneous setting.
arXiv Detail & Related papers (2021-12-17T12:47:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.