Related papers: One Model, All Roles: Multi-Turn, Multi-Agent Self-Play Reinforcement Learning for Conversational Social Intelligence

One Model, All Roles: Multi-Turn, Multi-Agent Self-Play Reinforcement Learning for Conversational Social Intelligence

URL: http://arxiv.org/abs/2602.03109v1
Date: Tue, 03 Feb 2026 05:09:49 GMT
Title: One Model, All Roles: Multi-Turn, Multi-Agent Self-Play Reinforcement Learning for Conversational Social Intelligence
Authors: Bowen Jiang, Taiwei Shi, Ryo Kamoi, Yuan Yuan, Camillo J. Taylor, Longqi Yang, Pei Zhou, Sihao Chen,
Abstract summary: This paper introduces OMAR: One Model, All Roles, a reinforcement learning framework for AI.<n>OMAR allows a single model to role-play all participants in a conversation simultaneously, learning to achieve long-term goals and complex social norms.<n>We show that trained models develop fine-grained, emergent social intelligence, such as empathy, persuasion, and compromise seeking.
Score: 25.89075578734277
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper introduces OMAR: One Model, All Roles, a reinforcement learning framework that enables AI to develop social intelligence through multi-turn, multi-agent conversational self-play. Unlike traditional paradigms that rely on static, single-turn optimizations, OMAR allows a single model to role-play all participants in a conversation simultaneously, learning to achieve long-term goals and complex social norms directly from dynamic social interaction. To ensure training stability across long dialogues, we implement a hierarchical advantage estimation that calculates turn-level and token-level advantages. Evaluations in the SOTOPIA social environment and Werewolf strategy games show that our trained models develop fine-grained, emergent social intelligence, such as empathy, persuasion, and compromise seeking, demonstrating the effectiveness of learning collaboration even under competitive scenarios. While we identify practical challenges like reward hacking, our results show that rich social intelligence can emerge without human supervision. We hope this work incentivizes further research on AI social intelligence in group conversations.

Related papers

Sotopia-RL: Reward Design for Social Intelligence [52.59432715228559]
Sotopia-RL is a novel framework that refines coarse episode-level feedback into utterance-level, multi-dimensional rewards.<n>Experiments in Sotopia, an open-ended social learning environment, demonstrate that Sotopia-RL achieves state-of-the-art social goal completion scores.
arXiv Detail & Related papers (2025-08-05T20:43:42Z)
LIFELONG SOTOPIA: Evaluating Social Intelligence of Language Agents Over Lifelong Social Interactions [4.819825467587802]
We present a novel benchmark, LIFELONG-SOTOPIA, to perform a comprehensive evaluation of language agents.<n>We find that goal achievement and believability of all of the language models that we test decline through the whole interaction.<n>These findings show that we can use LIFELONG-SOTOPIA to evaluate the social intelligence of language agents over lifelong social interactions.
arXiv Detail & Related papers (2025-06-14T23:57:54Z)
Building Knowledge from Interactions: An LLM-Based Architecture for Adaptive Tutoring and Social Reasoning [42.09560737219404]
Large Language Models show promise in human-like communication, but their standalone use is hindered by memory constraints and contextual incoherence.<n>This work presents a multimodal, cognitively inspired framework that enhances LLM-based autonomous decision-making in social and task-oriented Human-Robot Interaction.<n>To further enhance autonomy and personalization, we introduce a memory system for selecting, storing and retrieving experiences.
arXiv Detail & Related papers (2025-04-02T10:45:41Z)
Towards Anthropomorphic Conversational AI Part I: A Practical Framework [49.62013440962072]
We introduce a multi- module framework designed to replicate the key aspects of human intelligence involved in conversations.<n>In the second stage of our approach, these conversational data, after filtering and labeling, can serve as training and testing data for reinforcement learning.
arXiv Detail & Related papers (2025-02-28T03:18:39Z)
Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions [67.60397632819202]
Building socially-intelligent AI agents (Social-AI) is a multidisciplinary, multimodal research goal. We identify a set of underlying technical challenges and open questions for researchers across computing communities to advance Social-AI.
arXiv Detail & Related papers (2024-04-17T02:57:42Z)
SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents [73.35393511272791]
We propose an interactive learning method, SOTOPIA-$pi$, improving the social intelligence of language agents. This method leverages behavior cloning and self-reinforcement training on filtered social interaction data according to large language model (LLM) ratings.
arXiv Detail & Related papers (2024-03-13T17:17:48Z)
SoMeLVLM: A Large Vision Language Model for Social Media Processing [78.47310657638567]
We introduce a Large Vision Language Model for Social Media Processing (SoMeLVLM) SoMeLVLM is a cognitive framework equipped with five key capabilities including knowledge & comprehension, application, analysis, evaluation, and creation. Our experiments demonstrate that SoMeLVLM achieves state-of-the-art performance in multiple social media tasks.
arXiv Detail & Related papers (2024-02-20T14:02:45Z)
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents [107.4138224020773]
We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and humans. In our environment, agents role-play and interact under a wide variety of scenarios; they coordinate, collaborate, exchange, and compete with each other to achieve complex social goals. We find that GPT-4 achieves a significantly lower goal completion rate than humans and struggles to exhibit social commonsense reasoning and strategic communication skills.
arXiv Detail & Related papers (2023-10-18T02:27:01Z)
SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents [23.719833581321033]
Building embodied autonomous agents capable of participating in social interactions with humans is one of the main challenges in AI. We argue that aiming towards human-level AI requires a broader set of key social skills. We present SocialAI, a benchmark to assess the acquisition of social skills of DRL agents.
arXiv Detail & Related papers (2021-07-02T10:39:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.