Related papers: Training Socially Aligned Language Models on Simulated Social Interactions

URL: http://arxiv.org/abs/2305.16960v3
Date: Sat, 28 Oct 2023 09:02:39 GMT
Title: Training Socially Aligned Language Models on Simulated Social Interactions
Authors: Ruibo Liu, Ruixin Yang, Chenyan Jia, Ge Zhang, Denny Zhou, Andrew M. Dai, Diyi Yang, Soroush Vosoughi
Abstract summary: Social alignment in AI systems aims to ensure that these models behave according to established societal values. Current language models (LMs) are trained to rigidly replicate their training corpus in isolation. This work presents a novel training paradigm that permits LMs to learn from simulated social interactions.
Score: 99.39979111807388
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Social alignment in AI systems aims to ensure that these models behave according to established societal values. However, unlike humans, who derive consensus on value judgments through social interaction, current language models (LMs) are trained to rigidly replicate their training corpus in isolation, leading to subpar generalization in unfamiliar scenarios and vulnerability to adversarial attacks. This work presents a novel training paradigm that permits LMs to learn from simulated social interactions. In comparison to existing methodologies, our approach is considerably more scalable and efficient, demonstrating superior performance in alignment benchmarks and human evaluations. This paradigm shift in the training of LMs brings us a step closer to developing AI systems that can robustly and accurately reflect societal norms and values.

Related papers

SocialEval: Evaluating Social Intelligence of Large Language Models [70.90981021629021]
Social Intelligence (SI) equips humans with interpersonal abilities to behave wisely in navigating social interactions to achieve social goals. This presents an operational evaluation paradigm: outcome-oriented goal achievement evaluation and process-oriented interpersonal ability evaluation. We propose SocialEval, a script-based bilingual SI benchmark, integrating outcome- and process-oriented evaluation by manually crafting narrative scripts.
arXiv Detail & Related papers (2025-06-01T08:36:51Z)
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users [70.02370111025617]
We introduce SocioVerse, an agent-driven world model for social simulation. Our framework features four powerful alignment components and a user pool of 10 million real individuals. Results demonstrate that SocioVerse can reflect large-scale population dynamics while ensuring diversity, credibility, and representativeness.
arXiv Detail & Related papers (2025-04-14T12:12:52Z)
Large Language Model Driven Agents for Simulating Echo Chamber Formation [5.6488384323017]
The rise of echo chambers on social media platforms has heightened concerns about polarization and the reinforcement of existing beliefs. Traditional approaches for simulating echo chamber formation have often relied on predefined rules and numerical simulations. We present a novel framework that leverages large language models (LLMs) as generative agents to simulate echo chamber dynamics.
arXiv Detail & Related papers (2025-02-25T12:05:11Z)
Word Synchronization Challenge: A Benchmark for Word Association Responses for LLMs [4.352318127577628]
This paper introduces the Word Synchronization Challenge, a novel benchmark to evaluate large language models (LLMs) in Human-Computer Interaction (HCI). This benchmark uses a dynamic game-like framework to test LLMs' ability to mimic human cognitive processes through word associations.
arXiv Detail & Related papers (2025-02-12T11:30:28Z)
Fusing Dynamics Equation: A Social Opinions Prediction Algorithm with LLM-based Agents [6.1923703280119105]
This paper proposes an innovative simulation method for the dynamics of social media user opinions. The FDE-LLM algorithm incorporates opinion dynamics and an epidemic model. It categorizes users into opinion leaders and followers.
arXiv Detail & Related papers (2024-09-13T11:02:28Z)
PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development. We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents [73.35393511272791]
We propose an interactive learning method, SOTOPIA-$π$, improving the social intelligence of language agents. This method leverages behavior cloning and self-reinforcement training on filtered social interaction data according to large language model (LLM) ratings.
arXiv Detail & Related papers (2024-03-13T17:17:48Z)
Shall We Team Up: Exploring Spontaneous Cooperation of Competing LLM Agents [18.961470450132637]
This paper emphasizes the importance of spontaneous phenomena, wherein agents deeply engage in contexts and make adaptive decisions without explicit directions. We explored spontaneous cooperation across three competitive scenarios and successfully simulated the gradual emergence of cooperation.
arXiv Detail & Related papers (2024-02-19T18:00:53Z)
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents [107.4138224020773]
We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and humans. In our environment, agents role-play and interact under a wide variety of scenarios; they coordinate, collaborate, exchange, and compete with each other to achieve complex social goals. We find that GPT-4 achieves a significantly lower goal completion rate than humans and struggles to exhibit social commonsense reasoning and strategic communication skills.
arXiv Detail & Related papers (2023-10-18T02:27:01Z)
Survey of Social Bias in Vision-Language Models [65.44579542312489]
This survey aims to provide researchers with a high-level insight into the similarities and differences of social bias studies in pre-trained models across NLP, CV, and VL. The findings and recommendations presented here can benefit the ML community, fostering the development of fairer and non-biased AI models.
arXiv Detail & Related papers (2023-09-24T15:34:56Z)
Rethinking Model Evaluation as Narrowing the Socio-Technical Gap [34.08410116336628]
We argue that model evaluation practices must take on a critical task to cope with the challenges and responsibilities brought by the homogenization of many applications onto a single general-purpose model. We urge the community to develop evaluation methods based on real-world socio-requirements.
arXiv Detail & Related papers (2023-06-01T00:01:43Z)
Heterogeneous Value Alignment Evaluation for Large Language Models [91.96728871418]
The rise of Large Language Models (LLMs) has made it crucial to align their values with those of humans. We propose a Heterogeneous Value Alignment Evaluation (HVAE) system to assess the success of aligning LLMs with heterogeneous values.
arXiv Detail & Related papers (2023-05-26T02:34:20Z)
Social Processes: Self-Supervised Forecasting of Nonverbal Cues in Social Conversations [22.302509912465077]
We take the first step in the direction of a bottom-up self-supervised approach in the domain of social human interactions. We formulate the task of Social Cue Forecasting to leverage the larger amount of unlabeled low-level behavior cues. We propose the Social Process (SP) models--socially aware sequence-to-sequence (Seq2Seq) models within the Neural Process (NP) family.
arXiv Detail & Related papers (2021-07-28T18:01:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.