Investigating social alignment via mirroring in a system of interacting language models
- URL: http://arxiv.org/abs/2412.06834v2
- Date: Sat, 15 Feb 2025 23:16:43 GMT
- Title: Investigating social alignment via mirroring in a system of interacting language models
- Authors: Harvey McGuinness, Tianyu Wang, Carey E. Priebe, Hayden Helm
- Abstract summary: We study the effect of mirroring on alignment in multi-agent systems.
We simulate systems of interacting large language models in this framework.
We find that system behavior is strongly influenced by the range of communication of each agent.
- Score: 16.304359423423648
- Abstract: Alignment is a social phenomenon wherein individuals share a common goal or perspective. Mirroring, or mimicking the behaviors and opinions of another individual, is one mechanism by which individuals can become aligned. Large-scale investigations of the effect of mirroring on alignment have been limited by the poor scalability of traditional experimental designs in sociology. In this paper, we introduce a simple computational framework that enables studying the effect of mirroring behavior on alignment in multi-agent systems. We simulate systems of interacting large language models in this framework and characterize overall system behavior and alignment with quantitative measures of agent dynamics. We find that system behavior is strongly influenced by the range of communication of each agent and that these effects are exacerbated by increased rates of mirroring. We discuss the observed simulated system behavior in the context of known human social dynamics.
Related papers
- Dehumanizing Machines: Mitigating Anthropomorphic Behaviors in Text Generation Systems [55.99010491370177]
How to intervene on such system outputs to mitigate anthropomorphic behaviors and their attendant harmful outcomes remains understudied.
We compile an inventory of interventions grounded both in prior literature and a crowdsourced study where participants edited system outputs to make them less human-like.
arXiv Detail & Related papers (2025-02-19T18:06:37Z)
- Diffusion-Based Imitation Learning for Social Pose Generation [0.0]
Intelligent agents, such as robots and virtual agents, must understand the dynamics of complex social interactions to interact with humans.
We explore how using a single modality, the pose behavior, of multiple individuals in a social interaction can be used to generate nonverbal social cues for the facilitator of that interaction.
arXiv Detail & Related papers (2025-01-18T20:31:55Z)
- Emergence of human-like polarization among large language model agents [61.622596148368906]
We simulate a networked system involving thousands of large language model agents, discovering that their social interactions result in human-like polarization.
Similarities between humans and LLM agents raise concerns about their capacity to amplify societal polarization, but also hold the potential to serve as a valuable testbed for identifying plausible strategies to mitigate it.
arXiv Detail & Related papers (2025-01-09T11:45:05Z)
- Exploring the Impact of Reflexivity Theory and Cognitive Social Structures on the Dynamics of Doctor-Patient Social System [0.0]
We create two different models for a doctor-patient system.
One retains the established assumptions, while the other incorporates principles of reflexivity theory and cognitive social structures.
We utilize a microbial genetic algorithm to optimize the behaviour of the physician and patient agents in both models.
arXiv Detail & Related papers (2024-11-08T23:23:13Z)
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- Behavior-Inspired Neural Networks for Relational Inference [3.7219180084857473]
Recent works learn to categorize relationships between agents based on observations of their physical behavior.
We introduce a level of abstraction between the observable behavior of agents and the latent categories that determine their behavior.
We integrate the physical proximity of agents and their preferences in a nonlinear opinion dynamics model which provides a mechanism to identify mutually exclusive latent categories, predict an agent's evolution in time, and control an agent's physical behavior.
arXiv Detail & Related papers (2024-06-20T21:36:54Z)
- Training Socially Aligned Language Models on Simulated Social Interactions [99.39979111807388]
Social alignment in AI systems aims to ensure that these models behave according to established societal values.
Current language models (LMs) are trained to rigidly replicate their training corpus in isolation.
This work presents a novel training paradigm that permits LMs to learn from simulated social interactions.
arXiv Detail & Related papers (2023-05-26T14:17:36Z)
- Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-cycle Perspective [69.25513235556635]
Adversarial machine learning (AML) studies the adversarial phenomenon of machine learning, in which models may make predictions that are inconsistent with, or unexpected by, humans.
Some paradigms have been recently developed to explore this adversarial phenomenon occurring at different stages of a machine learning system.
We propose a unified mathematical framework covering existing attack paradigms.
arXiv Detail & Related papers (2023-02-19T02:12:21Z)
- Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations are: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are speaking activity, support vector machines, and meetings composed of 3-4 persons equipped with microphones and cameras, respectively.
arXiv Detail & Related papers (2022-07-20T13:37:57Z)
- Models we Can Trust: Toward a Systematic Discipline of (Agent-Based) Model Interpretation and Validation [0.0]
We advocate the development of a discipline of interacting with and extracting information from models.
We outline some directions for the development of such a discipline.
arXiv Detail & Related papers (2021-02-23T10:52:22Z)
- Modelling Cooperation in Network Games with Spatio-Temporal Complexity [11.665246332943058]
We study the emergence of self-organized cooperation in complex gridworld domains.
Using multi-agent deep reinforcement learning, we simulate an agent society for a variety of plausible mechanisms.
Our methods have implications for mechanism design in both human and artificial agent systems.
arXiv Detail & Related papers (2021-02-13T12:04:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.