Learning Roles with Emergent Social Value Orientations
- URL: http://arxiv.org/abs/2301.13812v1
- Date: Tue, 31 Jan 2023 17:54:09 GMT
- Title: Learning Roles with Emergent Social Value Orientations
- Authors: Wenhao Li, Xiangfeng Wang, Bo Jin, Jingyi Lu and Hongyuan Zha
- Abstract summary: This paper introduces the typical "division of labor or roles" mechanism in human society.
We provide a promising solution for intertemporal social dilemmas (ISD) with social value orientations (SVO).
A novel learning framework, called Learning Roles with Emergent SVOs (RESVO), is proposed to transform the learning of roles into the emergence of social value orientations.
- Score: 49.16026283952117
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Social dilemmas can be considered situations where individual rationality
leads to collective irrationality. The multi-agent reinforcement learning
community has leveraged ideas from social science, such as social value
orientations (SVO), to solve social dilemmas in complex cooperative tasks. In
this paper, by first introducing the typical "division of labor or roles"
mechanism in human society, we provide a promising solution for intertemporal
social dilemmas (ISD) with SVOs. A novel learning framework, called Learning
Roles with Emergent SVOs (RESVO), is proposed to transform the learning of
roles into the emergence of social value orientations, which is symmetrically
solved by endowing agents with altruism to share rewards with other agents. An
SVO-based role embedding space is then constructed by conditioning individual
policies on roles with a novel rank regularizer and mutual information
maximizer. Experiments show that RESVO achieves a stable division of labor and
cooperation in ISDs with different complexity.
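SVO is commonly operationalized as an angle that weights an agent's own reward against the mean reward of others, which is one standard way to implement the reward sharing described above. The sketch below illustrates only that common formulation; the function name and interface are illustrative and not taken from the RESVO paper.

```python
import math

def svo_reward(own_reward, others_rewards, theta):
    """Blend an agent's own reward with the mean reward of the other agents
    according to a social value orientation angle theta (in radians).
    theta = 0 is purely selfish; theta = pi/2 is purely altruistic."""
    others_mean = sum(others_rewards) / len(others_rewards)
    return math.cos(theta) * own_reward + math.sin(theta) * others_mean
```

For example, a fully altruistic agent (theta = pi/2) values only the mean reward of its co-players, while intermediate angles interpolate between self-interest and altruism.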
Related papers
- Learning to Balance Altruism and Self-interest Based on Empathy in Mixed-Motive Games [47.8980880888222]
Multi-agent scenarios often involve mixed motives, demanding altruistic agents capable of self-protection against potential exploitation.
We propose LASE (Learning to balance Altruism and Self-interest based on Empathy).
LASE allocates a portion of its rewards to co-players as gifts, with this allocation adapting dynamically based on the social relationship.
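The gifting mechanism described above can be sketched as a proportional split: a fixed fraction of the agent's reward is divided among co-players according to learned relationship weights. This is a minimal illustration under that assumption, not LASE's actual empathy-based allocation; all names and the gift fraction are hypothetical.

```python
def allocate_gifts(reward, relationships, gift_fraction=0.2):
    """Split off a fraction of this agent's reward as gifts, dividing it
    among co-players in proportion to nonnegative relationship weights.
    Returns (kept_reward, {co_player: gift})."""
    weights = {k: max(w, 0.0) for k, w in relationships.items()}
    total = sum(weights.values())
    if total == 0.0:  # no positive relationships: keep everything
        return reward, {k: 0.0 for k in relationships}
    pot = reward * gift_fraction
    gifts = {k: pot * w / total for k, w in weights.items()}
    return reward - pot, gifts
```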
arXiv Detail & Related papers (2024-10-10T12:30:56Z) - The State-Action-Reward-State-Action Algorithm in Spatial Prisoner's Dilemma Game [0.0]
Reinforcement learning provides a suitable framework for studying evolutionary game theory.
We employ the State-Action-Reward-State-Action algorithm as the decision-making mechanism for individuals in evolutionary game theory.
We evaluate the impact of SARSA on cooperation rates by analyzing variations in rewards and the distribution of cooperators and defectors within the network.
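The decision-making mechanism above is the standard on-policy SARSA temporal-difference update; a minimal sketch follows, where the state and action encodings (e.g. cooperate/defect labels) are illustrative rather than taken from the paper.

```python
from collections import defaultdict

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.95):
    """One on-policy SARSA step: move Q(s, a) toward the target
    r + gamma * Q(s', a'), where a' is the action actually taken next
    (unlike Q-learning, which uses the greedy action in s')."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]
```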
arXiv Detail & Related papers (2024-06-25T07:21:35Z) - SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents [107.4138224020773]
We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and humans.
In our environment, agents role-play and interact under a wide variety of scenarios; they coordinate, collaborate, exchange, and compete with each other to achieve complex social goals.
We find that GPT-4 achieves a significantly lower goal completion rate than humans and struggles to exhibit social commonsense reasoning and strategic communication skills.
arXiv Detail & Related papers (2023-10-18T02:27:01Z) - Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View [60.80731090755224]
This paper probes the collaboration mechanisms among contemporary NLP systems by combining practical experiments with theoretical insights.
We fabricate four unique 'societies' composed of LLM agents, where each agent is characterized by a specific 'trait' (easy-going or overconfident) and engages in collaboration with a distinct 'thinking pattern' (debate or reflection).
Our results further illustrate that LLM agents manifest human-like social behaviors, such as conformity and consensus reaching, mirroring social psychology theories.
arXiv Detail & Related papers (2023-10-03T15:05:52Z) - Adaptive Coordination in Social Embodied Rearrangement [49.35582108902819]
We study zero-shot coordination (ZSC) in this task, where an agent collaborates with a new partner, emulating a scenario where a robot collaborates with a new human partner.
We propose Behavior Diversity Play (BDP), a novel ZSC approach that encourages diversity through a discriminability objective.
Our results demonstrate that BDP learns adaptive agents that can tackle visual coordination, and zero-shot generalize to new partners in unseen environments, achieving 35% higher success and 32% higher efficiency compared to baselines.
arXiv Detail & Related papers (2023-05-31T18:05:51Z) - Training Socially Aligned Language Models on Simulated Social Interactions [99.39979111807388]
Social alignment in AI systems aims to ensure that these models behave according to established societal values.
Current language models (LMs) are trained to rigidly replicate their training corpus in isolation.
This work presents a novel training paradigm that permits LMs to learn from simulated social interactions.
arXiv Detail & Related papers (2023-05-26T14:17:36Z) - Social Value Orientation and Integral Emotions in Multi-Agent Systems [1.5469452301122173]
Human social behavior is influenced by individual differences in social preferences.
Social value orientation (SVO) is a measurable personality trait.
Integral emotions, the emotions which arise in direct response to a decision-making scenario, have been linked to temporary shifts in decision-making preferences.
arXiv Detail & Related papers (2023-05-09T15:33:50Z) - Heterogeneous Social Value Orientation Leads to Meaningful Diversity in Sequential Social Dilemmas [15.171556039829161]
Social Value Orientation (SVO) describes an individual's propensity to allocate resources between themselves and others.
Prior studies show that groups of agents endowed with heterogeneous SVO learn diverse policies in settings that resemble the incentive structure of Prisoner's dilemma.
We show that these best-response agents learn policies that are conditioned on their co-players, which we posit is the reason for improved zero-shot generalization results.
arXiv Detail & Related papers (2023-05-01T11:09:23Z) - The emergence of division of labor through decentralized social sanctioning [13.35559831585528]
We show that by introducing a model of social norms, it becomes possible for groups of self-interested individuals to learn a productive division of labor involving all critical roles.
Such social norms work by redistributing rewards within the population to disincentivize antisocial roles while incentivizing prosocial roles that do not intrinsically pay as well as others.
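The redistribution idea above can be illustrated with a simple tax-and-transfer sketch: every agent's reward is taxed, and the pooled amount is shared among agents in prosocial roles. This is an illustration of the redistribution principle only, not the paper's actual sanctioning model; the tax rate and role names are hypothetical.

```python
def redistribute(rewards, prosocial, tax=0.3):
    """Tax every agent's reward and split the pooled amount evenly among
    agents in prosocial roles, so roles that are critical but do not
    intrinsically pay well become worth adopting."""
    pool = sum(r * tax for r in rewards.values())
    out = {k: r * (1 - tax) for k, r in rewards.items()}
    if prosocial:
        share = pool / len(prosocial)
        for k in prosocial:
            out[k] += share
    return out
```

Note that the transfer is zero-sum: total reward across the population is unchanged, only its allocation across roles shifts.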
arXiv Detail & Related papers (2022-08-10T21:35:38Z) - Improved cooperation by balancing exploration and exploitation in intertemporal social dilemma tasks [2.541277269153809]
We propose a new learning strategy for achieving coordination by incorporating a learning rate that can balance exploration and exploitation.
We show that agents using this simple strategy achieve a relatively higher collective return in a decision-making task called the intertemporal social dilemma.
We also explore the effects of the diversity of learning rates on the population of reinforcement learning agents and show that agents trained in heterogeneous populations develop particularly coordinated policies.
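A heterogeneous population of learning rates can be sketched as follows. This is an illustration of the idea rather than the paper's method: each agent here tracks only a scalar value estimate, and a higher learning rate makes the agent chase recent rewards while a lower one averages over its history.

```python
import random

def make_population(n, alpha_low=0.01, alpha_high=0.5, seed=0):
    """A population of simple learners, each with its own learning rate
    drawn uniformly from [alpha_low, alpha_high]."""
    rng = random.Random(seed)
    return [{"alpha": rng.uniform(alpha_low, alpha_high), "q": 0.0}
            for _ in range(n)]

def update(agent, reward):
    """Running-average value update: q moves toward the observed reward
    at a speed set by the agent's individual learning rate."""
    agent["q"] += agent["alpha"] * (reward - agent["q"])
    return agent["q"]
```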
arXiv Detail & Related papers (2021-10-19T08:40:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.