Embedding Democratic Values into Social Media AIs via Societal Objective
Functions
- URL: http://arxiv.org/abs/2307.13912v3
- Date: Thu, 15 Feb 2024 00:41:00 GMT
- Title: Embedding Democratic Values into Social Media AIs via Societal Objective
Functions
- Authors: Chenyan Jia, Michelle S. Lam, Minh Chau Mai, Jeff Hancock, Michael S.
Bernstein
- Abstract summary: We introduce a method for translating established, vetted social scientific constructs into AI objective functions.
We create a democratic attitude model that estimates the extent to which a social media post promotes anti-democratic attitudes.
This method presents a novel strategy to draw on social science theory and methods to mitigate societal harms in social media AIs.
- Score: 13.903836222333977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Can we design artificial intelligence (AI) systems that rank our social media
feeds to consider democratic values such as mitigating partisan animosity as
part of their objective functions? We introduce a method for translating
established, vetted social scientific constructs into AI objective functions,
which we term societal objective functions, and demonstrate the method with
application to the political science construct of anti-democratic attitudes.
Traditionally, we have lacked observable outcomes to use to train such models;
however, the social sciences have developed survey instruments and qualitative
codebooks for these constructs, and their precision facilitates translation
into detailed prompts for large language models. We apply this method to create
a democratic attitude model that estimates the extent to which a social media
post promotes anti-democratic attitudes, and test this democratic attitude
model across three studies. In Study 1, we first test the attitudinal and
behavioral effectiveness of the intervention among US partisans (N=1,380) by
manually annotating (alpha=.895) social media posts with anti-democratic
attitude scores and testing several feed ranking conditions based on these
scores. Removal (d=.20) and downranking feeds (d=.25) reduced participants'
partisan animosity without compromising their experience and engagement. In
Study 2, we scale up the manual labels by creating the democratic attitude
model, finding strong agreement with manual labels (rho=.75). Finally, in Study
3, we replicate Study 1 using the democratic attitude model instead of manual
labels to test its attitudinal and behavioral impact (N=558), and again find
that the feed downranking using the societal objective function reduced
partisan animosity (d=.25). This method presents a novel strategy to draw on
social science theory and methods to mitigate societal harms in social media
AIs.
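As a rough illustration of the pipeline the abstract describes, the sketch below scores posts for anti-democratic attitudes with an LLM prompt derived from a qualitative codebook, downranks a feed by that score, and checks agreement with manual labels via Spearman's rho. The prompt wording, the `call_llm` helper, and the engagement-minus-score weighting are illustrative assumptions, not the authors' actual implementation.

```python
# Hedged sketch of a "societal objective function" pipeline (illustrative only;
# the prompt, call_llm helper, and weighting are assumptions, not the paper's code).
from dataclasses import dataclass
from typing import List, Sequence
from scipy.stats import spearmanr

@dataclass
class Post:
    text: str
    engagement: float  # the platform's ordinary relevance/engagement score

# Assumed codebook-derived instruction; the paper translates vetted survey
# instruments and qualitative codebooks into detailed scoring prompts.
CODEBOOK_PROMPT = (
    "Rate from 0 (none) to 1 (strong) how much the following post promotes "
    "anti-democratic attitudes (e.g., support for partisan violence or "
    "rejection of legitimate election outcomes). Reply with a single number.\n\n"
    "Post: {post}"
)

def call_llm(prompt: str) -> str:
    """Placeholder for a large language model API call."""
    raise NotImplementedError

def democratic_attitude_score(post: Post) -> float:
    """Estimate the extent to which a post promotes anti-democratic attitudes."""
    reply = call_llm(CODEBOOK_PROMPT.format(post=post.text))
    return min(max(float(reply.strip()), 0.0), 1.0)

def downrank_feed(posts: List[Post], weight: float = 1.0) -> List[Post]:
    """Re-rank a feed by subtracting the societal objective from engagement.
    (A removal condition would instead filter posts above a score threshold.)"""
    return sorted(
        posts,
        key=lambda p: p.engagement - weight * democratic_attitude_score(p),
        reverse=True,
    )

def agreement_with_manual_labels(model_scores: Sequence[float],
                                 manual_scores: Sequence[float]) -> float:
    """Spearman's rho between model scores and manual annotations
    (Study 2 reports rho = .75)."""
    rho, _ = spearmanr(model_scores, manual_scores)
    return rho
```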
Related papers
- Learning Goal-oriented Bimanual Dough Rolling Using Dynamic Heterogeneous Graph Based on Human Demonstration [19.74767906744719]
Soft object manipulation poses significant challenges for robots, requiring effective techniques for state representation and manipulation policy learning.
This research paper introduces a novel approach: a dynamic heterogeneous graph-based model for learning goal-oriented soft object manipulation policies.
arXiv Detail & Related papers (2024-10-15T16:12:00Z) - From Experts to the Public: Governing Multimodal Language Models in Politically Sensitive Video Analysis [48.14390493099495]
This paper examines the governance of multimodal large language models (MM-LLMs) through individual and collective deliberation.
We conducted a two-step study: first, interviews with 10 journalists established a baseline understanding of expert video interpretation; second, 114 individuals from the general public engaged in deliberation using Inclusive.AI.
arXiv Detail & Related papers (2024-09-15T03:17:38Z) - Towards "Differential AI Psychology" and in-context Value-driven Statement Alignment with Moral Foundations Theory [0.0]
This work investigates the alignment between personalized language models and survey participants on a Moral Foundation questionnaire.
We adapt text-to-text models to different political personas and survey the questionnaire repetitively to generate a synthetic population of persona and model combinations.
Our findings indicate that adapted models struggle to represent the survey-based assessment of political ideologies.
arXiv Detail & Related papers (2024-08-21T08:20:41Z) - Representation Bias in Political Sample Simulations with Large Language Models [54.48283690603358]
This study seeks to identify and quantify biases in simulating political samples with Large Language Models.
Using the GPT-3.5-Turbo model, we leverage data from the American National Election Studies, German Longitudinal Election Study, Zuobiao dataset, and China Family Panel Studies.
arXiv Detail & Related papers (2024-07-16T05:52:26Z) - Modelling Human Values for AI Reasoning [2.320648715016106]
We detail a formal model of human values for their explicit computational representation.
We show how this model can provide the foundational apparatus for AI-based reasoning over values.
We propose a roadmap for future integrated, and interdisciplinary, research into human values in AI.
arXiv Detail & Related papers (2024-02-09T12:08:49Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation, and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - Training Socially Aligned Language Models on Simulated Social
Interactions [99.39979111807388]
Social alignment in AI systems aims to ensure that these models behave according to established societal values.
Current language models (LMs) are trained to rigidly replicate their training corpus in isolation.
This work presents a novel training paradigm that permits LMs to learn from simulated social interactions.
arXiv Detail & Related papers (2023-05-26T14:17:36Z) - Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
arXiv Detail & Related papers (2023-03-20T19:32:49Z) - Learning affective meanings that derives the social behavior using
Bidirectional Encoder Representations from Transformers [0.0]
Affect Control Theory (ACT) uses sentiments to manifest potential interaction.
The model achieves state-of-the-art accuracy in estimating affective meanings.
arXiv Detail & Related papers (2022-01-31T19:58:28Z) - Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)