Diverse, but Divisive: LLMs Can Exaggerate Gender Differences in Opinion
Related to Harms of Misinformation
- URL: http://arxiv.org/abs/2401.16558v1
- Date: Mon, 29 Jan 2024 20:50:28 GMT
- Authors: Terrence Neumann, Sooyong Lee, Maria De-Arteaga, Sina Fazelpour,
Matthew Lease
- Abstract summary: This paper examines whether a large language model (LLM) can reflect the views of various groups when assessing the harms of misinformation.
We present the TopicMisinfo dataset, containing 160 fact-checked claims from diverse topics.
We find that GPT-3.5-Turbo reflects empirically observed gender differences in opinion but amplifies the extent of these differences.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The pervasive spread of misinformation and disinformation poses a significant
threat to society. Professional fact-checkers play a key role in addressing
this threat, but the vast scale of the problem forces them to prioritize their
limited resources. This prioritization may consider a range of factors, such as
varying risks of harm posed to specific groups of people. In this work, we
investigate potential implications of using a large language model (LLM) to
facilitate such prioritization. Because fact-checking impacts a wide range of
diverse segments of society, it is important that diverse views are represented
in the claim prioritization process. This paper examines whether an LLM can
reflect the views of various groups when assessing the harms of misinformation,
focusing on gender as a primary variable. We pose two central questions: (1) To
what extent do prompts with explicit gender references reflect gender
differences in opinion in the United States on topics of social relevance? and
(2) To what extent do gender-neutral prompts align with gendered viewpoints on
those topics? To analyze these questions, we present the TopicMisinfo dataset,
containing 160 fact-checked claims from diverse topics, supplemented by nearly
1600 human annotations with subjective perceptions and annotator demographics.
Analyzing responses to gender-specific and neutral prompts, we find that
GPT-3.5-Turbo reflects empirically observed gender differences in opinion but
amplifies the extent of these differences. These findings illuminate AI's
complex role in moderating online communication, with implications for
fact-checkers, algorithm designers, and the use of crowd-workers as annotators.
We also release the TopicMisinfo dataset to support continuing research in the
community.
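The core comparison in the abstract, contrasting a gender gap in human annotations with the gap in a model's responses to gendered prompts, can be sketched as a simple amplification ratio. The ratings and the `gap_amplification` helper below are illustrative assumptions, not drawn from the TopicMisinfo dataset or the paper's actual analysis.

```python
# Hypothetical sketch: does a model exaggerate an observed gender gap in
# perceived misinformation harm? All numbers below are toy values.

def mean(xs):
    return sum(xs) / len(xs)

def gap_amplification(human_f, human_m, model_f, model_m):
    """Ratio of the model's gender gap to the human gender gap.

    A value near 1 means the model mirrors the human difference in
    opinion; a value above 1 means it amplifies that difference.
    """
    human_gap = mean(human_f) - mean(human_m)
    model_gap = mean(model_f) - mean(model_m)
    return model_gap / human_gap

# Toy harm ratings (1-5 scale) for a single claim:
human_women = [4, 4, 5, 3]   # human annotators identifying as women
human_men   = [3, 4, 3, 4]   # human annotators identifying as men
model_women = [5, 5, 5, 4]   # model responses to woman-referencing prompts
model_men   = [3, 3, 4, 3]   # model responses to man-referencing prompts

print(gap_amplification(human_women, human_men, model_women, model_men))  # → 3.0
```

With these toy ratings the human gap is 0.5 and the model gap is 1.5, so the model triples the observed difference, which is the qualitative pattern the paper reports for GPT-3.5-Turbo.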
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents AmbGIMT, a benchmark for Gender-Inclusive Machine Translation with ambiguous attitude words.
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z)
- Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language Tasks [5.123567809055078]
Gender bias in vision-language models (VLMs) can reinforce harmful stereotypes and discrimination.
We propose GAMA, a task-agnostic generation framework to mitigate gender bias.
During narrative generation, GAMA yields all-sided but gender-obfuscated narratives.
During answer inference, GAMA integrates the image, generated narrative, and a task-specific question prompt to infer answers for different vision-language tasks.
arXiv Detail & Related papers (2024-05-27T06:20:58Z)
- Challenging Negative Gender Stereotypes: A Study on the Effectiveness of Automated Counter-Stereotypes [12.704072523930444]
This study investigates eleven strategies to automatically counteract and challenge gender stereotypes in online communications.
We present AI-generated gender-based counter-stereotypes to study participants and ask them to assess their offensiveness, plausibility, and potential effectiveness.
arXiv Detail & Related papers (2024-04-18T01:48:28Z)
- A Multilingual Perspective on Probing Gender Bias [0.0]
Gender bias is a form of systematic negative treatment that targets individuals based on their gender.
This thesis investigates the nuances of how gender bias is expressed through language and within language technologies.
arXiv Detail & Related papers (2024-03-15T21:35:21Z)
- Understanding Divergent Framing of the Supreme Court Controversies: Social Media vs. News Outlets [56.67097829383139]
We focus on the nuanced distinctions in framing of social media and traditional media outlets concerning a series of U.S. Supreme Court rulings.
We observe significant polarization in the news media's treatment of affirmative action and abortion rights, whereas the topic of student loans tends to exhibit a greater degree of consensus.
arXiv Detail & Related papers (2023-09-18T06:40:21Z)
- Unveiling Gender Bias in Terms of Profession Across LLMs: Analyzing and Addressing Sociological Implications [0.0]
The study examines existing research on gender bias in AI language models and identifies gaps in the current knowledge.
The findings shed light on gendered word associations, language usage, and biased narratives present in the outputs of Large Language Models.
The paper presents strategies for reducing gender bias in LLMs, including algorithmic approaches and data augmentation techniques.
arXiv Detail & Related papers (2023-07-18T11:38:45Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- Text as Causal Mediators: Research Design for Causal Estimates of Differential Treatment of Social Groups via Language Aspects [7.175621752912443]
We propose a causal research design for observational (non-experimental) data to estimate the natural direct and indirect effects of social group signals on speakers' responses.
We illustrate the promises and challenges of this framework via a theoretical case study of the effect of an advocate's gender on interruptions from justices during U.S. Supreme Court oral arguments.
arXiv Detail & Related papers (2021-09-15T19:15:35Z)
- Gender bias in magazines oriented to men and women: a computational approach [58.720142291102135]
We compare the content of a women-oriented magazine with that of a men-oriented one, both produced by the same editorial group over a decade.
With Topic Modelling techniques we identify the main themes discussed in the magazines and quantify how much the presence of these topics differs between magazines over time.
Our results show that the frequency of appearance of the topics Family, Business, and Women as sex objects exhibits an initial bias that tends to disappear over time.
arXiv Detail & Related papers (2020-11-24T14:02:49Z)
- Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by Ranking Algorithms [68.85295025020942]
We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes.
GSR is the first specifically tailored measure for Information Retrieval, capable of quantifying representational harms.
arXiv Detail & Related papers (2020-09-02T20:45:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.