GRASP: A Disagreement Analysis Framework to Assess Group Associations in Perspectives
- URL: http://arxiv.org/abs/2311.05074v2
- Date: Fri, 14 Jun 2024 03:28:49 GMT
- Title: GRASP: A Disagreement Analysis Framework to Assess Group Associations in Perspectives
- Authors: Vinodkumar Prabhakaran, Christopher Homan, Lora Aroyo, Aida Mostafazadeh Davani, Alicia Parrish, Alex Taylor, Mark Díaz, Ding Wang, Gregory Serapio-García
- Abstract summary: We propose GRASP, a comprehensive disagreement analysis framework to measure group association in perspectives among different rater sub-groups.
Our framework reveals specific rater groups whose perspectives differ significantly from others' on certain tasks, and helps identify demographic axes that are crucial to consider in specific task contexts.
- Score: 18.574420136899978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human annotation plays a core role in machine learning -- annotations for supervised models, safety guardrails for generative models, and human feedback for reinforcement learning, to cite a few avenues. However, the fact that many of these human annotations are inherently subjective is often overlooked. Recent work has demonstrated that ignoring rater subjectivity (typically resulting in rater disagreement) is problematic within specific tasks and for specific subgroups. Generalizable methods to harness rater disagreement and thus understand the socio-cultural leanings of subjective tasks remain elusive. In this paper, we propose GRASP, a comprehensive disagreement analysis framework to measure group association in perspectives among different rater sub-groups, and demonstrate its utility in assessing the extent of systematic disagreements in two datasets: (1) safety annotations of human-chatbot conversations, and (2) offensiveness annotations of social media posts, both annotated by diverse rater pools across different socio-demographic axes. Our framework (based on disagreement metrics) reveals specific rater groups that have significantly different perspectives than others on certain tasks, and helps identify demographic axes that are crucial to consider in specific task contexts.
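The abstract describes the framework only at a high level. As a rough illustration of the core idea of measuring group association through disagreement, the sketch below compares a subgroup's mean in-group agreement with its mean cross-group agreement; the function names, the raw-agreement metric, and the toy data are assumptions made here for illustration, not the paper's own (more sophisticated, reliability-based) metrics.

```python
from itertools import combinations

def pairwise_agreement(a, b):
    """Fraction of co-rated items on which two raters gave the same label."""
    shared = [item for item in a if item in b]
    if not shared:
        return 0.0
    return sum(a[item] == b[item] for item in shared) / len(shared)

def group_association(ratings, groups, target):
    """Compare a subgroup's in-group agreement with its cross-group agreement.

    ratings -- {rater_id: {item_id: label}}
    groups  -- {rater_id: group_name}
    target  -- the subgroup to analyse

    Returns (in_group, cross_group, ratio). A ratio well above 1 suggests
    the subgroup agrees more with itself than with the remaining raters,
    i.e. a systematically different perspective on the task.
    """
    inside = [r for r, g in groups.items() if g == target]
    outside = [r for r, g in groups.items() if g != target]
    in_scores = [pairwise_agreement(ratings[x], ratings[y])
                 for x, y in combinations(inside, 2)]
    x_scores = [pairwise_agreement(ratings[x], ratings[y])
                for x in inside for y in outside]
    in_group = sum(in_scores) / len(in_scores) if in_scores else 0.0
    cross_group = sum(x_scores) / len(x_scores) if x_scores else 0.0
    ratio = in_group / cross_group if cross_group else float("inf")
    return in_group, cross_group, ratio

# Toy example: the two "A" raters label identically; "B" raters differ on item 1.
ratings = {
    "a1": {1: "safe", 2: "unsafe"}, "a2": {1: "safe", 2: "unsafe"},
    "b1": {1: "unsafe", 2: "unsafe"}, "b2": {1: "unsafe", 2: "unsafe"},
}
groups = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
print(group_association(ratings, groups, "A"))  # → (1.0, 0.5, 2.0)
```

A real analysis would replace raw percent agreement with a chance-corrected reliability measure (e.g. Krippendorff's alpha) and test the in-group/cross-group gap for statistical significance.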
Related papers
- A Comprehensive Framework to Operationalize Social Stereotypes for Responsible AI Evaluations [15.381034360289899]
Societal stereotypes are at the center of a myriad of responsible AI interventions.
We propose a unified framework to operationalize stereotypes in generative AI evaluations.
arXiv Detail & Related papers (2025-01-03T19:39:48Z)
- The Superalignment of Superhuman Intelligence with Large Language Models [63.96120398355404]
We discuss the concept of superalignment from the learning perspective to answer this question.
We highlight some key research problems in superalignment, namely, weak-to-strong generalization, scalable oversight, and evaluation.
We present a conceptual framework for superalignment consisting of three modules: an attacker, which generates adversarial queries that try to expose weaknesses of a learner model; a learner, which refines itself by learning from scalable feedback generated by a critic model together with minimal input from human experts; and a critic, which generates critiques or explanations for a given query-response pair with the goal of improving the learner.
arXiv Detail & Related papers (2024-12-15T10:34:06Z)
- Overview of PerpectiveArg2024: The First Shared Task on Perspective Argument Retrieval [56.66761232081188]
We present a novel dataset covering demographic and socio-cultural (socio) variables, such as age, gender, and political attitude, representing minority and majority groups in society.
We find substantial challenges in incorporating perspectivism, especially when aiming for personalization based solely on the text of arguments without explicitly providing socio profiles.
While we bootstrap perspective argument retrieval, further research is essential to optimize retrieval systems to facilitate personalization and reduce polarization.
arXiv Detail & Related papers (2024-07-29T03:14:57Z)
- An Empirical Analysis of Diversity in Argument Summarization [4.128725138940779]
We introduce three aspects of diversity: those of opinions, annotators, and sources.
We evaluate approaches to a popular argument summarization task called Key Point Analysis.
arXiv Detail & Related papers (2024-02-02T16:26:52Z)
- Modeling subjectivity (by Mimicking Annotator Annotation) in toxic comment identification across diverse communities [3.0284081180864675]
This study aims to identify intuitive variances from annotator disagreement using quantitative analysis.
We also evaluate the model's ability to mimic diverse viewpoints on toxicity by varying the size of the training data.
We conclude that subjectivity is evident across all annotator groups, demonstrating the shortcomings of majority-rule voting.
arXiv Detail & Related papers (2023-11-01T00:17:11Z)
- DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z)
- Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z)
- Is Attention Interpretation? A Quantitative Assessment On Sets [0.0]
We study the interpretability of attention in the context of set machine learning.
We find that attention distributions are indeed often reflective of the relative importance of individual instances.
We propose to use ensembling to minimize the risk of misleading attention-based explanations.
arXiv Detail & Related papers (2022-07-26T16:25:38Z)
- Quantifying Learnability and Describability of Visual Concepts Emerging in Representation Learning [91.58529629419135]
We consider how to characterise visual groupings discovered automatically by deep neural networks.
We introduce two concepts, visual learnability and describability, that can be used to quantify the interpretability of arbitrary image groupings.
arXiv Detail & Related papers (2020-10-27T18:41:49Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.