Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals' Subjective Text Perceptions
- URL: http://arxiv.org/abs/2502.20897v1
- Date: Fri, 28 Feb 2025 09:53:42 GMT
- Title: Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals' Subjective Text Perceptions
- Authors: Matthias Orlikowski, Jiaxin Pei, Paul Röttger, Philipp Cimiano, David Jurgens, Dirk Hovy
- Abstract summary: We show that models do improve in sociodemographic prompting when trained. This performance gain is largely due to models learning annotator-specific behaviour rather than sociodemographic patterns. Across all tasks, our results suggest that models learn little meaningful connection between sociodemographics and annotation.
- Score: 33.76973308687867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: People naturally vary in their annotations for subjective questions and some of this variation is thought to be due to the person's sociodemographic characteristics. LLMs have also been used to label data, but recent work has shown that models perform poorly when prompted with sociodemographic attributes, suggesting limited inherent sociodemographic knowledge. Here, we ask whether LLMs can be trained to be accurate sociodemographic models of annotator variation. Using a curated dataset of five tasks with standardized sociodemographics, we show that models do improve in sociodemographic prompting when trained but that this performance gain is largely due to models learning annotator-specific behaviour rather than sociodemographic patterns. Across all tasks, our results suggest that models learn little meaningful connection between sociodemographics and annotation, raising doubts about the current use of LLMs for simulating sociodemographic variation and behaviour.
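The sociodemographic prompting setup evaluated in the abstract can be illustrated with a minimal sketch: the model is conditioned on an annotator's attributes before the annotation request. The prompt template, attribute names, and rating scale below are illustrative assumptions, not the authors' actual code or data.

```python
# Hypothetical sketch of sociodemographic prompting: prefix an annotation
# request with an annotator's sociodemographic profile so the model is asked
# to answer as a person with those attributes would.

def build_prompt(text: str, profile: dict) -> str:
    """Build a sociodemographic prompt for a subjective annotation task."""
    # Render the profile as "key: value" pairs, e.g. "age: 25-34, gender: woman".
    persona = ", ".join(f"{k}: {v}" for k, v in profile.items())
    return (
        f"You are an annotator with the following profile: {persona}.\n"
        f"Rate the offensiveness of the following text on a scale from 1 to 5.\n"
        f"Text: {text}\n"
        "Rating:"
    )

# Example profile with standardized attributes (illustrative values).
profile = {"age": "25-34", "gender": "woman", "education": "college degree"}
prompt = build_prompt("That take is ridiculous.", profile)
print(prompt)
```

In the paper's fine-tuning setting, prompts of this shape would be paired with each annotator's actual labels as training targets; the reported finding is that gains from such training reflect memorized annotator-specific behaviour rather than genuine sociodemographic patterns.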
Related papers
- Understanding Graphical Perception in Data Visualization through Zero-shot Prompting of Vision-Language Models [23.571294524129847]
Vision Language Models (VLMs) have been successful at many chart comprehension tasks.
This paper lays the foundations for such applications by evaluating the accuracy of zero-shot prompting of VLMs on graphical perception tasks with established human performance profiles.
arXiv Detail & Related papers (2024-10-31T23:24:46Z)
- LLM vs Small Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model [58.887561071010985]
Personality detection aims to detect the personality traits underlying a person's social media posts.
Most existing methods learn post features directly by fine-tuning the pre-trained language models.
We propose a large language model (LLM) based text augmentation enhanced personality detection model.
arXiv Detail & Related papers (2024-03-12T12:10:18Z)
- Sociodemographic Prompting is Not Yet an Effective Approach for Simulating Subjective Judgments with LLMs [13.744746481528711]
Large Language Models (LLMs) are widely used to simulate human responses across diverse contexts. We evaluate nine popular LLMs on their ability to understand demographic differences in two subjective judgment tasks: politeness and offensiveness. We find that in zero-shot settings, most models' predictions for both tasks align more closely with labels from White participants than those from Asian or Black participants.
arXiv Detail & Related papers (2023-11-16T10:02:24Z)
- On the steerability of large language models toward data-driven personas [98.9138902560793]
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z)
- Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting [64.80538055623842]
Sociodemographic prompting is a technique that steers the output of prompt-based models towards answers that humans with specific sociodemographic profiles would give.
We show that sociodemographic information affects model predictions and can be beneficial for improving zero-shot learning in subjective NLP tasks.
arXiv Detail & Related papers (2023-09-13T15:42:06Z)
- The Ecological Fallacy in Annotation: Modelling Human Label Variation goes beyond Sociodemographics [0.0]
Recent research aims to model individual annotator behaviour rather than predicting aggregated labels.
We introduce group-specific layers to multi-annotator models to account for sociodemographics.
This result shows that individual annotation behaviour depends on much more than just sociodemographics.
arXiv Detail & Related papers (2023-06-20T14:23:32Z)
- On the Compositional Generalization Gap of In-Context Learning [73.09193595292233]
We look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of such models in semantic parsing tasks with in-context learning.
We evaluate four model families, OPT, BLOOM, CodeGen and Codex on three semantic parsing datasets.
arXiv Detail & Related papers (2022-11-15T19:56:37Z)
- On the Limitations of Sociodemographic Adaptation with Transformers [34.768337465321395]
Sociodemographic factors (e.g., gender or age) shape our language.
Previous work showed that incorporating specific sociodemographic factors can consistently improve performance for various NLP tasks.
We use three common specialization methods proven effective for incorporating external knowledge into pretrained Transformers.
arXiv Detail & Related papers (2022-08-01T17:58:02Z)
- Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little [74.49773960145681]
A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in NLP pipelines.
In this paper, we propose a different explanation: MLMs succeed on downstream tasks almost entirely due to their ability to model higher-order word co-occurrence statistics.
Our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
arXiv Detail & Related papers (2021-04-14T06:30:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.