One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity
- URL: http://arxiv.org/abs/2411.04427v2
- Date: Tue, 12 Nov 2024 20:11:58 GMT
- Title: One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity
- Authors: Sonia K. Murthy, Tomer Ullman, Jennifer Hu
- Abstract summary: Researchers have proposed using large language models (LLMs) as replacements for humans in behavioral research.
It is debated whether post-training alignment (RLHF or RLAIF) affects models' internal diversity.
We use a new way of measuring the conceptual diversity of synthetically-generated LLM "populations" by relating the internal variability of simulated individuals to the population-level variability.
- Abstract: Researchers in social science and psychology have recently proposed using large language models (LLMs) as replacements for humans in behavioral research. In addition to arguments about whether LLMs accurately capture population-level patterns, this has raised questions about whether LLMs capture human-like conceptual diversity. Separately, it is debated whether post-training alignment (RLHF or RLAIF) affects models' internal diversity. Inspired by human studies, we use a new way of measuring the conceptual diversity of synthetically-generated LLM "populations" by relating the internal variability of simulated individuals to the population-level variability. We use this approach to evaluate non-aligned and aligned LLMs on two domains with rich human behavioral data. While no model reaches human-like diversity, aligned models generally display less diversity than their instruction fine-tuned counterparts. Our findings highlight potential trade-offs between increasing models' value alignment and decreasing the diversity of their conceptual representations.
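The abstract's core idea is to quantify conceptual diversity by relating each simulated individual's internal variability to the variability across the whole population. A minimal sketch of that general idea is below; the function name, response coding, and the specific variance-ratio statistic are illustrative assumptions, not the paper's actual measure, which is built on richer human behavioral data.

```python
from statistics import mean, pvariance

def diversity_ratio(responses_by_individual):
    """Relate within-individual variability to population-level variability.

    `responses_by_individual` maps each simulated individual to a list of
    repeated numeric responses to the same prompt.
    """
    # Within-individual variability: average variance of each simulated
    # individual's repeated responses.
    within = mean(pvariance(r) for r in responses_by_individual)
    # Population-level variability: variance of the individuals' mean responses.
    between = pvariance([mean(r) for r in responses_by_individual])
    # Fraction of total variability attributable to differences *between*
    # individuals: near 0 for a homogeneous population, near 1 for
    # individuals that are internally consistent but mutually distinct.
    total = within + between
    return between / total if total > 0 else 0.0

# Toy example: three "individuals", each answering the same question
# five times (responses coded as numbers).
pop = [[1, 1, 2, 1, 1], [5, 5, 4, 5, 5], [9, 9, 9, 8, 9]]
print(round(diversity_ratio(pop), 3))  # → 0.984
```

On this toy population the ratio is high because each individual answers consistently while the individuals disagree with one another; a population of clones (identical response distributions) would score 0, which is the direction the paper reports aligned models drifting toward.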
Related papers
- Lost in Inference: Rediscovering the Role of Natural Language Inference for Large Language Models [36.983534612895156]
In the recent past, a popular way of evaluating natural language understanding (NLU) was to consider a model's ability to perform natural language inference (NLI) tasks.
This paper focuses on five different NLI benchmarks across six models of different scales.
We investigate if they are able to discriminate models of different size and quality and how their accuracies develop during training.
arXiv Detail & Related papers (2024-11-21T13:09:36Z)
- Large Language Models Reflect the Ideology of their Creators [73.25935570218375]
Large language models (LLMs) are trained on vast amounts of data to generate natural language.
We uncover notable diversity in the ideological stance exhibited across different LLMs and languages.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
- Virtual Personas for Language Models via an Anthology of Backstories [5.2112564466740245]
"Anthology" is a method for conditioning large language models to particular virtual personas by harnessing open-ended life narratives.
We show that our methodology enhances the consistency and reliability of experimental outcomes while ensuring better representation of diverse sub-populations.
arXiv Detail & Related papers (2024-07-09T06:11:18Z)
- High-Dimension Human Value Representation in Large Language Models [60.33033114185092]
We propose UniVaR, a high-dimensional representation of human value distributions in Large Language Models (LLMs).
We show that UniVaR is a powerful tool for comparing the distributions of human values embedded in different LLMs with different language sources.
arXiv Detail & Related papers (2024-04-11T16:39:00Z)
- Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment [84.32768080422349]
Alignment with human preference prevents large language models from generating misleading or toxic content.
We propose a new formulation of prompt diversity, implying a linear correlation with the final performance of LLMs after fine-tuning.
arXiv Detail & Related papers (2024-03-17T07:08:55Z)
- On the steerability of large language models toward data-driven personas [98.9138902560793]
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z)
- Do LLMs exhibit human-like response biases? A case study in survey design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all.
We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires.
Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z)
- Improving Diversity of Demographic Representation in Large Language Models via Collective-Critiques and Self-Voting [19.79214899011072]
This paper formalizes diversity of representation in generative large language models.
We present evaluation datasets and propose metrics to measure diversity in generated responses along people and culture axes.
We find that LLMs understand the notion of diversity, and that they can reason and critique their own responses for that goal.
arXiv Detail & Related papers (2023-10-25T10:17:17Z)
- Large Language Models as Superpositions of Cultural Perspectives [25.114678091641935]
Large Language Models (LLMs) are often misleadingly recognized as having a personality or a set of values.
We argue that an LLM can be seen as a superposition of perspectives with different values and personality traits.
arXiv Detail & Related papers (2023-07-15T19:04:33Z)
- Source-free Domain Adaptation Requires Penalized Diversity [60.04618512479438]
Source-free domain adaptation (SFDA) was introduced to address knowledge transfer between different domains in the absence of source data.
In unsupervised SFDA, the diversity is limited to learning a single hypothesis on the source or learning multiple hypotheses with a shared feature extractor.
We propose a novel unsupervised SFDA algorithm that promotes representational diversity through the use of separate feature extractors.
arXiv Detail & Related papers (2023-04-06T00:20:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.