Can Large Language Models Capture Dissenting Human Voices?
- URL: http://arxiv.org/abs/2305.13788v2
- Date: Fri, 27 Oct 2023 11:25:00 GMT
- Title: Can Large Language Models Capture Dissenting Human Voices?
- Authors: Noah Lee, Na Min An and James Thorne
- Abstract summary: Large language models (LLMs) have shown impressive achievements in solving a broad range of tasks.
We evaluate the performance of LLMs and the alignment of their output distributions with human judgments using two different techniques.
We show that LLMs exhibit limited ability in solving NLI tasks and simultaneously fail to capture the human disagreement distribution.
- Score: 7.668954669688971
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have shown impressive achievements in solving a
broad range of tasks. Augmented by instruction fine-tuning, LLMs have also been
shown to generalize in zero-shot settings as well. However, whether LLMs
closely align with the human disagreement distribution has not been
well-studied, especially within the scope of natural language inference (NLI).
In this paper, we evaluate the performance and alignment of LLM distribution
with humans using two different techniques to estimate the multinomial
distribution: Monte Carlo Estimation (MCE) and Log Probability Estimation
(LPE). As a result, we show LLMs exhibit limited ability in solving NLI tasks
and simultaneously fail to capture human disagreement distribution. The
inference and human alignment performances plunge even further on data samples
with high human disagreement levels, raising concerns about their natural
language understanding (NLU) ability and their representativeness to a larger
human population. The source code for the experiments is available at
https://github.com/xfactlab/emnlp2023-LLM-Disagreement
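To make the two estimators concrete, the sketch below shows how a three-way NLI label distribution could be read off a causal LM: MCE samples generations repeatedly and counts the predicted labels, while LPE reads the probabilities the model assigns to each label token directly. The prompt template, the GPT-2 stand-in model, and the first-subtoken scoring are illustrative assumptions, not the authors' exact setup (see the linked repository for that).
```python
# Minimal sketch of Monte Carlo Estimation (MCE) and Log Probability Estimation (LPE)
# for an NLI label distribution, using GPT-2 as a stand-in for the evaluated LLMs.
# Prompt template and label verbalizers are assumptions, not the paper's exact setup.
import torch
import torch.nn.functional as F
from collections import Counter
from transformers import AutoModelForCausalLM, AutoTokenizer

LABELS = ["entailment", "neutral", "contradiction"]  # NLI label set

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def build_prompt(premise: str, hypothesis: str) -> str:
    # Hypothetical zero-shot NLI prompt; the paper's template may differ.
    return (f"Premise: {premise}\nHypothesis: {hypothesis}\n"
            f"Question: Is the hypothesis entailment, neutral, or contradiction?\nAnswer:")

def lpe_distribution(premise: str, hypothesis: str) -> list[float]:
    """LPE: read the model's probability for each label word at the answer position."""
    ids = tokenizer(build_prompt(premise, hypothesis), return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]                 # next-token logits
    log_probs = F.log_softmax(logits, dim=-1)
    # Score each label by its first sub-token (a simplification), then renormalize.
    label_ids = [tokenizer(" " + label).input_ids[0] for label in LABELS]
    scores = torch.stack([log_probs[i] for i in label_ids])
    return F.softmax(scores, dim=-1).tolist()

def mce_distribution(premise: str, hypothesis: str, n_samples: int = 50) -> list[float]:
    """MCE: sample generations repeatedly and count the predicted labels."""
    ids = tokenizer(build_prompt(premise, hypothesis), return_tensors="pt").input_ids
    counts = Counter()
    for _ in range(n_samples):
        out = model.generate(ids, do_sample=True, max_new_tokens=5,
                             pad_token_id=tokenizer.eos_token_id)
        text = tokenizer.decode(out[0, ids.size(1):]).lower()
        for label in LABELS:
            if label in text:
                counts[label] += 1
                break
    total = sum(counts.values()) or 1
    return [counts[label] / total for label in LABELS]
```
The resulting three-way distribution can then be compared with the per-example human label distribution, for instance with a divergence measure; the abstract does not name the comparison metric, so that choice is left open here.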
Related papers
- Bayesian Statistical Modeling with Predictors from LLMs [5.5711773076846365]
State-of-the-art large language models (LLMs) have shown impressive performance on a variety of benchmark tasks.
This raises questions about the human-likeness of LLM-derived information.
arXiv Detail & Related papers (2024-06-13T11:33:30Z)
- What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages [78.1866280652834]
Large language models (LMs) are distributions over strings.
We investigate the learnability of regular LMs (RLMs) by RNN and Transformer LMs.
We find that the complexity of the RLM rank is a strong and significant predictor of learnability for both RNNs and Transformers.
arXiv Detail & Related papers (2024-06-06T17:34:24Z)
- Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach [0.0]
Large Language Models (LLMs) sometimes produce inaccurate outputs, known as hallucinations.
This paper introduces a supervised learning approach employing only four numerical features derived from tokens and vocabulary probabilities obtained from other evaluators.
The method yields promising results, surpassing state-of-the-art outcomes in multiple tasks across three different benchmarks.
arXiv Detail & Related papers (2024-05-30T03:00:47Z)
- YAYI 2: Multilingual Open-Source Large Language Models [53.92832054643197]
We propose YAYI 2, including both base and chat models, with 30 billion parameters.
YAYI 2 is pre-trained from scratch on a multilingual corpus which contains 2.65 trillion tokens filtered by our pre-training data processing pipeline.
The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback.
arXiv Detail & Related papers (2023-12-22T17:34:47Z)
- CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation [94.59630161324013]
We propose CoAnnotating, a novel paradigm for Human-LLM co-annotation of unstructured texts at scale.
Our empirical study on different datasets shows CoAnnotating to be an effective means of allocating annotation work, with up to a 21% performance improvement over a random baseline.
arXiv Detail & Related papers (2023-10-24T08:56:49Z)
- Aligning Large Language Models with Human: A Survey [53.6014921995006]
Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks.
Despite their notable performance, these models are prone to certain limitations, such as misunderstanding human instructions, generating potentially biased content, or producing factually incorrect information.
This survey presents a comprehensive overview of these alignment technologies.
arXiv Detail & Related papers (2023-07-24T17:44:58Z)
- Can Large Language Models Transform Computational Social Science? [79.62471267510963]
Large Language Models (LLMs) are capable of performing many language processing tasks zero-shot (without training data).
This work provides a road map for using LLMs as Computational Social Science tools.
arXiv Detail & Related papers (2023-04-12T17:33:28Z)
- Benchmarking Large Language Models for News Summarization [79.37850439866938]
Large language models (LLMs) have shown promise for automatic summarization but the reasons behind their successes are poorly understood.
We find that instruction tuning, not model size, is the key to LLMs' zero-shot summarization capability.
arXiv Detail & Related papers (2023-01-31T18:46:19Z)
- Event knowledge in large language models: the gap between the impossible and the unlikely [46.540380831486125]
We show that pre-trained large language models (LLMs) possess substantial event knowledge.
They almost always assign higher likelihood to possible vs. impossible events.
However, they show less consistent preferences for likely vs. unlikely events.
arXiv Detail & Related papers (2022-12-02T23:43:18Z)
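As a concrete illustration of the event-knowledge comparison above, the sketch below scores a possible and an impossible event description by total sequence log-likelihood under a small causal LM. The example sentences and the GPT-2 stand-in are assumptions for illustration, not the materials used in that paper.
```python
# Minimal sketch: compare an LM's total log-likelihood for a possible vs. an
# impossible event description. Model and sentences are illustrative stand-ins.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def sentence_logprob(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss        # mean NLL over predicted tokens
    return -loss.item() * (ids.size(1) - 1)       # total log-probability of the sentence

possible = "The teacher bought the laptop."
impossible = "The laptop bought the teacher."
# An LM with basic event knowledge should assign the possible event a higher score.
print(sentence_logprob(possible), sentence_logprob(impossible))
```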
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.