Biases in Large Language Model-Elicited Text: A Case Study in Natural Language Inference
- URL: http://arxiv.org/abs/2503.05047v1
- Date: Thu, 06 Mar 2025 23:49:30 GMT
- Title: Biases in Large Language Model-Elicited Text: A Case Study in Natural Language Inference
- Authors: Grace Proebsting, Adam Poliak
- Abstract summary: We test whether NLP datasets created with Large Language Models (LLMs) contain annotation artifacts and social biases. We recreate a portion of the Stanford Natural Language Inference corpus using GPT-4, Llama-2 70b Chat, and Mistral 7b Instruct.
- Score: 3.0804372027733202
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We test whether NLP datasets created with Large Language Models (LLMs) contain annotation artifacts and social biases, as NLP datasets elicited from crowdsourced workers do. We recreate a portion of the Stanford Natural Language Inference corpus using GPT-4, Llama-2 70b Chat, and Mistral 7b Instruct. We train hypothesis-only classifiers to determine whether LLM-elicited NLI datasets contain annotation artifacts. Next, we use pointwise mutual information to identify the words in each dataset that are associated with gender-, race-, and age-related terms. On our LLM-generated NLI datasets, fine-tuned BERT hypothesis-only classifiers achieve between 86% and 96% accuracy. Our analyses further characterize the annotation artifacts and stereotypical biases in LLM-generated datasets.
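The headline probe is straightforward to reproduce in outline. Below is a minimal sketch, assuming a Hugging Face-style setup with bert-base-uncased and SNLI-format fields (hypothesis, label); the hyperparameters are illustrative rather than the authors', and the same loop would be pointed at an LLM-generated recreation of SNLI.

```python
# Minimal hypothesis-only probe (a sketch, not the authors' released code):
# the premise is discarded entirely, so accuracy far above the ~33%
# majority-class baseline can only come from artifacts in the hypotheses.
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)  # entailment / neutral / contradiction

# Shown on SNLI for concreteness; swap in an LLM-generated recreation here.
data = load_dataset("snli").filter(lambda ex: ex["label"] != -1)
data = data.map(lambda ex: tok(ex["hypothesis"], truncation=True), batched=True)

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="hypothesis-only-bert",
                           per_device_train_batch_size=32,
                           num_train_epochs=3),
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    tokenizer=tok,  # enables dynamic padding via the default collator
    compute_metrics=accuracy,
)
trainer.train()
print(trainer.evaluate())
```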
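The PMI analysis is similarly compact. A minimal sketch, assuming whitespace tokenization, hypothesis-level co-occurrence, and a small illustrative gendered-term list; the paper's actual term lists, smoothing, and preprocessing may differ.

```python
# Sketch of PMI between hypothesis words and a demographic term group:
# PMI(w, g) = log2( p(w, g) / (p(w) * p(g)) ), estimated over whole
# hypotheses. Positive scores mark words that co-occur with the group's
# terms more often than chance. The term list below is illustrative only.
import math
from collections import Counter

GENDER_TERMS = {"man", "woman", "he", "she", "his", "her"}

def pmi_with_group(hypotheses, group_terms=GENDER_TERMS):
    word_counts, joint_counts = Counter(), Counter()
    group_docs = 0
    for h in hypotheses:
        tokens = set(h.lower().split())
        has_group = bool(tokens & group_terms)
        group_docs += has_group
        for w in tokens:
            word_counts[w] += 1
            if has_group:
                joint_counts[w] += 1
    n = len(hypotheses)
    scores = {}
    for w, c in word_counts.items():
        if joint_counts[w] == 0:
            continue  # skip words never seen alongside the group's terms
        scores[w] = math.log2((joint_counts[w] / n) / ((c / n) * (group_docs / n)))
    return sorted(scores.items(), key=lambda kv: -kv[1])

# e.g. pmi_with_group(hypotheses)[:20] for the 20 most gender-associated words
```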
Related papers
- Idiosyncrasies in Large Language Models [54.26923012617675]
We unveil and study idiosyncrasies in Large Language Models (LLMs). We find that fine-tuning existing text embedding models on LLM-generated texts yields excellent classification accuracy. We leverage LLMs as judges to generate detailed, open-ended descriptions of each model's idiosyncrasies.
arXiv Detail & Related papers (2025-02-17T18:59:02Z) - Hypothesis-only Biases in Large Language Model-Elicited Natural Language Inference [3.0804372027733202]
We recreate a portion of the Stanford NLI corpus using GPT-4, Llama-2 and Mistral 7b.
We train hypothesis-only classifiers to determine whether LLM-elicited hypotheses contain annotation artifacts.
Our analysis provides empirical evidence that well-attested biases in NLI can persist in LLM-generated data.
arXiv Detail & Related papers (2024-10-11T17:09:22Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - Exploring Factual Entailment with NLI: A News Media Study [0.9208007322096533]
We explore the relationship between factuality and Natural Language Inference (NLI) by introducing FactRel.
Our analysis shows that 84% of factually supporting pairs and 63% of factually undermining pairs do not amount to NLI entailment or contradiction.
We experiment with models for pairwise classification on the new dataset, and find that in some cases, generating synthetic data with GPT-4 on the basis of the annotated dataset can improve performance.
arXiv Detail & Related papers (2024-06-24T17:47:55Z) - Natural Language Processing for Dialects of a Language: A Survey [56.93337350526933]
State-of-the-art natural language processing (NLP) models are trained on massive training corpora and report superlative performance on evaluation datasets. This survey delves into an important attribute of these datasets: the dialect of a language. Motivated by the performance degradation of NLP models on dialectal datasets and its implications for the equity of language technologies, we survey past research in NLP for dialects in terms of datasets and approaches.
arXiv Detail & Related papers (2024-01-11T03:04:38Z) - Native Language Identification with Large Language Models [60.80452362519818]
We show that GPT models are proficient at Native Language Identification (NLI) classification, with GPT-4 setting a new performance record of 91.7% on the TOEFL11 benchmark test set in a zero-shot setting.
We also show that unlike previous fully-supervised settings, LLMs can perform NLI without being limited to a set of known classes.
arXiv Detail & Related papers (2023-12-13T00:52:15Z) - Jamp: Controlled Japanese Temporal Inference Dataset for Evaluating Generalization Capacity of Language Models [18.874880342410876]
We present Jamp, a Japanese benchmark focused on temporal inference.
Our dataset includes a range of temporal inference patterns, which enables us to conduct fine-grained analysis.
We evaluate the generalization capacities of monolingual/multilingual LMs by splitting our dataset based on tense fragments.
arXiv Detail & Related papers (2023-06-19T07:00:14Z) - WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation [101.00109827301235]
We introduce a novel paradigm for dataset creation based on human and machine collaboration.
We use dataset cartography to automatically identify examples that demonstrate challenging reasoning patterns, and instruct GPT-3 to compose new examples with similar patterns.
The resulting dataset, WANLI, consists of 108,357 natural language inference (NLI) examples that present unique empirical strengths.
arXiv Detail & Related papers (2022-01-16T03:13:49Z) - Automatically Identifying Semantic Bias in Crowdsourced Natural Language Inference Datasets [78.6856732729301]
We introduce a model-driven, unsupervised technique to find "bias clusters" in a learned embedding space of hypotheses in NLI datasets.
Interventions and additional rounds of labeling can then be performed to ameliorate the semantic bias of a dataset's hypothesis distribution (a rough sketch of the clustering step appears below).
arXiv Detail & Related papers (2021-12-16T22:49:01Z)
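The "bias clusters" idea can be approximated with off-the-shelf tools. Below is a rough sketch, not the paper's exact method: it assumes sentence-transformers embeddings, plain k-means, and an arbitrary label-skew threshold for flagging a cluster as suspicious.

```python
# Approximate sketch of finding "bias clusters": embed hypotheses, cluster
# the embeddings, and flag clusters whose gold-label distribution is heavily
# skewed (a hint that surface patterns in hypotheses predict the label).
# Model name, cluster count, and the 0.8 threshold are illustrative choices.
from collections import Counter
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def find_bias_clusters(hypotheses, labels, n_clusters=50, skew=0.8):
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = encoder.encode(hypotheses)
    assignments = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)
    flagged = []
    for c in range(n_clusters):
        cluster_labels = [l for l, a in zip(labels, assignments) if a == c]
        if not cluster_labels:
            continue
        majority, count = Counter(cluster_labels).most_common(1)[0]
        if count / len(cluster_labels) > skew:
            flagged.append((c, majority, count / len(cluster_labels)))
    return flagged
```

Flagged clusters can then be inspected by hand to decide whether relabeling or targeted data collection is warranted.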