Agree, Disagree, Explain: Decomposing Human Label Variation in NLI through the Lens of Explanations
- URL: http://arxiv.org/abs/2510.16458v1
- Date: Sat, 18 Oct 2025 11:40:29 GMT
- Title: Agree, Disagree, Explain: Decomposing Human Label Variation in NLI through the Lens of Explanations
- Authors: Pingjun Hong, Beiduo Chen, Siyao Peng, Marie-Catherine de Marneffe, Benjamin Roth, Barbara Plank
- Abstract summary: Natural Language Inference datasets often exhibit human label variation. One explanation-based approach to understanding this variation is the LiTEx taxonomy, which categorizes free-text explanations in English into reasoning types. This paper broadens the scope by examining how annotators may diverge not only in the reasoning type but also in the labeling step.
- Score: 34.04363206545923
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural Language Inference datasets often exhibit human label variation. To better understand these variations, explanation-based approaches analyze the underlying reasoning behind annotators' decisions. One such approach is the LiTEx taxonomy, which categorizes free-text explanations in English into reasoning types. However, previous work applying such taxonomies has focused on within-label variation: cases where annotators agree on the final NLI label but provide different explanations. In contrast, this paper broadens the scope by examining how annotators may diverge not only in the reasoning type but also in the labeling step. We use explanations as a lens to decompose the reasoning process underlying NLI annotation and to analyze individual differences. We apply LiTEx to two NLI English datasets and align annotation variation across multiple aspects: NLI label agreement, explanation similarity, and taxonomy agreement, with an additional compounding factor of annotators' selection bias. We observe instances where annotators disagree on the label but provide highly similar explanations, suggesting that surface-level disagreement may mask underlying agreement in interpretation. Moreover, our analysis reveals individual preferences in explanation strategies and label choices. These findings highlight that agreement in reasoning types better reflects the semantic similarity of free-text explanations than label agreement alone. Our findings underscore the richness of reasoning-based explanations and the need for caution in treating labels as ground truth.
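As a rough illustration of the alignment described in the abstract, the sketch below contrasts pairwise NLI label agreement with the cosine similarity of explanation embeddings. The annotation records and the 3-d embeddings are hypothetical stand-ins (e.g., for sentence embeddings of the free-text explanations), not the paper's data or code.

```python
# Toy decomposition: for annotator pairs on the same item, compare NLI label
# agreement with the cosine similarity of their explanation embeddings.
from itertools import combinations
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# (item_id, nli_label, explanation_embedding) -- hypothetical records.
annotations = [
    ("item1", "entailment", np.array([0.90, 0.10, 0.00])),
    ("item1", "neutral",    np.array([0.88, 0.12, 0.02])),  # label disagreement, similar explanation
    ("item1", "entailment", np.array([0.10, 0.20, 0.95])),  # label agreement, dissimilar explanation
]

by_item = {}
for item, label, emb in annotations:
    by_item.setdefault(item, []).append((label, emb))

buckets = {"labels agree": [], "labels disagree": []}
for anns in by_item.values():
    for (l1, e1), (l2, e2) in combinations(anns, 2):
        key = "labels agree" if l1 == l2 else "labels disagree"
        buckets[key].append(cosine(e1, e2))

for key, sims in buckets.items():
    print(f"{key}: mean explanation similarity = {np.mean(sims):.3f}")
```

On this toy input, the "labels disagree" pair has the higher explanation similarity, mirroring the abstract's observation that surface-level label disagreement can mask underlying agreement in interpretation.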
Related papers
- Threading the Needle: Reweaving Chain-of-Thought Reasoning to Explain Human Label Variation [60.18907916989796]
Large Language Models (LLMs) generate chains of thought (CoTs) before giving the final answer.
We propose a novel pipeline enriched with linguistically-grounded discourse segmenters to extract supporting and opposing statements for each answer option.
We also propose a rank-based HLV evaluation framework that prioritizes the ranking of answers over exact scores.
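A rank-based HLV evaluation can be pictured as scoring a model by how well it orders the answer options relative to the human label distribution, rather than by matching exact probabilities. The toy sketch below uses Spearman correlation for this; the metric choice and the numbers are illustrative assumptions, not the framework from the paper.

```python
# Hypothetical rank-based comparison: model B is far off in exact scores but
# preserves the human ranking, while model A is closer pointwise.
import numpy as np
from scipy.stats import spearmanr

# Toy human label distribution over (entailment, neutral, contradiction).
human = np.array([0.6, 0.3, 0.1])

model_a = np.array([0.34, 0.33, 0.33])   # close in scores, weak ranking signal
model_b = np.array([0.98, 0.015, 0.005]) # far in scores, same ranking as humans

for name, probs in [("A", model_a), ("B", model_b)]:
    rho, _ = spearmanr(human, probs)
    l1 = float(np.abs(human - probs).sum())
    print(f"model {name}: rank agreement = {rho:.2f}, L1 distance = {l1:.2f}")
```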
arXiv Detail & Related papers (2025-05-29T11:47:18Z)
- LiTEx: A Linguistic Taxonomy of Explanations for Understanding Within-Label Variation in Natural Language Inference [34.648042619016394]
We introduce LiTEx, a linguistically-informed taxonomy for categorizing free-text explanations in English.
Using this taxonomy, we annotate a subset of the e-SNLI dataset, validate the taxonomy's reliability, and analyze how it aligns with NLI labels, highlights, and explanations.
arXiv Detail & Related papers (2025-05-28T20:32:48Z)
- Ecologically Valid Explanations for Label Variation in NLI [27.324994764803808]
We build LiveNLI, an English dataset of 1,415 ecologically valid explanations (annotators explain the NLI labels they chose) for 122 MNLI items.
LiveNLI explanations confirm that people can systematically vary in their interpretation and highlight within-label variation.
This suggests that explanations are crucial for navigating label interpretations in general.
arXiv Detail & Related papers (2023-10-20T22:52:19Z)
- PropSegmEnt: A Large-Scale Corpus for Proposition-Level Segmentation and Entailment Recognition [63.51569687229681]
We argue for the need to recognize the textual entailment relation of each proposition in a sentence individually.
We propose PropSegmEnt, a corpus of over 45K propositions annotated by expert human raters.
Our dataset structure resembles the tasks of (1) segmenting sentences within a document into a set of propositions, and (2) classifying the entailment relation of each proposition with respect to a different yet topically-aligned document.
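Concretely, an item in such a corpus can be pictured as one proposition paired with an entailment label against a target document. The record layout below is a guessed illustration of the two subtasks, not PropSegmEnt's actual schema.

```python
# Hypothetical record layout for proposition-level entailment, mirroring the
# two subtasks: (1) segment a sentence into propositions, (2) label each
# proposition against a topically-aligned target document.
from dataclasses import dataclass
from typing import Literal

@dataclass
class PropositionEntailment:
    source_sentence: str
    proposition: str   # one proposition segmented from the sentence
    target_doc_id: str # topically-aligned document to check against
    label: Literal["entailment", "contradiction", "neutral"]

example = PropositionEntailment(
    source_sentence="The film premiered in 2019 and won two awards.",
    proposition="The film premiered in 2019.",
    target_doc_id="doc_042",
    label="entailment",
)
print(example.proposition, "->", example.label)
```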
arXiv Detail & Related papers (2022-12-21T04:03:33Z)
- Investigating Reasons for Disagreement in Natural Language Inference [9.002603216969154]
We develop a taxonomy of disagreement sources spanning 3 high-level classes.
Some disagreements are due to uncertainty in the sentence meaning, others to annotator biases and task artifacts.
We explore two modeling approaches for detecting items with potential disagreement.
arXiv Detail & Related papers (2022-09-07T18:01:39Z)
- Weakly Supervised Explainable Phrasal Reasoning with Neural Fuzzy Logic [24.868479255640718]
Natural language inference aims to determine the logical relationship between two sentences among the target labels Entailment, Contradiction, and Neutral.
Deep learning models have become a prevailing approach to NLI, but they lack interpretability and explainability.
In this work, we address explainability for NLI through weakly supervised logical reasoning.
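As a rough picture of how fuzzy logic can compose phrase-level decisions into a sentence-level NLI label, the sketch below uses the product t-norm for AND and the probabilistic sum for OR. The composition rules are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative fuzzy-logic composition of phrasal NLI scores into a
# sentence-level decision. AND = product t-norm, OR = probabilistic sum.
import numpy as np

def fuzzy_and(scores):  # product t-norm
    return float(np.prod(scores))

def fuzzy_or(scores):   # probabilistic sum: 1 - prod(1 - s)
    return 1.0 - float(np.prod(1.0 - np.asarray(scores)))

# Per-phrase probabilities of entailment / contradiction for three phrases.
phrase_entail = [0.90, 0.80, 0.95]
phrase_contra = [0.05, 0.10, 0.02]

# One plausible rule set (an assumption): the sentence entails only if every
# phrase entails; it contradicts if any phrase contradicts.
p_entail = fuzzy_and(phrase_entail)
p_contra = fuzzy_or(phrase_contra)
p_neutral = max(0.0, 1.0 - p_entail - p_contra)
print({"E": round(p_entail, 3), "C": round(p_contra, 3), "N": round(p_neutral, 3)})
```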
arXiv Detail & Related papers (2021-09-18T13:04:23Z)
- A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning [111.05365744744437]
Unsupervised contrastive learning labels crops of the same image as positives, and other image crops as negatives.
In this work, we first prove that for contrastive learning, inaccurate label assignment heavily impairs its generalization for semantic instance discrimination.
Inspired by this theory, we propose a novel self-labeling refinement approach for contrastive learning.
arXiv Detail & Related papers (2021-06-28T14:24:52Z)
- Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representations to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z)
- Debiased Contrastive Learning [64.98602526764599]
We develop a debiased contrastive objective that corrects for the sampling of same-label datapoints.
Empirically, the proposed objective consistently outperforms the state-of-the-art for representation learning in vision, language, and reinforcement learning benchmarks.
arXiv Detail & Related papers (2020-07-01T04:25:24Z)
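For context on the entry above, here is a minimal sketch of a debiased InfoNCE-style estimator, written from the general idea of correcting sampled negatives for the probability tau_plus that they actually share the anchor's label; it should not be read as the paper's exact implementation.

```python
# Minimal sketch of a debiased contrastive (InfoNCE-style) loss: the negative
# term is corrected for the chance tau_plus that a sampled "negative" shares
# the anchor's label. Illustrative only, not the paper's exact code.
import numpy as np

def debiased_infonce(pos_sim, neg_sims, tau_plus=0.1, t=0.5):
    """pos_sim: similarity to the positive; neg_sims: similarities to sampled negatives."""
    neg_sims = np.asarray(neg_sims, dtype=float)
    n = len(neg_sims)
    pos_exp = np.exp(pos_sim / t)
    # Subtract the expected contribution of false negatives, then clamp the
    # estimator at its theoretical minimum e^{-1/t}.
    g = (np.exp(neg_sims / t).mean() - tau_plus * pos_exp) / (1.0 - tau_plus)
    g = max(g, np.exp(-1.0 / t))
    return float(-np.log(pos_exp / (pos_exp + n * g)))

# Toy usage: one anchor with one positive and five sampled negatives.
rng = np.random.default_rng(0)
print(f"debiased loss: {debiased_infonce(0.8, rng.uniform(-0.2, 0.4, 5)):.3f}")
```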