Related papers: Finding Dataset Shortcuts with Grammar Induction

Finding Dataset Shortcuts with Grammar Induction

URL: http://arxiv.org/abs/2210.11560v1
Date: Thu, 20 Oct 2022 19:54:11 GMT
Title: Finding Dataset Shortcuts with Grammar Induction
Authors: Dan Friedman, Alexander Wettig, Danqi Chen
Abstract summary: We propose to use probabilistic grammars to characterize and discover shortcuts in NLP datasets. Specifically, we use a context-free grammar to model patterns in sentence classification datasets and use a synchronous context-free grammar to model datasets involving sentence pairs. The resulting grammars reveal interesting shortcut features in a number of datasets, including both simple and high-level features.
Score: 85.47127659108637
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Many NLP datasets have been found to contain shortcuts: simple decision rules that achieve surprisingly high accuracy. However, it is difficult to discover shortcuts automatically. Prior work on automatic shortcut detection has focused on enumerating features like unigrams or bigrams, which can find only low-level shortcuts, or relied on post-hoc model interpretability methods like saliency maps, which reveal qualitative patterns without a clear statistical interpretation. In this work, we propose to use probabilistic grammars to characterize and discover shortcuts in NLP datasets. Specifically, we use a context-free grammar to model patterns in sentence classification datasets and use a synchronous context-free grammar to model datasets involving sentence pairs. The resulting grammars reveal interesting shortcut features in a number of datasets, including both simple and high-level features, and automatically identify groups of test examples on which conventional classifiers fail. Finally, we show that the features we discover can be used to generate diagnostic contrast examples and incorporated into standard robust optimization methods to improve worst-group accuracy.

Related papers

Explaining Datasets in Words: Statistical Models with Natural Language Parameters [66.69456696878842]
We introduce a family of statistical models -- including clustering, time series, and classification models -- parameterized by natural language predicates. We apply our framework to a wide range of problems: taxonomizing user chat dialogues, characterizing how they evolve across time, finding categories where one language model is better than the other.
arXiv Detail & Related papers (2024-09-13T01:40:20Z)
Learning to Detour: Shortcut Mitigating Augmentation for Weakly Supervised Semantic Segmentation [7.5856806269316825]
Weakly supervised semantic segmentation (WSSS) employing weak forms of labels has been actively studied to alleviate the annotation cost of acquiring pixel-level labels. We propose shortcut mitigating augmentation (SMA) for WSSS, which generates synthetic representations of object-background combinations not seen in the training data to reduce the use of shortcut features.
arXiv Detail & Related papers (2024-05-28T13:07:35Z)
Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model [86.9619638550683]
Vision-language foundation models have exhibited remarkable success across a multitude of downstream tasks due to their scalability on extensive image-text paired data. However, these models display significant limitations when applied to downstream tasks, such as fine-grained image classification, as a result of decision shortcuts''
arXiv Detail & Related papers (2024-03-01T09:01:53Z)
Understanding and Mitigating Classification Errors Through Interpretable Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors. We propose to discover those patterns of tokens that distinguish correct and erroneous predictions. We show that our method, Premise, performs well in practice.
arXiv Detail & Related papers (2023-11-18T00:24:26Z)
M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [103.6153593636399]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning) It introduces open words from the WordNet to extend the range of words forming the prompt texts from only closed-set label words to more, and thus prompts are tuned in a simulated open-set scenario. Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
Discover, Explanation, Improvement: An Automatic Slice Detection Framework for Natural Language Processing [72.14557106085284]
slice detection models (SDM) automatically identify underperforming groups of datapoints. This paper proposes a benchmark named "Discover, Explain, improve (DEIM)" for classification NLP tasks. Our evaluation shows that Edisa can accurately select error-prone datapoints with informative semantic features.
arXiv Detail & Related papers (2022-11-08T19:00:00Z)
Automatic Language Identification for Celtic Texts [0.0]
This work addresses the identification of the related low-resource languages on the example of the Celtic language family. We collected a new dataset including Irish, Scottish, Welsh and English records. We tested supervised models such as SVM and neural networks with traditional statistical features alongside the output of clustering, autoencoder, and topic modelling methods.
arXiv Detail & Related papers (2022-03-09T16:04:13Z)
Label-Descriptive Patterns and their Application to Characterizing Classification Errors [31.272875287136426]
State-of-the-art deep learning methods achieve human-like performance on many tasks, but make errors nevertheless. Characterizing these errors in easily interpretable terms gives insight into whether a model is prone to making systematic errors, but also gives a way to act and improve the model. In this paper we propose a method that allows us to do so for arbitrary classifiers by mining a small set of patterns that together succinctly describe the input data that is partitioned according to correctness of prediction.
arXiv Detail & Related papers (2021-10-18T19:42:21Z)
Low-rank Dictionary Learning for Unsupervised Feature Selection [11.634317251468968]
We introduce a novel unsupervised feature selection approach by applying dictionary learning ideas in a low-rank representation. A unified objective function for unsupervised feature selection is proposed in a sparse way by an $ell_2,1$-norm regularization. Our experimental findings reveal that the proposed method outperforms the state-of-the-art algorithm.
arXiv Detail & Related papers (2021-06-21T13:39:10Z)
Reducing Confusion in Active Learning for Part-Of-Speech Tagging [100.08742107682264]
Active learning (AL) uses a data selection algorithm to select useful training samples to minimize annotation cost. We study the problem of selecting instances which maximally reduce the confusion between particular pairs of output tags. Our proposed AL strategy outperforms other AL strategies by a significant margin.
arXiv Detail & Related papers (2020-11-02T06:24:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.