Automatic Counterfactual Augmentation for Robust Text Classification
Based on Word-Group Search
- URL: http://arxiv.org/abs/2307.01214v1
- Date: Sat, 1 Jul 2023 02:26:34 GMT
- Title: Automatic Counterfactual Augmentation for Robust Text Classification
Based on Word-Group Search
- Authors: Rui Song, Fausto Giunchiglia, Yingji Li, Hao Xu
- Abstract summary: In general, a keyword is regarded as a shortcut if it creates a superficial association with the label, resulting in a false prediction.
We propose a new Word-Group mining approach, which captures the causal effect of any keyword combination and orders the combinations that most affect the prediction.
Our approach is based on effective post-hoc analysis and beam search, which ensure effective mining and reduce the complexity.
- Score: 12.894936637198471
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although large-scale pre-trained language models have achieved striking
results for text classification, recent work has raised concerns about the
challenge of shortcut learning. In general, a keyword is regarded as a shortcut
if it creates a superficial association with the label, resulting in a false
prediction. Conversely, shortcut learning can be mitigated if the model relies
on robust causal features that help produce sound predictions. To this end,
many studies have explored post-hoc interpretable methods to mine shortcuts and
causal features for robustness and generalization. However, most existing
methods focus only on a single word in a sentence and do not consider
word groups, leading to wrong causal features. To solve this problem, we propose
a new Word-Group mining approach, which captures the causal effect of any
keyword combination and orders the combinations that most affect the
prediction. Our approach is based on effective post-hoc analysis and beam search,
which ensure effective mining and reduce the complexity. Then, we build a
counterfactual augmentation method based on the multiple word-groups, and use
an adaptive voting mechanism to learn the influence of different augmented
samples on the prediction results, so as to force the model to pay attention to
effective causal features. We demonstrate the effectiveness of the proposed
method on several tasks across 8 affective review datasets and 4 toxic language
datasets, including cross-domain text classification, text attack and gender
fairness testing.
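The word-group mining idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy classifier `toy_positive_prob`, the `[MASK]` replacement, the use of the absolute change in predicted probability as a stand-in for the causal effect, and the parameters `beam_width` and `max_group_size` are all assumptions made for illustration. The sketch grows word groups one token at a time and keeps only the highest-scoring groups at each step, which is the role beam search plays in limiting the number of combinations examined.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

MASK = "[MASK]"  # assumed placeholder for a removed word


def toy_positive_prob(tokens: Sequence[str]) -> float:
    """Hypothetical stand-in classifier: P(positive) grows with cue words."""
    hits = sum(1 for t in tokens if t in {"great", "fun"})
    return min(1.0, 0.5 + 0.2 * hits)


def mine_word_groups(
    tokens: Sequence[str],
    predict: Callable[[Sequence[str]], float],
    beam_width: int = 3,
    max_group_size: int = 2,
) -> List[Tuple[List[str], float]]:
    """Rank word groups by how much masking them changes the prediction."""
    base = predict(tokens)

    def effect(group: FrozenSet[int]) -> float:
        # Assumed effect measure: absolute change in predicted probability
        # when every word in the group is masked at once.
        masked = [MASK if i in group else t for i, t in enumerate(tokens)]
        return abs(base - predict(masked))

    scored: Dict[FrozenSet[int], float] = {}
    beam: List[FrozenSet[int]] = [frozenset()]
    for _ in range(max_group_size):
        # Extend every group in the beam by one token position.
        candidates = {g | {i} for g in beam for i in range(len(tokens)) if i not in g}
        for g in candidates:
            if g not in scored:
                scored[g] = effect(g)
        # Keep only the beam_width groups with the largest effect.
        beam = sorted(candidates, key=lambda g: (-scored[g], sorted(g)))[:beam_width]

    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    return [([tokens[i] for i in sorted(g)], e) for g, e in ranked]


if __name__ == "__main__":
    for words, score in mine_word_groups("a great and fun movie".split(), toy_positive_prob)[:3]:
        print(words, round(score, 3))
```

In this toy setting the pair "great" and "fun" ranks above either word alone, which is the kind of joint effect a single-word analysis would miss. A real pipeline would replace `toy_positive_prob` with a trained classifier and could use the top-ranked groups as the spans to edit when building counterfactual samples.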
Related papers
- CAST: Corpus-Aware Self-similarity Enhanced Topic modelling [16.562349140796115]
We introduce CAST: Corpus-Aware Self-similarity Enhanced Topic modelling, a novel topic modelling method.
We find self-similarity to be an effective metric to prevent functional words from acting as candidate topic words.
Our approach significantly enhances the coherence and diversity of generated topics, as well as the topic model's ability to handle noisy data.
arXiv Detail & Related papers (2024-10-19T15:27:11Z) - Improving Language Models Meaning Understanding and Consistency by
Learning Conceptual Roles from Dictionary [65.268245109828]
Non-human-like behaviour of contemporary pre-trained language models (PLMs) is a leading cause undermining their trustworthiness.
A striking phenomenon is the generation of inconsistent predictions, which produces contradictory results.
We propose a practical approach that alleviates the inconsistent behaviour issue by improving PLM awareness.
arXiv Detail & Related papers (2023-10-24T06:15:15Z) - Improving the Robustness of Summarization Systems with Dual Augmentation [68.53139002203118]
A robust summarization system should be able to capture the gist of the document, regardless of the specific word choices or noise in the input.
We first explore the summarization models' robustness against perturbations including word-level synonym substitution and noise.
We propose a SummAttacker, which is an efficient approach to generating adversarial samples based on language models.
arXiv Detail & Related papers (2023-06-01T19:04:17Z) - Towards preserving word order importance through Forced Invalidation [80.33036864442182]
We show that pre-trained language models are insensitive to word order.
We propose Forced Invalidation to help preserve the importance of word order.
Our experiments demonstrate that Forced Invalidation significantly improves the sensitivity of the models to word order.
arXiv Detail & Related papers (2023-04-11T13:42:10Z) - Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z) - Embedding Compression for Text Classification Using Dictionary Screening [8.308609870092884]
We propose a dictionary screening method for embedding compression in text classification tasks.
The proposed method leads to significant reductions in terms of parameters, average text sequence, and dictionary size.
arXiv Detail & Related papers (2022-11-23T05:32:13Z) - Keywords and Instances: A Hierarchical Contrastive Learning Framework
Unifying Hybrid Granularities for Text Generation [59.01297461453444]
We propose a hierarchical contrastive learning mechanism, which can unify the semantic meaning of hybrid granularities in the input text.
Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.
arXiv Detail & Related papers (2022-05-26T13:26:03Z) - Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious
Feature-Label Correlation [44.319739489968164]
Deep neural networks often take dataset biases as a shortcut to make decisions rather than understand tasks.
In this study, we focus on the spurious correlation between word features and labels that models learn from the biased data distribution.
We propose a training strategy Less-Learn-Shortcut (LLS): our strategy quantifies the biased degree of the biased examples and down-weights them accordingly.
arXiv Detail & Related papers (2022-05-25T09:08:35Z) - Learning-based Hybrid Local Search for the Hard-label Textual Attack [53.92227690452377]
We consider a rarely investigated but more rigorous setting, namely hard-label attack, in which the attacker could only access the prediction label.
Based on this observation, we propose a novel hard-label attack, called Learning-based Hybrid Local Search (LHLS) algorithm.
Our LHLS significantly outperforms existing hard-label attacks regarding the attack performance as well as adversary quality.
arXiv Detail & Related papers (2022-01-20T14:16:07Z) - Experiments with adversarial attacks on text genres [0.0]
Neural models based on pre-trained transformers, such as BERT or XLM-RoBERTa, demonstrate SOTA results in many NLP tasks.
We show that embedding-based algorithms, which can replace some of the "most significant" words with words similar to them, have the ability to influence model predictions in a significant proportion of cases.
arXiv Detail & Related papers (2021-07-05T19:37:59Z) - Improved and Efficient Text Adversarial Attacks using Target Information [34.50272230153329]
There has been growing interest in studying adversarial examples on natural language models in the black-box setting.
A new approach was introduced that addresses this problem through interpretable learning to learn the word ranking instead of the previous expensive search.
The main advantage of this approach is that it achieves attack rates comparable to the state-of-the-art methods, yet is faster and uses fewer queries.
arXiv Detail & Related papers (2021-04-27T21:25:55Z)