Exploring the Efficacy of Automatically Generated Counterfactuals for
Sentiment Analysis
- URL: http://arxiv.org/abs/2106.15231v2
- Date: Wed, 30 Jun 2021 04:56:27 GMT
- Title: Exploring the Efficacy of Automatically Generated Counterfactuals for
Sentiment Analysis
- Authors: Linyi Yang, Jiazheng Li, P\'adraig Cunningham, Yue Zhang, Barry Smyth,
Ruihai Dong
- Abstract summary: We propose an approach to automatically generating counterfactual data for data augmentation and explanation.
A comprehensive evaluation on several different datasets and using a variety of state-of-the-art benchmarks demonstrate how our approach can achieve significant improvements in model performance.
- Score: 17.811597734603144
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: While state-of-the-art NLP models have been achieving the excellent
performance of a wide range of tasks in recent years, important questions are
being raised about their robustness and their underlying sensitivity to
systematic biases that may exist in their training and test data. Such issues
come to be manifest in performance problems when faced with out-of-distribution
data in the field. One recent solution has been to use counterfactually
augmented datasets in order to reduce any reliance on spurious patterns that
may exist in the original data. Producing high-quality augmented data can be
costly and time-consuming as it usually needs to involve human feedback and
crowdsourcing efforts. In this work, we propose an alternative by describing
and evaluating an approach to automatically generating counterfactual data for
data augmentation and explanation. A comprehensive evaluation on several
different datasets and using a variety of state-of-the-art benchmarks
demonstrate how our approach can achieve significant improvements in model
performance when compared to models training on the original data and even when
compared to models trained with the benefit of human-generated augmented data.
Related papers
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z) - Plots Unlock Time-Series Understanding in Multimodal Models [5.792074027074628]
This paper proposes a method that leverages the existing vision encoders of multimodal foundation models to "see" time-series data via plots.
Our empirical evaluations show that this approach outperforms providing the raw time-series data as text.
To demonstrate generalizability from synthetic tasks with clear reasoning steps to more complex, real-world scenarios, we apply our approach to consumer health tasks.
arXiv Detail & Related papers (2024-10-03T16:23:13Z) - Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models [89.88010750772413]
Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs)
Our work delves into these specific flaws associated with question-answer (Q-A) pairs, a prevalent type of synthetic data, and presents a method based on unlearning techniques to mitigate these flaws.
Our work has yielded key insights into the effective use of synthetic data, aiming to promote more robust and efficient LLM training.
arXiv Detail & Related papers (2024-06-18T08:38:59Z) - DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z) - TRIAGE: Characterizing and auditing training data for improved
regression [80.11415390605215]
We introduce TRIAGE, a novel data characterization framework tailored to regression tasks and compatible with a broad class of regressors.
TRIAGE utilizes conformal predictive distributions to provide a model-agnostic scoring method, the TRIAGE score.
We show that TRIAGE's characterization is consistent and highlight its utility to improve performance via data sculpting/filtering, in multiple regression settings.
arXiv Detail & Related papers (2023-10-29T10:31:59Z) - RLBoost: Boosting Supervised Models using Deep Reinforcement Learning [0.0]
We present RLBoost, an algorithm that uses deep reinforcement learning strategies to evaluate a particular dataset and obtain a model capable of estimating the quality of any new data.
The results of the article show that this model obtains better and more stable results than other state-of-the-art algorithms such as LOO, DataShapley or DVRL.
arXiv Detail & Related papers (2023-05-23T14:38:33Z) - Phased Data Augmentation for Training a Likelihood-Based Generative Model with Limited Data [0.0]
Generative models excel in creating realistic images, yet their dependency on extensive datasets for training presents significant challenges.
Current data-efficient methods largely focus on GAN architectures, leaving a gap in training other types of generative models.
"phased data augmentation" is a novel technique that addresses this gap by optimizing training in limited data scenarios without altering the inherent data distribution.
arXiv Detail & Related papers (2023-05-22T03:38:59Z) - Cluster-level pseudo-labelling for source-free cross-domain facial
expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER)
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z) - MAIN: Multihead-Attention Imputation Networks [4.427447378048202]
We propose a novel mechanism based on multi-head attention which can be applied effortlessly in any model.
Our method inductively models patterns of missingness in the input data in order to increase the performance of the downstream task.
arXiv Detail & Related papers (2021-02-10T13:50:02Z) - DQI: Measuring Data Quality in NLP [22.54066527822898]
We introduce a generic formula for Data Quality Index (DQI) to help dataset creators create datasets free of unwanted biases.
We show that models trained on the renovated SNLI dataset generalize better to out of distribution tasks.
arXiv Detail & Related papers (2020-05-02T12:34:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.