Reducing and Exploiting Data Augmentation Noise through Meta Reweighting
Contrastive Learning for Text Classification
- URL: http://arxiv.org/abs/2409.17474v1
- Date: Thu, 26 Sep 2024 02:19:13 GMT
- Title: Reducing and Exploiting Data Augmentation Noise through Meta Reweighting
Contrastive Learning for Text Classification
- Authors: Guanyi Mou, Yichuan Li, Kyumin Lee
- Abstract summary: We propose a novel framework to boost deep learning models' performance given augmented data/samples in text classification tasks.
We propose novel weight-dependent enqueue and dequeue algorithms to utilize augmented samples' weight/quality information effectively.
Our framework achieves an average absolute improvement of 1.6% (up to 4.3%) on Text-CNN encoders and 1.4% (up to 4.4%) on RoBERTa-base encoders.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data augmentation has proven effective in alleviating the data-hungry
problem and improving models' generalization ability. However, the quality of
augmented data can vary, especially compared with the raw/original data. To
boost deep learning models' performance given augmented data/samples in text
classification tasks, we propose a novel framework that leverages both meta
learning and contrastive learning techniques to reweight the augmented samples
and refine their feature representations based on their quality. As part of the
framework, we propose novel weight-dependent enqueue and dequeue algorithms to
effectively utilize the augmented samples' weight/quality information. Through
experiments, we show that our framework cooperates well with existing deep
learning models (e.g., RoBERTa-base and Text-CNN) and augmentation techniques
(e.g., Wordnet and Easydata) for specific supervised learning tasks. On seven
GLUE benchmark datasets, our framework achieves an average absolute improvement
of 1.6% (up to 4.3%) on Text-CNN encoders and an average of 1.4% (up to 4.4%)
on RoBERTa-base encoders over the best baseline. We present an in-depth
analysis of our framework design, revealing the non-trivial contributions of
our network components. Our code is publicly available for better
reproducibility.
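To make the weight-dependent queue idea concrete, below is a minimal Python sketch of one plausible reading: a fixed-capacity contrastive memory bank in which each augmented sample's meta-learned weight determines eviction order, so low-quality augmentations are dequeued first. The class name, capacity, and min-heap eviction policy are illustrative assumptions, not the authors' published implementation.

```python
import heapq
import itertools

class WeightedFeatureQueue:
    """Fixed-capacity memory bank whose eviction order depends on each
    augmented sample's meta-learned weight: low-weight (low-quality)
    samples are dequeued first, so high-quality features persist longer.
    A sketch of one plausible weight-dependent enqueue/dequeue scheme,
    not the paper's actual implementation."""

    def __init__(self, capacity=4096):
        self.capacity = capacity
        self._heap = []                    # min-heap keyed by sample weight
        self._counter = itertools.count()  # tie-breaker for equal weights

    def enqueue(self, feature, weight):
        """Insert a feature with its weight; evict the lowest-weight
        entry if the bank exceeds capacity."""
        heapq.heappush(self._heap, (weight, next(self._counter), feature))
        if len(self._heap) > self.capacity:
            heapq.heappop(self._heap)  # weight-dependent dequeue

    def features(self):
        """All stored features, e.g. to serve as contrastive negatives."""
        return [f for _, _, f in self._heap]

# Toy usage: the low-weight augmentation is the one evicted.
bank = WeightedFeatureQueue(capacity=2)
bank.enqueue("aug_a", 0.9)
bank.enqueue("aug_b", 0.2)
bank.enqueue("aug_c", 0.7)      # bank is full: "aug_b" (weight 0.2) is dropped
print(sorted(bank.features()))  # ['aug_a', 'aug_c']
```

In a full pipeline the stored entries would be encoder feature tensors and the weights would come from the meta reweighting step; plain strings stand in for both here.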
Related papers
- Extract More from Less: Efficient Fine-Grained Visual Recognition in Low-Data Regimes
We present a novel framework, called AD-Net, aimed at enhancing deep neural network performance on fine-grained recognition in low-data regimes.
Specifically, our approach is designed to refine learned features through self-distillation on augmented samples, mitigating harmful overfitting.
With the smallest data available, our framework shows an outstanding relative accuracy increase of up to 45%.
arXiv Detail & Related papers (2024-06-28T10:45:25Z)
- Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
Learning from preference feedback has emerged as an essential step for improving the generation quality and performance of modern language models.
In this work, we identify four core aspects of preference-based learning: preference data, learning algorithm, reward model, and policy training prompts.
Our findings indicate that all aspects are important for performance, with better preference data leading to the largest improvements.
arXiv Detail & Related papers (2024-06-13T16:17:21Z)
- An Integrated Data Processing Framework for Pretraining Foundation Models
Researchers and practitioners often have to manually curate datasets from different sources.
We propose a data processing framework that integrates a Processing Module and an Analyzing Module.
The proposed framework is easy to use and highly flexible.
arXiv Detail & Related papers (2024-02-26T07:22:51Z)
- Feedback-guided Data Synthesis for Imbalanced Classification
We introduce a framework for augmenting static datasets with useful synthetic samples.
We find that the samples must be close to the support of the real data of the task at hand, and be sufficiently diverse.
On ImageNet-LT, we achieve state-of-the-art results, with over 4 percent improvement on underrepresented classes.
arXiv Detail & Related papers (2023-09-29T21:47:57Z)
- Gradient-Boosted Based Structured and Unstructured Learning
We propose two frameworks to deal with problem settings in which both structured and unstructured data are available.
Our proposed frameworks allow joint learning on both kinds of data by integrating the paradigms of boosting models and deep neural networks.
arXiv Detail & Related papers (2023-02-28T04:16:42Z)
- Learning Customized Visual Models with Retrieval-Augmented Knowledge
We propose REACT, a framework to acquire the relevant web knowledge to build customized visual models for target domains.
We retrieve the most relevant image-text pairs from the web-scale database as external knowledge, and propose to customize the model by training only new modularized blocks while freezing all the original weights.
The effectiveness of REACT is demonstrated via extensive experiments on classification, retrieval, detection and segmentation tasks, including zero-, few-, and full-shot settings.
arXiv Detail & Related papers (2023-01-17T18:59:06Z)
- Data-Efficient Augmentation for Training Neural Networks
We propose a rigorous technique to select subsets of data points that when augmented, closely capture the training dynamics of full data augmentation.
Our method achieves 6.3x speedup on CIFAR10 and 2.2x speedup on SVHN, and outperforms the baselines by up to 10% across various subset sizes.
arXiv Detail & Related papers (2022-10-15T19:32:20Z)
- An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z)
- Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph.
In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points, as well as consistently outperforms the state-of-the-art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z)
- Generalized Reinforcement Meta Learning for Few-Shot Optimization
We present a generic and flexible Reinforcement Learning (RL) based meta-learning framework for the problem of few-shot learning.
Our framework could be easily extended to do network architecture search.
arXiv Detail & Related papers (2020-05-04T03:21:05Z)
- COLAM: Co-Learning of Deep Neural Networks and Soft Labels via Alternating Minimization
We propose the COLAM framework, which co-learns DNNs and soft labels through alternating minimization of two objectives; a rough sketch of this alternating scheme appears after this list.
arXiv Detail & Related papers (2020-04-26T17:50:20Z)
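As a loose illustration of the alternating-minimization pattern named in the COLAM entry above, the following toy sketch alternates between fitting a linear model to the current soft labels and re-blending those soft labels from the hard labels and the model's predictions. Every name, the mixing coefficient, and both update rules are assumptions for illustration, not COLAM's actual objectives.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-class data with one-hot "hard" labels.
X = rng.normal(size=(100, 5))
hard = np.eye(2)[rng.integers(0, 2, size=100)]

W = np.zeros((5, 2))   # linear model parameters
soft = hard.copy()     # soft labels, initialized from the hard labels
alpha, lr = 0.7, 0.1   # mixing coefficient and learning rate (assumed values)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for step in range(200):
    # Step 1: fix the soft labels, take a gradient step on the model
    # parameters under cross-entropy against those soft labels.
    probs = softmax(X @ W)
    W -= lr * X.T @ (probs - soft) / len(X)

    # Step 2: fix the model, update the soft labels toward a blend of
    # the original hard labels and the model's current predictions.
    soft = alpha * hard + (1 - alpha) * softmax(X @ W)
```

Each half-step minimizes its own objective while the other block of variables is held fixed, which is the defining pattern of alternating minimization.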