Improving Data Augmentation for Robust Visual Question Answering with
Effective Curriculum Learning
- URL: http://arxiv.org/abs/2401.15646v1
- Date: Sun, 28 Jan 2024 12:48:16 GMT
- Title: Improving Data Augmentation for Robust Visual Question Answering with
Effective Curriculum Learning
- Authors: Yuhang Zheng, Zhen Wang, Long Chen
- Abstract summary: We design an Effective Curriculum Learning (ECL) strategy to enhance DA-based VQA methods.
ECL first trains VQA models on relatively ``easy'' samples, then gradually shifts to ``harder'' ones, while less-valuable samples are dynamically removed.
Compared to training on the entire augmented dataset, our ECL strategy can further enhance VQA models' performance with fewer training samples.
- Score: 12.647353699551081
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Being widely used in learning unbiased visual question answering (VQA)
models, Data Augmentation (DA) helps mitigate language biases by generating
extra training samples beyond the original samples. While today's DA methods
can generate robust samples, the augmented training set, significantly larger
than the original dataset, often exhibits redundancy in terms of difficulty or
content repetition, leading to inefficient model training and even compromising
the model performance. To this end, we design an Effective Curriculum Learning
strategy (ECL) to enhance DA-based VQA methods. Intuitively, ECL first trains
VQA models on relatively ``easy'' samples, then gradually shifts to ``harder''
samples, while less-valuable samples are dynamically removed. Compared
to training on the entire augmented dataset, our ECL strategy can further
enhance VQA models' performance with fewer training samples. Extensive
ablations have demonstrated the effectiveness of ECL on various methods.
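The easy-to-hard schedule with dynamic sample removal described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `difficulty` function and the "least-valuable" removal criterion (here, the easiest already-seen samples) are hypothetical placeholders for the measures ECL actually defines.

```python
def ecl_curriculum(samples, difficulty, num_stages, drop_frac=0.5):
    """Sketch of an easy-to-hard curriculum with dynamic pruning.

    samples:     list of training samples
    difficulty:  sample -> float, higher means harder (a hypothetical
                 stand-in for the paper's difficulty measure)
    num_stages:  number of curriculum stages
    drop_frac:   fraction of the current stage's samples removed
                 afterwards as "less valuable"
    """
    ordered = sorted(samples, key=difficulty)  # easy samples first
    stages = []
    for stage in range(1, num_stages + 1):
        # Expose a growing easy-to-hard prefix of the remaining pool.
        cutoff = max(1, len(ordered) * stage // num_stages)
        batch = ordered[:cutoff]
        stages.append(batch)
        # Dynamically remove less-valuable samples; dropping the
        # easiest already-mastered ones is an illustrative proxy.
        n_drop = int(len(batch) * drop_frac)
        ordered = ordered[n_drop:]
    return stages
```

Each stage trains on a larger, harder slice of the pool while the pool itself shrinks, so later stages use fewer total samples than training on the full augmented dataset would.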
Related papers
- Probing Perfection: The Relentless Art of Meddling for Pulmonary Airway Segmentation from HRCT via a Human-AI Collaboration Based Active Learning Method [13.384578466263566]
In pulmonary tracheal segmentation, the scarcity of annotated data is a prevalent issue.
Deep Learning (DL) methods face challenges: the opacity of 'black box' models and the need for performance enhancement.
We address these challenges by combining diverse query strategies with various DL models.
arXiv Detail & Related papers (2024-07-03T23:27:53Z) - Take the Bull by the Horns: Hard Sample-Reweighted Continual Training
Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data.
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets.
We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
arXiv Detail & Related papers (2024-02-22T04:10:57Z) - Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning [47.02160072880698]
We introduce a self-evolving mechanism that allows the model itself to actively sample subsets that are equally or even more effective.
The key to our data sampling technique lies in the enhancement of diversity in the chosen subsets.
Extensive experiments across three datasets and benchmarks demonstrate the effectiveness of DiverseEvol.
arXiv Detail & Related papers (2023-11-14T14:10:40Z) - To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We first investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
We then examine the key factors contributing to multi-epoch degradation, finding that dataset size, model parameters, and training objectives all play significant roles.
arXiv Detail & Related papers (2023-05-22T17:02:15Z) - Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for data annotation when an unlabeled sample is believed to incur a high loss.
Our approach outperforms state-of-the-art active learning methods on image classification and semantic segmentation tasks.
arXiv Detail & Related papers (2022-12-20T19:29:37Z) - Towards Robust Visual Question Answering: Making the Most of Biased
Samples via Contrastive Learning [54.61762276179205]
We propose a novel contrastive learning approach, MMBS, for building robust VQA models by Making the Most of Biased Samples.
Specifically, we construct positive samples for contrastive learning by eliminating the information related to spurious correlation from the original training samples.
We validate our contributions by achieving competitive performance on the OOD dataset VQA-CP v2 while preserving robust performance on the ID dataset VQA v2.
arXiv Detail & Related papers (2022-10-10T11:05:21Z) - Core-set Selection Using Metrics-based Explanations (CSUME) for
multiclass ECG [2.0520503083305073]
We show how a selection of good quality data improves deep learning model performance.
Our experimental results show improvements of 9.67% in precision and 8.69% in recall, alongside a significant 50% reduction in training data volume.
arXiv Detail & Related papers (2022-05-28T19:36:28Z) - CCLF: A Contrastive-Curiosity-Driven Learning Framework for
Sample-Efficient Reinforcement Learning [56.20123080771364]
We develop a model-agnostic Contrastive-Curiosity-Driven Learning Framework (CCLF) for reinforcement learning.
CCLF fully exploits sample importance and improves learning efficiency in a self-supervised manner.
We evaluate this approach on the DeepMind Control Suite, Atari, and MiniGrid benchmarks.
arXiv Detail & Related papers (2022-05-02T14:42:05Z) - Learn by Challenging Yourself: Contrastive Visual Representation
Learning with Hard Sample Generation [16.3860181959878]
We propose a framework with two approaches to improve the data efficiency of Contrastive Learning (CL) training.
The first approach generates hard samples for the main model.
The generator is jointly learned with the main model to dynamically customize hard samples.
In joint learning, the hardness of a positive pair is progressively increased by decreasing its similarity.
arXiv Detail & Related papers (2022-02-14T02:41:43Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We further propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.