A Data Augmentation Method by Mixing Up Negative Candidate Answers for
Solving Raven's Progressive Matrices
- URL: http://arxiv.org/abs/2103.05222v1
- Date: Tue, 9 Mar 2021 04:50:32 GMT
- Title: A Data Augmentation Method by Mixing Up Negative Candidate Answers for
Solving Raven's Progressive Matrices
- Authors: Wentao He, Jialu Zhang, Chenglin Yao, Shihe Wang, Jianfeng Ren, Ruibin
Bai
- Abstract summary: Raven's Progressive Matrices (RPMs) are frequently used to test human visual reasoning ability.
Recently developed RPM-like datasets and solution models transfer this kind of problem from cognitive science to computer science.
We propose a data augmentation strategy by image mix-up, which is generalizable to a variety of multiple-choice problems.
- Score: 0.829949723558878
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Raven's Progressive Matrices (RPMs) are frequently used to test human
visual reasoning ability. Recently developed RPM-like datasets and solution
models transfer this kind of problem from cognitive science to computer
science. In view of the poor generalization performance due to insufficient
samples in RPM datasets, we propose a data augmentation strategy by image
mix-up, which is generalizable to a variety of multiple-choice problems,
especially for image-based RPM-like problems. By focusing on potential
functionalities of negative candidate answers, the visual reasoning capability
of the model is enhanced. By applying the proposed data augmentation method, we
achieve significant and consistent improvement on various RPM-like datasets
compared with the state-of-the-art models.
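The abstract does not spell out the mixing procedure, so the following is only a sketch under stated assumptions: each candidate answer is a flat list of pixel intensities, two distinct negative candidates are blended with the standard mix-up convex combination, and the blend replaces one of them while the correct answer and its index stay untouched. The names `mix_images`, `mix_negative_candidates`, and `lam_range` are hypothetical and do not come from the paper.

```python
import random


def mix_images(img_a, img_b, lam):
    """Pixel-wise convex combination: lam * img_a + (1 - lam) * img_b."""
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * a + (1.0 - lam) * b for a, b in zip(img_a, img_b)]


def mix_negative_candidates(candidates, answer_idx, rng, lam_range=(0.3, 0.7)):
    """Return a copy of `candidates` with one negative replaced by a blend.

    Assumption (not taken from the paper): only wrong answers are mixed, so
    the candidate at `answer_idx` is neither used for mixing nor overwritten.
    """
    negatives = [i for i in range(len(candidates)) if i != answer_idx]
    if len(negatives) < 2:
        raise ValueError("need at least two negative candidates to mix")
    # Pick two distinct negatives and a mixing coefficient.
    i, j = rng.sample(negatives, 2)
    lam = rng.uniform(*lam_range)
    augmented = [list(c) for c in candidates]
    augmented[i] = mix_images(candidates[i], candidates[j], lam)
    return augmented


if __name__ == "__main__":
    rng = random.Random(0)
    # Four toy "images" of three pixels each; index 2 is the correct answer.
    candidates = [
        [0.0, 0.0, 0.0],
        [1.0, 1.0, 1.0],
        [0.5, 0.2, 0.9],
        [0.2, 0.4, 0.6],
    ]
    augmented = mix_negative_candidates(candidates, answer_idx=2, rng=rng)
    print("correct answer unchanged:", augmented[2] == candidates[2])
```

Because the label index is unchanged, an augmented problem of this kind could be fed to an existing multiple-choice model without modifying its loss. Whether the paper mixes negatives with each other, with the correct answer, or with other images is not stated in the abstract.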
Related papers
- Surprisingly Fragile: Assessing and Addressing Prompt Instability in Multimodal Foundation Models [1.9001431325800364]
Multimodal foundation models (MFMs) such as OFASys show the potential to unlock analysis of complex data via text prompts alone.
Their performance may suffer in the face of text input that differs even slightly from their training distribution.
This study demonstrates that prompt instability is a major concern for MFMs, leading to a consistent drop in performance across all modalities.
arXiv Detail & Related papers (2024-08-26T19:26:55Z)
- Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models trained on Corrupted Data [56.81246107125692]
Ambient Diffusion Posterior Sampling (A-DPS) is a generative model pre-trained on one type of corruption.
We show that A-DPS can sometimes outperform models trained on clean data for several image restoration tasks in both speed and performance.
We extend the Ambient Diffusion framework to train MRI models with access only to Fourier subsampled multi-coil MRI measurements.
arXiv Detail & Related papers (2024-03-13T17:28:20Z)
- On Calibrating Diffusion Probabilistic Models [78.75538484265292]
Diffusion probabilistic models (DPMs) have achieved promising results in diverse generative tasks.
We propose a simple way for calibrating an arbitrary pretrained DPM, with which the score matching loss can be reduced and the lower bounds of model likelihood can be increased.
Our calibration method is performed only once and the resulting models can be used repeatedly for sampling.
arXiv Detail & Related papers (2023-02-21T14:14:40Z)
- RAGO: Recurrent Graph Optimizer For Multiple Rotation Averaging [62.315673415889314]
This paper proposes a deep recurrent Rotation Averaging Graph (RAGO) for Multiple Rotation Averaging (MRA).
Our framework is a real-time learning-to-optimize rotation averaging graph with a tiny size deployed for real-world applications.
arXiv Detail & Related papers (2022-12-14T13:19:40Z)
- Denoising Diffusion Restoration Models [110.1244240726802]
Denoising Diffusion Restoration Models (DDRM) is an efficient, unsupervised posterior sampling method.
We demonstrate DDRM's versatility on several image datasets for super-resolution, deblurring, inpainting, and colorization.
arXiv Detail & Related papers (2022-01-27T20:19:07Z)
- One-shot Visual Reasoning on RPMs with an Application to Video Frame Prediction [1.0932251830449902]
Raven's Progressive Matrices (RPMs) are frequently used to evaluate human visual reasoning ability.
We propose a One-shot Human-Understandable ReaSoner (Os-HURS) to tackle the challenges of real-world visual recognition and subsequent logical reasoning tasks.
arXiv Detail & Related papers (2021-11-24T06:51:38Z)
- Distributionally Robust Multi-Output Regression Ranking [3.9318191265352196]
We introduce a new listwise learning-to-rank model called Distributionally Robust Multi-output Regression Ranking (DRMRR).
DRMRR uses a Distributionally Robust Optimization framework to minimize a multi-output loss function under the most adverse distributions in the neighborhood of the empirical data distribution.
Our experiments were conducted on two real-world applications, medical document retrieval, and drug response prediction.
arXiv Detail & Related papers (2021-09-27T05:19:27Z)
- On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study [65.17429512679695]
In adversarial data collection (ADC), a human workforce interacts with a model in real time, attempting to produce examples that elicit incorrect predictions.
Despite ADC's intuitive appeal, it remains unclear when training on adversarial datasets produces more robust models.
arXiv Detail & Related papers (2021-06-02T00:48:33Z)
- Multi-Label Contrastive Learning for Abstract Visual Reasoning [0.0]
State-of-the-art systems solving Raven's Progressive Matrices rely on massive pattern-based training and exploiting biases in the dataset.
Humans concentrate on identification of the rules / concepts underlying the RPM (or generally a visual reasoning task) to be solved.
We propose a new sparse rule encoding scheme for RPMs which, besides the new training algorithm, is the key factor contributing to the state-of-the-art performance.
arXiv Detail & Related papers (2020-12-03T14:18:15Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.