Supervising the Transfer of Reasoning Patterns in VQA
- URL: http://arxiv.org/abs/2106.05597v1
- Date: Thu, 10 Jun 2021 08:58:43 GMT
- Title: Supervising the Transfer of Reasoning Patterns in VQA
- Authors: Corentin Kervadec, Christian Wolf, Grigory Antipov, Moez Baccouche and
Madiha Nadri
- Abstract summary: Methods for Visual Question Answering (VQA) are notorious for leveraging dataset biases rather than performing reasoning.
We propose a method for knowledge transfer based on a regularization term in our loss function, supervising the sequence of required reasoning operations.
We also demonstrate the effectiveness of this approach experimentally on the GQA dataset and show its complementarity to BERT-like self-supervised pre-training.
- Score: 9.834885796317971
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Methods for Visual Question Answering (VQA) are notorious for leveraging
dataset biases rather than performing reasoning, hindering generalization. It
has been recently shown that better reasoning patterns emerge in attention
layers of a state-of-the-art VQA model when they are trained on perfect
(oracle) visual inputs. This provides evidence that deep neural networks can
learn to reason when training conditions are favorable enough. However,
transferring this learned knowledge to deployable models is a challenge, as
much of it is lost during the transfer. We propose a method for knowledge
transfer based on a regularization term in our loss function, supervising the
sequence of required reasoning operations. We provide a theoretical analysis
based on PAC-learning, showing that such program prediction can lead to
decreased sample complexity under mild hypotheses. We also demonstrate the
effectiveness of this approach experimentally on the GQA dataset and show its
complementarity to BERT-like self-supervised pre-training.
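The regularization term described above can be sketched as a standard answer-classification loss plus a supervised program-prediction loss averaged over the sequence of reasoning operations. This is a minimal illustrative sketch, not the paper's exact formulation: the helper names and the weighting hyper-parameter `lam` are hypothetical, and a real implementation would operate on logits within a deep-learning framework.

```python
import math

def cross_entropy(probs, target_idx):
    """Negative log-likelihood of the target index under a probability vector."""
    return -math.log(probs[target_idx])

def combined_loss(answer_probs, answer_idx, op_probs_seq, op_seq, lam=0.5):
    """Hypothetical combined objective: the usual VQA answer loss plus a
    regularization term supervising the predicted sequence of reasoning
    operations (the 'program'), averaged over the sequence length."""
    vqa_loss = cross_entropy(answer_probs, answer_idx)
    program_loss = sum(
        cross_entropy(p, op) for p, op in zip(op_probs_seq, op_seq)
    ) / len(op_seq)
    return vqa_loss + lam * program_loss
```

Setting `lam` to zero recovers the plain answer-classification objective; the GQA dataset's functional program annotations would supply the per-step operation targets `op_seq`.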
Related papers
- Characterizing out-of-distribution generalization of neural networks: application to the disordered Su-Schrieffer-Heeger model [38.79241114146971]
We show how interpretability methods can increase trust in predictions of a neural network trained to classify quantum phases.
In particular, we show that we can ensure better out-of-distribution generalization in the complex classification problem.
This work is an example of how the systematic use of interpretability methods can improve the performance of NNs in scientific problems.
arXiv Detail & Related papers (2024-06-14T13:24:32Z)
- Explaining Explainability: Towards Deeper Actionable Insights into Deep Learning through Second-order Explainability [70.60433013657693]
Second-order explainable AI (SOXAI) was recently proposed to extend explainable AI (XAI) from the instance level to the dataset level.
We demonstrate for the first time, via example classification and segmentation cases, that eliminating irrelevant concepts from the training set based on actionable insights from SOXAI can enhance a model's performance.
arXiv Detail & Related papers (2023-06-14T23:24:01Z)
- CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z)
- How Transferable are Reasoning Patterns in VQA? [10.439369423744708]
We argue that uncertainty in vision is a dominating factor preventing the successful learning of reasoning in vision and language problems.
We train a visual oracle and in a large scale study provide experimental evidence that it is much less prone to exploiting spurious dataset biases.
We exploit these insights by transferring reasoning patterns from the oracle to a SOTA Transformer-based VQA model taking standard noisy visual inputs via fine-tuning.
arXiv Detail & Related papers (2021-04-08T10:18:45Z)
- Explain by Evidence: An Explainable Memory-based Neural Network for Question Answering [41.73026155036886]
This paper proposes an explainable, evidence-based memory network architecture.
It learns to summarize the dataset and extract supporting evidences to make its decision.
Our model achieves state-of-the-art performance on two popular question answering datasets.
arXiv Detail & Related papers (2020-11-05T21:18:21Z)
- Loss re-scaling VQA: Revisiting the Language Prior Problem from a Class-imbalance View [129.392671317356]
We propose to interpret the language prior problem in VQA from a class-imbalance view.
It explicitly reveals why the VQA model tends to produce a frequent yet obviously wrong answer.
We also justify the validity of the class imbalance interpretation scheme on other computer vision tasks, such as face recognition and image classification.
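The class-imbalance view suggests re-scaling the loss by answer frequency so that rare answers are not drowned out by frequent ones. The following is only a generic illustration of that idea under an inverse-frequency weighting assumption; it is not the paper's exact re-scaling scheme.

```python
from collections import Counter

def answer_frequency_weights(train_answers):
    """Hypothetical inverse-frequency loss weights: rarer answers receive
    larger weights, counteracting the head of the answer distribution.
    Weights are normalized so a uniform distribution yields weight 1.0."""
    counts = Counter(train_answers)
    total = len(train_answers)
    return {a: total / (len(counts) * c) for a, c in counts.items()}
```

Each training example's loss would then be multiplied by the weight of its ground-truth answer, discouraging the model from defaulting to frequent yet wrong answers.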
arXiv Detail & Related papers (2020-10-30T00:57:17Z)
- MUTANT: A Training Paradigm for Out-of-Distribution Generalization in Visual Question Answering [58.30291671877342]
We present MUTANT, a training paradigm that exposes the model to perceptually similar, yet semantically distinct mutations of the input.
MUTANT establishes a new state-of-the-art accuracy on VQA-CP with a 10.57% improvement.
arXiv Detail & Related papers (2020-09-18T00:22:54Z)
- DeVLBert: Learning Deconfounded Visio-Linguistic Representations [111.93480424791613]
We investigate the problem of out-of-domain visio-linguistic pretraining.
Existing methods for this problem are purely likelihood-based.
We propose a Deconfounded Visio-Linguistic Bert framework, abbreviated as DeVLBert, to perform intervention-based learning.
arXiv Detail & Related papers (2020-08-16T11:09:22Z)
- DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction [96.90215318875859]
We show that bootstrapping-based Q-learning algorithms do not necessarily benefit from corrective feedback.
We propose a new algorithm, DisCor, which computes an approximation to this optimal distribution and uses it to re-weight the transitions used for training.
arXiv Detail & Related papers (2020-03-16T16:18:52Z)
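The DisCor idea of re-weighting training transitions can be illustrated with a simple softmax over negated error estimates: transitions whose bootstrap targets are estimated to be less accurate receive less weight. This is a rough sketch under that assumption, not DisCor's actual approximation of the optimal distribution.

```python
import math

def transition_weights(error_estimates, temperature=1.0):
    """Hypothetical DisCor-style re-weighting: down-weight transitions whose
    target-value error estimate is large, so training focuses on transitions
    with more trustworthy bootstrap targets. Returns normalized weights."""
    raw = [math.exp(-e / temperature) for e in error_estimates]
    z = sum(raw)
    return [r / z for r in raw]
```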
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.