A Competence-aware Curriculum for Visual Concepts Learning via Question Answering
- URL: http://arxiv.org/abs/2007.01499v2
- Date: Mon, 27 Jul 2020 21:57:39 GMT
- Title: A Competence-aware Curriculum for Visual Concepts Learning via Question Answering
- Authors: Qing Li, Siyuan Huang, Yining Hong, Song-Chun Zhu
- Abstract summary: We propose a competence-aware curriculum for visual concept learning in a question-answering manner.
We design a neural-symbolic concept learner for learning the visual concepts and a multi-dimensional Item Response Theory (mIRT) model for guiding the learning process.
Experimental results on CLEVR show that with a competence-aware curriculum, the proposed method achieves state-of-the-art performance.
- Score: 95.35905804211698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans can progressively learn visual concepts from easy to hard questions.
To mimic this efficient learning ability, we propose a competence-aware
curriculum for visual concept learning in a question-answering manner.
Specifically, we design a neural-symbolic concept learner for learning the
visual concepts and a multi-dimensional Item Response Theory (mIRT) model for
guiding the learning process with an adaptive curriculum. The mIRT effectively
estimates the concept difficulty and the model competence at each learning step
from accumulated model responses. The estimated concept difficulty and model
competence are further utilized to select the most profitable training samples.
Experimental results on CLEVR show that with a competence-aware curriculum, the
proposed method achieves state-of-the-art performance with superior data
efficiency and convergence speed. Specifically, the proposed model uses only
40% of the training data and converges three times faster than other
state-of-the-art methods.
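To make the selection mechanism concrete, here is a minimal Python sketch of an IRT-driven curriculum. It is illustrative only, not the authors' code: it collapses the multi-dimensional IRT to a single competence scalar, fits the parameters by plain gradient ascent on the response log-likelihood, and reads "most profitable" as "predicted success probability near 0.5"; all names are invented.

    import numpy as np

    def p_correct(competence, difficulty, discrimination=1.0):
        # Two-parameter IRT: probability of a correct response to an item.
        return 1.0 / (1.0 + np.exp(-discrimination * (competence - difficulty)))

    def fit_irt(responses, n_concepts, lr=0.1, steps=200):
        # Fit one competence scalar and one difficulty per concept by
        # gradient ascent on the Bernoulli log-likelihood of the
        # accumulated (concept_id, correct) responses.
        competence, difficulty = 0.0, np.zeros(n_concepts)
        for _ in range(steps):
            for concept, correct in responses:
                p = p_correct(competence, difficulty[concept])
                competence += lr * (correct - p)
                difficulty[concept] -= lr * (correct - p)
        return competence, difficulty

    def select_batch(concepts, competence, difficulty, k=32):
        # "Most profitable" here means predicted success probability near
        # 0.5: neither trivially easy nor far beyond current competence.
        margin = [abs(p_correct(competence, difficulty[c]) - 0.5) for c in concepts]
        return [concepts[i] for i in np.argsort(margin)[:k]]

    # Usage: refit after each learning step, then draw the next batch.
    history = [(0, 1), (0, 1), (1, 0), (2, 0), (1, 1)]
    theta, b = fit_irt(history, n_concepts=3)
    print(select_batch([0, 1, 2], theta, b, k=2))

In a real training loop the mIRT estimates would be refit (or incrementally updated) after each step, with the neural-symbolic concept learner trained on the selected samples.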
Related papers
- Restyling Unsupervised Concept Based Interpretable Networks with Generative Models [14.604305230535026]
We propose a novel method that relies on mapping the concept features to the latent space of a pretrained generative model.
We quantitatively ascertain the efficacy of our method in terms of the accuracy of the interpretable prediction network, the fidelity of reconstruction, and the faithfulness and consistency of the learnt concepts.
arXiv Detail & Related papers (2024-07-01T14:39:41Z)
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension [99.9389737339175]
We introduce Self-Training on Image Comprehension (STIC), which emphasizes a self-training approach specifically for image comprehension.
First, the model self-constructs a preference dataset for image descriptions using unlabeled images.
To further self-improve reasoning on the extracted visual information, we let the model reuse a small portion of existing instruction-tuning data.
arXiv Detail & Related papers (2024-05-30T05:53:49Z)
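As a rough illustration of the self-training recipe in the STIC entry above, the following sketch builds preference pairs by captioning each unlabeled image twice: once normally (preferred) and once from a corrupted input (dispreferred). The describe callable is a hypothetical stand-in for the vision-language model; the paper's actual prompting and corruption schemes differ.

    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class PreferencePair:
        image_id: str
        preferred: str     # caption generated from the clean image
        dispreferred: str  # caption generated from a corrupted input

    def build_preference_data(
        image_ids: List[str],
        describe: Callable[[str, bool], str],  # hypothetical model wrapper
    ) -> List[PreferencePair]:
        # Self-construct preference pairs from unlabeled images: the clean
        # caption is preferred over the one produced from a corrupted input.
        pairs = []
        for img in image_ids:
            good = describe(img, False)  # well-posed prompt, clean image
            bad = describe(img, True)    # corrupted image or bad prompt
            pairs.append(PreferencePair(img, good, bad))
        return pairs

    # Usage with a stub model, just to show the data flow:
    stub = lambda img, corrupt: ("noisy" if corrupt else "clean") + " caption for " + img
    print(build_preference_data(["img_001", "img_002"], stub)[0])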
- Curriculum Learning for Graph Neural Networks: A Multiview Competence-based Approach [12.335698325757491]
We propose a new perspective on curriculum learning by introducing a novel approach that builds on graph complexity formalisms.
The proposed solution advances existing research in curriculum learning for graph neural networks by incorporating a fine-grained spectrum of graph difficulty criteria.
arXiv Detail & Related papers (2023-07-17T21:33:35Z)
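A toy sketch of competence-based ordering over graphs, assuming networkx is available. The difficulty function mixes a few generic complexity signals with arbitrary weights; the paper's multiview difficulty criteria are substantially richer, so treat this only as the shape of the idea.

    import networkx as nx

    def graph_difficulty(G: nx.Graph) -> float:
        # Mix a few generic complexity signals into one scalar; the weights
        # are arbitrary and purely illustrative.
        size = G.number_of_nodes()
        density = nx.density(G)
        clustering = nx.average_clustering(G)
        return 0.5 * size / 100.0 + 0.3 * density + 0.2 * clustering

    def curriculum(graphs, competence: float):
        # Keep only graphs no harder than the current competence, easiest first.
        scored = sorted(graphs, key=graph_difficulty)
        return [G for G in scored if graph_difficulty(G) <= competence]

    # Usage: raise the competence threshold as validation accuracy improves.
    train_graphs = [nx.erdos_renyi_graph(n, 0.2) for n in (10, 30, 80)]
    easy_first = curriculum(train_graphs, competence=0.4)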
- See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning [60.43585179885355]
We propose a novel framework named Interactive Prompting Visual Reasoner (IPVR) for few-shot knowledge-based visual reasoning.
IPVR consists of three stages: see, think, and confirm.
We conduct experiments on a range of knowledge-based visual reasoning datasets.
arXiv Detail & Related papers (2023-01-12T18:59:50Z)
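The control flow of a see/think/confirm loop can be sketched as below. All three stage functions are stubs standing in for a vision backbone and a language model; only the interactive retry structure reflects the description above.

    def see(image, question):
        # Vision stage (stub): extract question-relevant visual facts.
        return ["a red cube left of a sphere"]

    def think(question, facts):
        # Reasoning stage (stub): draft an answer plus a supporting rationale.
        return "sphere", "the object right of the red cube is a sphere"

    def confirm(image, rationale):
        # Verification stage (stub): re-ground the rationale in the image.
        return True

    def answer(image, question, max_rounds=3):
        facts = see(image, question)
        for _ in range(max_rounds):
            candidate, rationale = think(question, facts)
            if confirm(image, rationale):
                return candidate
            facts = facts + see(image, question)  # gather more evidence, retry
        return candidate  # fall back to the last draft

    print(answer("image.png", "What is right of the red cube?"))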
- Collaboration of Pre-trained Models Makes Better Few-shot Learner [49.89134194181042]
Few-shot classification requires deep neural networks to learn generalized representations only from limited training images.
Recently, CLIP-based methods have shown promising few-shot performance, benefiting from contrastive language-image pre-training.
We propose CoMo, a Collaboration of pre-trained Models that incorporates diverse prior knowledge from various pre-training paradigms for better few-shot learning.
arXiv Detail & Related papers (2022-09-25T16:23:12Z)
- Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and the associative mechanism of the brain.
It tackles the problem from two aspects: extracting knowledge and memorizing knowledge.
We show theoretically that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z)
- RLTutor: Reinforcement Learning Based Adaptive Tutoring System by Modeling Virtual Student with Fewer Interactions [10.34673089426247]
We propose a framework for optimizing teaching strategies by constructing a virtual model of the student.
Our results can serve as a buffer between theoretical instructional optimization and practical applications in e-learning systems.
arXiv Detail & Related papers (2021-07-31T15:42:03Z)
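A toy rendering of the idea of optimizing a teaching policy against a simulated learner. The VirtualStudent dynamics and the greedy policy here are invented for illustration; the paper fits its virtual student from logged interactions and learns the policy with reinforcement learning rather than this hand-coded heuristic.

    import random

    class VirtualStudent:
        # Invented dynamics: practicing a skill raises its mastery with
        # diminishing returns. A real virtual student would be fit to logs.
        def __init__(self, n_skills=5):
            self.mastery = [0.1] * n_skills

        def teach(self, skill):
            self.mastery[skill] += 0.3 * (1.0 - self.mastery[skill])

    def tutor(student, steps=20, eps=0.1):
        # Hand-coded stand-in for a learned policy: usually target the
        # weakest skill, occasionally explore at random.
        for _ in range(steps):
            if random.random() < eps:
                skill = random.randrange(len(student.mastery))
            else:
                skill = min(range(len(student.mastery)),
                            key=lambda s: student.mastery[s])
            student.teach(skill)
        return sum(student.mastery) / len(student.mastery)

    print(tutor(VirtualStudent()))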
- Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)