Multi-Granularity Modularized Network for Abstract Visual Reasoning
- URL: http://arxiv.org/abs/2007.04670v2
- Date: Fri, 10 Jul 2020 02:32:25 GMT
- Title: Multi-Granularity Modularized Network for Abstract Visual Reasoning
- Authors: Xiangru Tang, Haoyuan Wang, Xiang Pan, Jiyang Qi
- Abstract summary: We focus on the Raven Progressive Matrices Test, designed to measure cognitive reasoning.
Inspired by cognitive studies, we propose a Multi-Granularity Modularized Network (MMoN) to bridge the gap between the processing of raw sensory information and symbolic reasoning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abstract visual reasoning connects mental abilities to the physical
world and is a crucial factor in cognitive development. Most toddlers show
sensitivity to this skill, yet it remains difficult for machines. To address
this, we focus on the Raven Progressive Matrices Test, which is designed to
measure cognitive reasoning. Recent work solves it end-to-end with black-box
models, but these are complicated and hard to explain. Inspired by cognitive
studies, we propose a Multi-Granularity Modularized Network (MMoN) to bridge
the gap between the processing of raw sensory information and symbolic
reasoning. Specifically, it learns modularized reasoning functions to model
semantic rules from visual groundings in a neuro-symbolic, semi-supervised
way. To evaluate MMoN comprehensively, we conduct experiments on datasets with
both seen and unseen reasoning rules. The results show that MMoN is well
suited to abstract visual reasoning and remains explainable on the
generalization test.
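The gap the abstract describes between raw sensory processing and symbolic reasoning can be illustrated with a toy Raven-style matrix. The sketch below is not the MMoN architecture; the single "count" attribute, the arithmetic-progression rule, and all helper names are invented for illustration. It shows how, once panels are grounded into symbolic attributes, a rule can be inferred from the complete rows and applied to pick the missing panel.

```python
# Toy Raven-style matrix: each panel is reduced to one symbolic attribute
# (an object count). Illustrative only; not the authors' method.

def infer_step(row):
    """Infer the constant additive step of an attribute across a row."""
    steps = {row[i + 1] - row[i] for i in range(len(row) - 1)}
    return steps.pop() if len(steps) == 1 else None

def solve(matrix, candidates):
    """matrix: 3 rows of counts; the last row is missing its final panel.
    Returns the index of the candidate completing the inferred rule."""
    # Learn the rule (a shared additive step) from the two complete rows.
    steps = {infer_step(row) for row in matrix[:2]}
    if len(steps) != 1 or None in steps:
        return None  # no single consistent arithmetic-progression rule
    step = steps.pop()
    target = matrix[2][-1] + step  # extrapolate the missing panel
    return candidates.index(target) if target in candidates else None

# Rows follow "count increases by 2": (1,3,5), (2,4,6), (3,5,?)
matrix = [[1, 3, 5], [2, 4, 6], [3, 5]]
print(solve(matrix, [4, 6, 7, 9]))  # prints 2: the candidate with count 7
```

A neuro-symbolic system like the one the abstract describes would replace the hand-written attribute vectors with learned visual groundings, while keeping the reasoning step modular and inspectable.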
Related papers
- OC-NMN: Object-centric Compositional Neural Module Network for
Generative Visual Analogical Reasoning [49.12350554270196]
We show how modularity can be leveraged to derive a compositional data augmentation framework inspired by imagination.
Our method, denoted Object-centric Compositional Neural Module Network (OC-NMN), decomposes visual generative reasoning tasks into a series of primitives applied to objects without using a domain-specific language.
arXiv Detail & Related papers (2023-10-28T20:12:58Z)
- A Cognitively-Inspired Neural Architecture for Visual Abstract Reasoning
Using Contrastive Perceptual and Conceptual Processing [14.201935774784632]
We introduce a new neural architecture for solving visual abstract reasoning tasks inspired by human cognition.
Inspired by this principle, our architecture models visual abstract reasoning as an iterative, self-contrasting learning process.
Experiments on the machine learning dataset RAVEN show that CPCNet achieves higher accuracy than all previously published models.
arXiv Detail & Related papers (2023-09-19T11:18:01Z)
- Minding Language Models' (Lack of) Theory of Mind: A Plug-and-Play
Multi-Character Belief Tracker [72.09076317574238]
ToM is a plug-and-play approach to investigate the belief states of characters in reading comprehension.
We show that ToM enhances the theory of mind of off-the-shelf neural models in a zero-shot setting, while showing robust out-of-distribution performance compared to supervised baselines.
arXiv Detail & Related papers (2023-06-01T17:24:35Z)
- Deep Non-Monotonic Reasoning for Visual Abstract Reasoning Tasks [3.486683381782259]
This paper proposes a non-monotonic computational approach to solve visual abstract reasoning tasks.
We implement a deep learning model using this approach and test it on the RAVEN dataset, a dataset inspired by Raven's Progressive Matrices.
arXiv Detail & Related papers (2023-02-08T16:35:05Z)
- Visual Superordinate Abstraction for Robust Concept Learning [80.15940996821541]
Concept learning constructs visual representations that are connected to linguistic semantics.
We ascribe the bottleneck to a failure of exploring the intrinsic semantic hierarchy of visual concepts.
We propose a visual superordinate abstraction framework for explicitly modeling semantic-aware visual subspaces.
arXiv Detail & Related papers (2022-05-28T14:27:38Z)
- Understanding the computational demands underlying visual reasoning [10.308647202215708]
We systematically assess the ability of modern deep convolutional neural networks to learn to solve visual reasoning problems.
Our analysis leads to a novel taxonomy of visual reasoning tasks, which can be primarily explained by the type of relations and the number of relations used to compose the underlying rules.
arXiv Detail & Related papers (2021-08-08T10:46:53Z)
- Expressive Explanations of DNNs by Combining Concept Analysis with ILP [0.3867363075280543]
We use inherent features learned by the network to build a global, expressive, verbal explanation of the rationale of a feed-forward convolutional deep neural network (DNN).
We show that our explanation is faithful to the original black-box model.
arXiv Detail & Related papers (2021-05-16T07:00:27Z)
- Object-Centric Diagnosis of Visual Reasoning [118.36750454795428]
This paper presents a systematic object-centric diagnosis of visual reasoning in terms of grounding and robustness.
We develop a diagnostic model, namely Graph Reasoning Machine.
Our model replaces the purely symbolic visual representation with a probabilistic scene graph and then applies teacher-forcing training to the visual reasoning module.
arXiv Detail & Related papers (2020-12-21T18:59:28Z)
- Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning" [49.76230210108583]
We propose a framework to isolate and evaluate the reasoning aspect of visual question answering (VQA) separately from its perception.
We also propose a novel top-down calibration technique that allows the model to answer reasoning questions even with imperfect perception.
On the challenging GQA dataset, this framework is used to perform in-depth, disentangled comparisons between well-known VQA models.
arXiv Detail & Related papers (2020-06-20T08:48:29Z)
- Machine Number Sense: A Dataset of Visual Arithmetic Problems for
Abstract and Relational Reasoning [95.18337034090648]
We propose a dataset, Machine Number Sense (MNS), consisting of visual arithmetic problems automatically generated using a grammar model, the And-Or Graph (AOG).
These visual arithmetic problems are in the form of geometric figures.
We benchmark the MNS dataset using four predominant neural network models as baselines in this visual reasoning task.
arXiv Detail & Related papers (2020-04-25T17:14:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.