Learning Differentiable Logic Programs for Abstract Visual Reasoning
- URL: http://arxiv.org/abs/2307.00928v1
- Date: Mon, 3 Jul 2023 11:02:40 GMT
- Title: Learning Differentiable Logic Programs for Abstract Visual Reasoning
- Authors: Hikaru Shindo, Viktor Pfanschilling, Devendra Singh Dhami, Kristian
Kersting
- Abstract summary: Differentiable forward reasoning has been developed to integrate reasoning with gradient-based machine learning paradigms.
NEUMANN is a graph-based differentiable forward reasoner, passing messages in a memory-efficient manner and handling structured programs with functors.
We demonstrate that NEUMANN solves visual reasoning tasks efficiently, outperforming neural, symbolic, and neuro-symbolic baselines.
- Score: 18.82429807065658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual reasoning is essential for building intelligent agents that understand
the world and perform problem-solving beyond perception. Differentiable forward
reasoning has been developed to integrate reasoning with gradient-based machine
learning paradigms. However, due to the memory intensity, most existing
approaches do not bring the best of the expressivity of first-order logic,
excluding a crucial ability to solve abstract visual reasoning, where agents
need to perform reasoning by using analogies on abstract concepts in different
scenarios. To overcome this problem, we propose NEUro-symbolic Message-pAssiNg
reasoNer (NEUMANN), which is a graph-based differentiable forward reasoner,
passing messages in a memory-efficient manner and handling structured programs
with functors. Moreover, we propose a computationally-efficient structure
learning algorithm to perform explanatory program induction on complex visual
scenes. To evaluate, in addition to conventional visual reasoning tasks, we
propose a new task, visual reasoning behind-the-scenes, where agents need to
learn abstract programs and then answer queries by imagining scenes that are
not observed. We empirically demonstrate that NEUMANN solves visual reasoning
tasks efficiently, outperforming neural, symbolic, and neuro-symbolic
baselines.
Related papers
- VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning [86.59849798539312]
We present Neuro-Symbolic Predicates, a first-order abstraction language that combines the strengths of symbolic and neural knowledge representations.
We show that our approach offers better sample complexity, stronger out-of-distribution generalization, and improved interpretability.
arXiv Detail & Related papers (2024-10-30T16:11:05Z) - Cantor: Inspiring Multimodal Chain-of-Thought of MLLM [83.6663322930814]
We argue that converging visual context acquisition and logical reasoning is pivotal for tackling visual reasoning tasks.
We propose an innovative multimodal CoT framework, termed Cantor, characterized by a perception-decision architecture.
Our experiments demonstrate the efficacy of the proposed framework, showing significant improvements in multimodal CoT performance.
arXiv Detail & Related papers (2024-04-24T17:59:48Z) - LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and
Reasoning [73.98142349171552]
LOGICSEG is a holistic visual semantic that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge.
During fuzzy logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, hence enabling logic-induced network training.
These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models.
arXiv Detail & Related papers (2023-09-24T05:43:19Z) - Visual Chain of Thought: Bridging Logical Gaps with Multimodal
Infillings [61.04460792203266]
We introduce VCoT, a novel method that leverages chain-of-thought prompting with vision-language grounding to bridge the logical gaps within sequential data.
Our method uses visual guidance to generate synthetic multimodal infillings that add consistent and novel information to reduce the logical gaps for downstream tasks.
arXiv Detail & Related papers (2023-05-03T17:58:29Z) - Learning Iterative Reasoning through Energy Minimization [77.33859525900334]
We present a new framework for iterative reasoning with neural networks.
We train a neural network to parameterize an energy landscape over all outputs.
We implement each step of the iterative reasoning as an energy minimization step to find a minimal energy solution.
arXiv Detail & Related papers (2022-06-30T17:44:20Z) - GAMR: A Guided Attention Model for (visual) Reasoning [7.919213739992465]
Humans continue to outperform modern AI systems in their ability to flexibly parse and understand complex visual scenes.
We present a novel module for visual reasoning, the Guided Attention Model for (visual) Reasoning (GAMR)
GAMR posits that the brain solves complex visual reasoning problems dynamically via sequences of attention shifts to select and route task-relevant visual information into memory.
arXiv Detail & Related papers (2022-06-10T07:52:06Z) - Joint Abductive and Inductive Neural Logical Reasoning [44.36651614420507]
We formulate the problem of the joint abductive and inductive neural logical reasoning (AI-NLR)
First, we incorporate description logic-based ontological axioms to provide the source of concepts.
Then, we represent concepts and queries as fuzzy sets, i.e., sets whose elements have degrees of membership, to bridge concepts and queries with entities.
arXiv Detail & Related papers (2022-05-29T07:41:50Z) - Understanding the computational demands underlying visual reasoning [10.308647202215708]
We systematically assess the ability of modern deep convolutional neural networks to learn to solve visual reasoning problems.
Our analysis leads to a novel taxonomy of visual reasoning tasks, which can be primarily explained by the type of relations and the number of relations used to compose the underlying rules.
arXiv Detail & Related papers (2021-08-08T10:46:53Z) - Machine Number Sense: A Dataset of Visual Arithmetic Problems for
Abstract and Relational Reasoning [95.18337034090648]
We propose a dataset, Machine Number Sense (MNS), consisting of visual arithmetic problems automatically generated using a grammar model--And-Or Graph (AOG)
These visual arithmetic problems are in the form of geometric figures.
We benchmark the MNS dataset using four predominant neural network models as baselines in this visual reasoning task.
arXiv Detail & Related papers (2020-04-25T17:14:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.