Related papers: Learning Differentiable Logic Programs for Abstract Visual Reasoning

URL: http://arxiv.org/abs/2307.00928v1
Date: Mon, 3 Jul 2023 11:02:40 GMT
Title: Learning Differentiable Logic Programs for Abstract Visual Reasoning
Authors: Hikaru Shindo, Viktor Pfanschilling, Devendra Singh Dhami, Kristian Kersting
Abstract summary: Differentiable forward reasoning has been developed to integrate reasoning with gradient-based machine learning paradigms. NEUMANN is a graph-based differentiable forward reasoner, passing messages in a memory-efficient manner and handling structured programs with functors. We demonstrate that NEUMANN solves visual reasoning tasks efficiently, outperforming neural, symbolic, and neuro-symbolic baselines.
Score: 18.82429807065658
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Visual reasoning is essential for building intelligent agents that understand the world and perform problem-solving beyond perception. Differentiable forward reasoning has been developed to integrate reasoning with gradient-based machine learning paradigms. However, due to the memory intensity, most existing approaches do not bring the best of the expressivity of first-order logic, excluding a crucial ability to solve abstract visual reasoning, where agents need to perform reasoning by using analogies on abstract concepts in different scenarios. To overcome this problem, we propose NEUro-symbolic Message-pAssiNg reasoNer (NEUMANN), which is a graph-based differentiable forward reasoner, passing messages in a memory-efficient manner and handling structured programs with functors. Moreover, we propose a computationally-efficient structure learning algorithm to perform explanatory program induction on complex visual scenes. To evaluate, in addition to conventional visual reasoning tasks, we propose a new task, visual reasoning behind-the-scenes, where agents need to learn abstract programs and then answer queries by imagining scenes that are not observed. We empirically demonstrate that NEUMANN solves visual reasoning tasks efficiently, outperforming neural, symbolic, and neuro-symbolic baselines.

Related papers

Towards Unified Neurosymbolic Reasoning on Knowledge Graphs [37.22138524925735]
Knowledge Graph (KG) reasoning has received significant attention in the fields of artificial intelligence and knowledge engineering. We propose a unified neurosymbolic reasoning framework, namely Tunsr, for KG reasoning.
arXiv Detail & Related papers (2025-07-04T16:29:45Z)
Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing [62.447497430479174]
Drawing to reason in space is a novel paradigm that enables LVLMs to reason through elementary drawing operations in the visual space. Our model, named VILASR, consistently outperforms existing methods across diverse spatial reasoning benchmarks.
arXiv Detail & Related papers (2025-06-11T17:41:50Z)
Visualizing Thought: Conceptual Diagrams Enable Robust Planning in LMMs [57.66267515456075]
Large Language Models (LLMs) and Large Multimodal Models (LMMs) predominantly reason through textual representations. We propose a zero-shot fully automatic framework that enables LMMs to reason through multiple chains of self-generated conceptual diagrams.
arXiv Detail & Related papers (2025-03-14T18:27:02Z)
A Cognitive Paradigm Approach to Probe the Perception-Reasoning Interface in VLMs [3.2228025627337864]
This paper introduces a structured evaluation framework using Bongard Problems (BPs) to dissect the perception-reasoning interface in Vision-Language Models (VLMs). We propose three distinct evaluation paradigms, mirroring human problem-solving strategies. Our framework provides a valuable diagnostic tool, highlighting the need to enhance visual processing fidelity for achieving more robust and human-like visual intelligence in AI.
arXiv Detail & Related papers (2025-01-23T12:42:42Z)
Abductive Symbolic Solver on Abstraction and Reasoning Corpus [5.903948032748941]
Humans solve visual reasoning tasks based on their observations and hypotheses, and they can explain their solutions with a proper reason. Previous approaches focused only on the grid transition and it is not enough for AI to provide reasonable and human-like solutions. We propose a novel framework that symbolically represents the observed data into a knowledge graph and extracts core knowledge that can be used for solution generation.
arXiv Detail & Related papers (2024-11-27T09:09:00Z)
VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning [86.59849798539312]
We present Neuro-Symbolic Predicates, a first-order abstraction language that combines the strengths of symbolic and neural knowledge representations. We show that our approach offers better sample complexity, stronger out-of-distribution generalization, and improved interpretability.
arXiv Detail & Related papers (2024-10-30T16:11:05Z)
Cantor: Inspiring Multimodal Chain-of-Thought of MLLM [83.6663322930814]
We argue that converging visual context acquisition and logical reasoning is pivotal for tackling visual reasoning tasks. We propose an innovative multimodal CoT framework, termed Cantor, characterized by a perception-decision architecture. Our experiments demonstrate the efficacy of the proposed framework, showing significant improvements in multimodal CoT performance.
arXiv Detail & Related papers (2024-04-24T17:59:48Z)
LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning [73.98142349171552]
LOGICSEG is a holistic visual semantic parser that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge. During fuzzy logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, hence enabling logic-induced network training. These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models.
arXiv Detail & Related papers (2023-09-24T05:43:19Z)
Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings [61.04460792203266]
We introduce VCoT, a novel method that leverages chain-of-thought prompting with vision-language grounding to bridge the logical gaps within sequential data. Our method uses visual guidance to generate synthetic multimodal infillings that add consistent and novel information to reduce the logical gaps for downstream tasks.
arXiv Detail & Related papers (2023-05-03T17:58:29Z)
Learning Iterative Reasoning through Energy Minimization [77.33859525900334]
We present a new framework for iterative reasoning with neural networks. We train a neural network to parameterize an energy landscape over all outputs. We implement each step of the iterative reasoning as an energy minimization step to find a minimal energy solution.
arXiv Detail & Related papers (2022-06-30T17:44:20Z)
GAMR: A Guided Attention Model for (visual) Reasoning [7.919213739992465]
Humans continue to outperform modern AI systems in their ability to flexibly parse and understand complex visual scenes. We present a novel module for visual reasoning, the Guided Attention Model for (visual) Reasoning (GAMR). GAMR posits that the brain solves complex visual reasoning problems dynamically via sequences of attention shifts to select and route task-relevant visual information into memory.
arXiv Detail & Related papers (2022-06-10T07:52:06Z)
Joint Abductive and Inductive Neural Logical Reasoning [44.36651614420507]
We formulate the problem of the joint abductive and inductive neural logical reasoning (AI-NLR). First, we incorporate description logic-based ontological axioms to provide the source of concepts. Then, we represent concepts and queries as fuzzy sets, i.e., sets whose elements have degrees of membership, to bridge concepts and queries with entities.
arXiv Detail & Related papers (2022-05-29T07:41:50Z)
Understanding the computational demands underlying visual reasoning [10.308647202215708]
We systematically assess the ability of modern deep convolutional neural networks to learn to solve visual reasoning problems. Our analysis leads to a novel taxonomy of visual reasoning tasks, which can be primarily explained by the type of relations and the number of relations used to compose the underlying rules.
arXiv Detail & Related papers (2021-08-08T10:46:53Z)
Machine Number Sense: A Dataset of Visual Arithmetic Problems for Abstract and Relational Reasoning [95.18337034090648]
We propose a dataset, Machine Number Sense (MNS), consisting of visual arithmetic problems automatically generated using a grammar model, And-Or Graph (AOG). These visual arithmetic problems are in the form of geometric figures. We benchmark the MNS dataset using four predominant neural network models as baselines in this visual reasoning task.
arXiv Detail & Related papers (2020-04-25T17:14:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.