Related papers: V-LoL: A Diagnostic Dataset for Visual Logical Learning

URL: http://arxiv.org/abs/2306.07743v3
Date: Wed, 13 Nov 2024 12:43:33 GMT
Title: V-LoL: A Diagnostic Dataset for Visual Logical Learning
Authors: Lukas Helff, Wolfgang Stammer, Hikaru Shindo, Devendra Singh Dhami, Kristian Kersting
Abstract summary: We propose the diagnostic visual logical learning dataset, V-LoL, that seamlessly combines visual and logical challenges. V-LoL-Train provides a platform for investigating a wide range of visual logical learning challenges. We evaluate a variety of AI systems including traditional symbolic AI, neural AI, as well as neuro-symbolic AI.
Score: 22.971426186079235
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite the successes of recent developments in visual AI, different shortcomings still exist, from missing exact logical reasoning, to abstract generalization abilities, to understanding complex and noisy scenes. Unfortunately, existing benchmarks were not designed to capture more than a few of these aspects. Whereas deep learning datasets focus on visually complex data but simple visual reasoning tasks, inductive logic datasets involve complex logical learning tasks but lack the visual component. To address this, we propose the diagnostic visual logical learning dataset, V-LoL, that seamlessly combines visual and logical challenges. Notably, we introduce the first instantiation of V-LoL, V-LoL-Train, a visual rendition of a classic benchmark in symbolic AI, the Michalski train problem. By incorporating intricate visual scenes and flexible logical reasoning tasks within a versatile framework, V-LoL-Train provides a platform for investigating a wide range of visual logical learning challenges. We evaluate a variety of AI systems including traditional symbolic AI, neural AI, as well as neuro-symbolic AI. Our evaluations demonstrate that even SOTA AI faces difficulties in dealing with visual logical learning challenges, highlighting unique advantages and limitations of each methodology. Overall, V-LoL opens up new avenues for understanding and enhancing current abilities in visual logical learning for AI systems.

Related papers

VITAL: More Understandable Feature Visualization through Distribution Alignment and Relevant Information Flow [57.96482272333649]
Feature visualization (FV) is a powerful tool to decode what information neurons are responding to. We propose to guide FV through statistics of prototypical image features combined with measures of relevant network flow to generate images. Our approach yields human-understandable visualizations that both qualitatively and quantitatively improve over state-of-the-art FVs.
arXiv Detail & Related papers (2025-03-28T13:08:18Z)
A Cognitive Paradigm Approach to Probe the Perception-Reasoning Interface in VLMs [3.2228025627337864]
This paper introduces a structured evaluation framework using Bongard Problems (BPs) to dissect the perception-reasoning interface in Vision-Language Models (VLMs). We propose three distinct evaluation paradigms, mirroring human problem-solving strategies. Our framework provides a valuable diagnostic tool, highlighting the need to enhance visual processing fidelity for achieving more robust and human-like visual intelligence in AI.
arXiv Detail & Related papers (2025-01-23T12:42:42Z)
Neural-Symbolic Reasoning over Knowledge Graphs: A Survey from a Query Perspective [55.79507207292647]
Knowledge graph reasoning is pivotal in various domains such as data mining, artificial intelligence, the Web, and social sciences. The rise of neural-symbolic AI marks a significant advancement, merging the robustness of deep learning with the precision of symbolic reasoning. The advent of large language models (LLMs) has opened new frontiers in knowledge graph reasoning.
arXiv Detail & Related papers (2024-11-30T18:54:08Z)
CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs [74.36850397755572]
CATCH addresses issues related to visual defects that cause diminished fine-grained feature perception and cumulative hallucinations in open-ended scenarios. It is applicable to various visual question-answering tasks without requiring any specific data or prior knowledge, and generalizes robustly to new tasks without additional training.
arXiv Detail & Related papers (2024-11-19T18:27:31Z)
LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation [60.920536939067524]
We introduce LogiCity, the first simulator based on customizable first-order logic (FOL) for an urban-like environment with multiple dynamic agents. LogiCity models diverse urban elements using semantic and spatial concepts, such as IsAmbulance(X) and IsClose(X, Y). A key feature of LogiCity is its support for user-configurable abstractions, enabling customizable simulation complexities for logical reasoning.
arXiv Detail & Related papers (2024-11-01T17:59:46Z)
Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning [89.89857766491475]
We propose a complex reasoning schema over KG upon large language models (LLMs). We augment the arbitrary first-order logical queries via binary tree decomposition to stimulate the reasoning capability of LLMs. Experiments across widely used datasets demonstrate that LACT brings substantial improvements (an average +5.5% MRR score) over advanced methods.
arXiv Detail & Related papers (2024-05-02T18:12:08Z)
Visual AI and Linguistic Intelligence Through Steerability and Composability [0.0]
This study explores the capabilities of multimodal large language models (LLMs) in handling challenging multistep tasks that integrate language and vision. The research presents a series of 14 creatively and constructively diverse tasks, ranging from AI Lego Designing to AI Satellite Image Analysis.
arXiv Detail & Related papers (2023-11-18T22:01:33Z)
LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning [73.98142349171552]
LOGICSEG is a holistic visual semantic parser that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge. During fuzzy logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, hence enabling logic-induced network training. These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models.
arXiv Detail & Related papers (2023-09-24T05:43:19Z)
Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search [63.3745291252038]
We propose DiffSES, a novel symbolic learning approach that discovers discrete symbolic policies. By using object-level abstractions instead of raw pixel-level inputs, DiffSES is able to leverage the simplicity and scalability advantages of symbolic expressions. Our experiments demonstrate that DiffSES is able to generate symbolic policies that are simpler and more scalable than state-of-the-art symbolic RL methods.
arXiv Detail & Related papers (2022-12-30T17:50:54Z)
A Benchmark for Compositional Visual Reasoning [5.576460160219606]
We introduce a novel visual reasoning benchmark, Compositional Visual Relations (CVR), to drive progress towards more data-efficient learning algorithms. We take inspiration from fluidic intelligence and non-verbal reasoning tests and describe a novel method for creating compositions of abstract rules and associated image datasets at scale. Our proposed benchmark includes measures of sample efficiency, generalization and transfer across task rules, as well as the ability to leverage compositionality.
arXiv Detail & Related papers (2022-06-11T00:04:49Z)
GAMR: A Guided Attention Model for (visual) Reasoning [7.919213739992465]
Humans continue to outperform modern AI systems in their ability to flexibly parse and understand complex visual scenes. We present a novel module for visual reasoning, the Guided Attention Model for (visual) Reasoning (GAMR). GAMR posits that the brain solves complex visual reasoning problems dynamically via sequences of attention shifts to select and route task-relevant visual information into memory.
arXiv Detail & Related papers (2022-06-10T07:52:06Z)
Logic Tensor Networks [9.004005678155023]
We present Logic Tensor Networks (LTN), a neurosymbolic formalism and computational model that supports learning and reasoning. We show that LTN provides a uniform language for the specification and the computation of several AI tasks.
arXiv Detail & Related papers (2020-12-25T22:30:18Z)
Machine Number Sense: A Dataset of Visual Arithmetic Problems for Abstract and Relational Reasoning [95.18337034090648]
We propose a dataset, Machine Number Sense (MNS), consisting of visual arithmetic problems automatically generated using a grammar model, the And-Or Graph (AOG). These visual arithmetic problems are in the form of geometric figures. We benchmark the MNS dataset using four predominant neural network models as baselines in this visual reasoning task.
arXiv Detail & Related papers (2020-04-25T17:14:58Z)