Neuro-symbolic Training for Reasoning over Spatial Language
- URL: http://arxiv.org/abs/2406.13828v1
- Date: Wed, 19 Jun 2024 20:47:36 GMT
- Title: Neuro-symbolic Training for Reasoning over Spatial Language
- Authors: Tanawan Premsri, Parisa Kordjamshidi
- Abstract summary: We propose training language models with neuro-symbolic techniques that can exploit the logical rules of reasoning as constraints.
We focus on a challenging problem of spatial reasoning over text.
- Score: 17.901249830817882
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research shows that more data and larger models can provide more accurate solutions to natural language problems requiring reasoning. However, models can easily fail on unobserved complex input compositions because they do not achieve the level of abstraction required for generalizability. To alleviate this issue, we propose training language models with neuro-symbolic techniques that exploit the logical rules of reasoning as constraints and provide additional sources of supervision to the model. Training models to adhere to the rules of reasoning pushes them toward the more effective abstractions needed for generalizability and transfer learning. We focus on a challenging problem of spatial reasoning over text. Our results on various benchmarks using multiple language models confirm our hypothesis of effective domain transfer based on neuro-symbolic training.
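As a concrete illustration of the constraint idea described in the abstract, the minimal sketch below (not the paper's implementation) shows how one spatial rule, the transitivity of "left of", could be relaxed into a differentiable penalty and added to the usual answer-level cross-entropy loss. The product t-norm relaxation, the weighting factor `lam`, and all function names are assumptions made for this example.

```python
# A minimal sketch (not the authors' code) of turning a logical rule,
#   left(a, b) AND left(b, c) -> left(a, c),
# into a soft constraint that supplies an extra supervision signal.
# p_ab, p_bc, p_ac are assumed to be the model's probabilities of "yes"
# for the corresponding spatial questions; the t-norm and `lam` are
# illustrative choices, not taken from the paper.
import torch
import torch.nn.functional as F

def transitivity_violation(p_ab: torch.Tensor,
                           p_bc: torch.Tensor,
                           p_ac: torch.Tensor) -> torch.Tensor:
    """Soft penalty for violating left(a,b) & left(b,c) -> left(a,c).

    Under the product t-norm the premise has truth value p_ab * p_bc;
    the implication is violated to the degree that this exceeds the
    truth value of the conclusion p_ac.
    """
    return F.relu(p_ab * p_bc - p_ac)

def neuro_symbolic_loss(logits: torch.Tensor,
                        labels: torch.Tensor,
                        p_ab: torch.Tensor,
                        p_bc: torch.Tensor,
                        p_ac: torch.Tensor,
                        lam: float = 0.5) -> torch.Tensor:
    """Supervised cross-entropy plus a weighted constraint-violation term."""
    ce = F.cross_entropy(logits, labels)
    constraint = transitivity_violation(p_ab, p_bc, p_ac).mean()
    return ce + lam * constraint

# Toy usage: a batch of 4 yes/no answers with hypothetical model outputs.
logits = torch.randn(4, 2, requires_grad=True)
labels = torch.tensor([1, 0, 1, 1])
p_ab, p_bc, p_ac = torch.rand(4), torch.rand(4), torch.rand(4)
loss = neuro_symbolic_loss(logits, labels, p_ab, p_bc, p_ac)
loss.backward()
```

In this formulation the constraint term is zero whenever the model's predictions are logically consistent, so it only adds gradient signal when a rule is violated; this is one common way such logic-as-loss schemes are set up, offered here as a sketch of the general technique rather than the paper's exact objective.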
Related papers
- Exploring Spatial Language Grounding Through Referring Expressions [17.524558622186657]
We propose using the Referring Expression task as a platform for the evaluation of spatial reasoning by vision-language models (VLMs).
This platform provides the opportunity for a deeper analysis of spatial comprehension and grounding abilities when there is 1) ambiguity in object detection, 2) complex spatial expressions with a longer sentence structure and multiple spatial relations, and 3) expressions with negation ('not').
Our results highlight these challenges and behaviors and provide insight into research gaps and future directions.
arXiv Detail & Related papers (2025-02-04T22:58:15Z) - SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning [42.487500113839666]
We propose a novel approach to bolster the spatial reasoning capabilities of Vision-Language Models (VLMs).
Our approach comprises two stages: spatial coordinate bi-directional alignment, and chain-of-thought spatial grounding.
We evaluate our method on challenging navigation and manipulation tasks, both in simulation and real-world settings.
arXiv Detail & Related papers (2025-01-17T09:46:27Z) - SPHERE: Unveiling Spatial Blind Spots in Vision-Language Models Through Hierarchical Evaluation [7.659514491338669]
Current vision-language models may grasp basic spatial cues but struggle with the multi-dimensional spatial reasoning necessary for human-like understanding and real-world applications.
We develop SPHERE, a hierarchical evaluation framework supported by a new human-annotated dataset.
Benchmark evaluation of state-of-the-art models reveals significant deficiencies, especially in reasoning about distance and proximity.
These findings expose critical blind spots in existing models and underscore the need for more advanced spatial reasoning techniques.
arXiv Detail & Related papers (2024-12-17T09:10:55Z) - VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning [86.59849798539312]
We present Neuro-Symbolic Predicates, a first-order abstraction language that combines the strengths of symbolic and neural knowledge representations.
We show that our approach offers better sample complexity, stronger out-of-distribution generalization, and improved interpretability.
arXiv Detail & Related papers (2024-10-30T16:11:05Z) - SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models [70.01883340129204]
Spatial reasoning is a crucial component of both biological and artificial intelligence.
We present a comprehensive study of the capability of current state-of-the-art large language models (LLMs) on spatial reasoning.
arXiv Detail & Related papers (2024-06-07T01:06:34Z) - In-Context Analogical Reasoning with Pre-Trained Language Models [10.344428417489237]
We explore the use of intuitive language-based abstractions to support analogy in AI systems.
Specifically, we apply large pre-trained language models (PLMs) to visual Raven's Progressive Matrices (RPM).
We find that PLMs exhibit a striking capacity for zero-shot relational reasoning, exceeding human performance and nearing supervised vision-based methods.
arXiv Detail & Related papers (2023-05-28T04:22:26Z) - Imagination-Augmented Natural Language Understanding [71.51687221130925]
We introduce an Imagination-Augmented Cross-modal Encoder (iACE) to solve natural language understanding tasks.
iACE enables visual imagination with external knowledge transferred from the powerful generative and pre-trained vision-and-language models.
Experiments on GLUE and SWAG show that iACE achieves consistent improvement over visually-supervised pre-trained models.
arXiv Detail & Related papers (2022-04-18T19:39:36Z) - Semantic Exploration from Language Abstractions and Pretrained Representations [23.02024937564099]
Effective exploration is a challenge in reinforcement learning (RL).
We define novelty using semantically meaningful state abstractions.
We evaluate vision-language representations, pretrained on natural image captioning datasets.
arXiv Detail & Related papers (2022-04-08T17:08:00Z) - Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z) - Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings.
We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI.
This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z) - From Spatial Relations to Spatial Configurations [64.21025426604274]
The spatial relation language is able to represent a large, comprehensive set of spatial concepts crucial for reasoning.
We show how we extend the capabilities of existing spatial representation languages with the fine-grained decomposition of semantics.
arXiv Detail & Related papers (2020-07-19T02:11:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.