Related papers: Neuro-symbolic Training for Reasoning over Spatial Language

Related papers

Enhancing Spatial Reasoning through Visual and Textual Thinking [45.0026939683271]
The spatial reasoning task aims to reason about the spatial relationships in 2D and 3D space.<n>Although vision language models (VLMs) have developed rapidly in recent years, they are still struggling with the spatial reasoning task.<n>We introduce a method that can enhance spatial reasoning through Visual and Textual thinking Simultaneously.
arXiv Detail & Related papers (2025-07-28T05:24:54Z)
A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding [78.99798110890157]
Open-vocabulary 3D visual grounding aims to localize target objects based on free-form language queries.<n>Existing language field methods struggle to accurately localize instances using spatial relations in language queries.<n>We propose SpatialReasoner, a novel neural representation-based framework with large language model (LLM)-driven spatial reasoning.
arXiv Detail & Related papers (2025-07-09T10:20:38Z)
Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing [62.447497430479174]
Drawing to reason in space is a novel paradigm that enables LVLMs to reason through elementary drawing operations in the visual space.<n>Our model, named VILASR, consistently outperforms existing methods across diverse spatial reasoning benchmarks.
arXiv Detail & Related papers (2025-06-11T17:41:50Z)
Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models [14.442394137843923]
We present a detailed analysis that first delineates the core elements of spatial reasoning. We then assesses the performance of these models in both synthetic and real-world images.
arXiv Detail & Related papers (2025-03-25T14:34:06Z)
SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning [42.487500113839666]
We propose a novel approach to bolster the spatial reasoning capabilities of Vision-Language Models (VLMs) Our approach comprises two stages: spatial coordinate bi-directional alignment, and chain-of-thought spatial grounding. We evaluate our method on challenging navigation and manipulation tasks, both in simulation and real-world settings.
arXiv Detail & Related papers (2025-01-17T09:46:27Z)
SPHERE: Unveiling Spatial Blind Spots in Vision-Language Models Through Hierarchical Evaluation [7.659514491338669]
Current vision-language models may grasp basic spatial cues but struggle with the multi-dimensional spatial reasoning necessary for human-like understanding and real-world applications. We develop SPHERE, a hierarchical evaluation framework supported by a new human-annotated dataset. Benchmark evaluation of state-of-the-art models reveals significant deficiencies, especially in reasoning about distance and proximity.
arXiv Detail & Related papers (2024-12-17T09:10:55Z)
VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning [86.59849798539312]
We present Neuro-Symbolic Predicates, a first-order abstraction language that combines the strengths of symbolic and neural knowledge representations. We show that our approach offers better sample complexity, stronger out-of-distribution generalization, and improved interpretability.
arXiv Detail & Related papers (2024-10-30T16:11:05Z)
Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data [53.433309883370974]
This work explores the potential and limitations of using graph-based synthetic reasoning data as training signals to enhance Large Language Models' reasoning capabilities. Our experiments, conducted on two established natural language reasoning tasks, demonstrate that supervised fine-tuning with synthetic graph-based reasoning data effectively enhances LLMs' reasoning performance without compromising their effectiveness on other standard evaluation benchmarks.
arXiv Detail & Related papers (2024-09-19T03:39:09Z)
SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models [70.01883340129204]
spatial reasoning is a crucial component of both biological and artificial intelligence. We present a comprehensive study of the capability of current state-of-the-art large language models (LLMs) on spatial reasoning.
arXiv Detail & Related papers (2024-06-07T01:06:34Z)
In-Context Analogical Reasoning with Pre-Trained Language Models [10.344428417489237]
We explore the use of intuitive language-based abstractions to support analogy in AI systems. Specifically, we apply large pre-trained language models (PLMs) to visual Raven's Progressive Matrices ( RPM) We find that PLMs exhibit a striking capacity for zero-shot relational reasoning, exceeding human performance and nearing supervised vision-based methods.
arXiv Detail & Related papers (2023-05-28T04:22:26Z)
ALERT: Adapting Language Models to Reasoning Tasks [43.8679673685468]
ALERT is a benchmark and suite of analyses for assessing language models' reasoning ability. ALERT provides a test bed to asses any language model on fine-grained reasoning skills. We find that language models learn more reasoning skills during finetuning stage compared to pretraining state.
arXiv Detail & Related papers (2022-12-16T05:15:41Z)
Overcoming Barriers to Skill Injection in Language Modeling: Case Study in Arithmetic [14.618731441943847]
We develop a novel framework that enables language models to be mathematically proficient while retaining their linguistic prowess. Specifically, we offer information-theoretic interventions to overcome the catastrophic forgetting of linguistic skills that occurs while injecting non-linguistic skills into language models.
arXiv Detail & Related papers (2022-11-03T18:53:30Z)
Learning to Reason With Relational Abstractions [65.89553417442049]
We study how to build stronger reasoning capability in language models using the idea of relational abstractions. We find that models that are supplied with such sequences as prompts can solve tasks with a significantly higher accuracy.
arXiv Detail & Related papers (2022-10-06T00:27:50Z)
Solving Quantitative Reasoning Problems with Language Models [53.53969870599973]
We introduce Minerva, a large language model pretrained on general natural language data and further trained on technical content. The model achieves state-of-the-art performance on technical benchmarks without the use of external tools. We also evaluate our model on over two hundred undergraduate-level problems in physics, biology, chemistry, economics, and other sciences.
arXiv Detail & Related papers (2022-06-29T18:54:49Z)
Imagination-Augmented Natural Language Understanding [71.51687221130925]
We introduce an Imagination-Augmented Cross-modal (iACE) to solve natural language understanding tasks. iACE enables visual imagination with external knowledge transferred from the powerful generative and pre-trained vision-and-language models. Experiments on GLUE and SWAG show that iACE achieves consistent improvement over visually-supervised pre-trained models.
arXiv Detail & Related papers (2022-04-18T19:39:36Z)
Semantic Exploration from Language Abstractions and Pretrained Representations [23.02024937564099]
Effective exploration is a challenge in reinforcement learning (RL) We define novelty using semantically meaningful state abstractions. We evaluate vision-language representations, pretrained on natural image captioning datasets.
arXiv Detail & Related papers (2022-04-08T17:08:00Z)
Chain of Thought Prompting Elicits Reasoning in Large Language Models [56.811278668446825]
This paper explores the ability of language models to generate a coherent chain of thought. Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks.
arXiv Detail & Related papers (2022-01-28T02:33:07Z)
Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability [30.01308882849197]
We propose a new methodology for creating challenging algorithmic reasoning datasets. Key idea is to draw insights from empirical sampling of hard propositional SAT problems and from complexity-theoretic studies of language. We find that current transformers, given sufficient training data, are surprisingly robust at solving the resulting NLSat problems.
arXiv Detail & Related papers (2021-12-16T17:47:20Z)
Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages. We infer this distribution from a sample of typologically diverse training languages. We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings. We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI. This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z)
Learning to Generalize for Sequential Decision Making [19.075378799280728]
We introduce a teacher-student imitation learning methodology and a means of converting a reinforcement learning model into a natural language understanding model. We show that models can learn faster and generalize more, leveraging both the imitation learning and the reformulation.
arXiv Detail & Related papers (2020-10-05T18:00:03Z)
From Spatial Relations to Spatial Configurations [64.21025426604274]
spatial relation language is able to represent a large, comprehensive set of spatial concepts crucial for reasoning. We show how we extend the capabilities of existing spatial representation languages with the fine-grained decomposition of semantics.
arXiv Detail & Related papers (2020-07-19T02:11:53Z)
Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge [96.92252296244233]
Large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control. We show that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements. Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.
arXiv Detail & Related papers (2020-06-11T17:02:20Z)
Data Augmentation for Spoken Language Understanding via Pretrained Language Models [113.56329266325902]
Training of spoken language understanding (SLU) models often faces the problem of data scarcity. We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
arXiv Detail & Related papers (2020-04-29T04:07:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.