Related papers: Sketch-Plan-Generalize: Learning and Planning with Neuro-Symbolic Programmatic Representations for Inductive Spatial Concepts

Sketch-Plan-Generalize: Learning and Planning with Neuro-Symbolic Programmatic Representations for Inductive Spatial Concepts

URL: http://arxiv.org/abs/2404.07774v3
Date: Tue, 17 Jun 2025 11:11:09 GMT
Title: Sketch-Plan-Generalize: Learning and Planning with Neuro-Symbolic Programmatic Representations for Inductive Spatial Concepts
Authors: Namasivayam Kalithasan, Sachit Sachdeva, Himanshu Gaurav Singh, Vishal Bindal, Arnav Tuli, Gurarmaan Singh Panjeta, Harsh Himanshu Vora, Divyanshu Aggarwal, Rohan Paul, Parag Singla,
Abstract summary: We develop an approach to learn personalized concepts from a limited number of demonstrations.<n>Our pipeline facilitates generalization and modular re-use, enabling continual concept learning.
Score: 6.708785987015999
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Effective human-robot collaboration requires the ability to learn personalized concepts from a limited number of demonstrations, while exhibiting inductive generalization, hierarchical composition, and adaptability to novel constraints. Existing approaches that use code generation capabilities of pre-trained large (vision) language models as well as purely neural models show poor generalization to \emph{a-priori} unseen complex concepts. Neuro-symbolic methods (Grand et al., 2023) offer a promising alternative by searching in program space, but face challenges in large program spaces due to the inability to effectively guide the search using demonstrations. Our key insight is to factor inductive concept learning as: (i) {\it Sketch:} detecting and inferring a coarse signature of a new concept (ii) {\it Plan:} performing an MCTS search over grounded action sequences guided by human demonstrations (iii) {\it Generalize:} abstracting out grounded plans as inductive programs. Our pipeline facilitates generalization and modular re-use, enabling continual concept learning. Our approach combines the benefits of code generation ability of large language models (LLMs) along with grounded neural representations, resulting in neuro-symbolic programs that show stronger inductive generalization on the task of constructing complex structures vis-\'a-vis LLM-only and purely neural approaches. Further, we demonstrate reasoning and planning capabilities with learned concepts for embodied instruction following.

Related papers

Embryology of a Language Model [1.1874560263468232]
In this work, we introduce an embryological approach, applying UMAP to the susceptibility matrix to visualize the model's structural development over training.<n>Our visualizations reveal the emergence of a clear body plan'' charting the formation of known features like the induction circuit and discovering previously unknown structures.
arXiv Detail & Related papers (2025-08-01T05:39:41Z)
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO [87.52631406241456]
Recent text-to-image systems face limitations in handling multimodal inputs and complex reasoning tasks.<n>We introduce Mind Omni, a unified multimodal large language model that addresses these challenges by incorporating reasoning generation through reinforcement learning.
arXiv Detail & Related papers (2025-05-19T12:17:04Z)
Cognitive maps are generative programs [13.339419436986148]
We show that cognitive maps can take the form of generative programs that exploit predictability and redundancy.<n>We describe a computational model that predicts human behavior in a variety of structured scenarios.<n>Our models leverage a Large Language Model as an embedding of human priors, implicitly learned through training on a vast corpus of human data.
arXiv Detail & Related papers (2025-04-29T10:55:40Z)
Meta-Representational Predictive Coding: Biomimetic Self-Supervised Learning [51.22185316175418]
We present a new form of predictive coding that we call meta-representational predictive coding (MPC)<n>MPC sidesteps the need for learning a generative model of sensory input by learning to predict representations of sensory input across parallel streams.
arXiv Detail & Related papers (2025-03-22T22:13:14Z)
VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning [86.59849798539312]
We present Neuro-Symbolic Predicates, a first-order abstraction language that combines the strengths of symbolic and neural knowledge representations.<n>We show that our approach offers better sample complexity, stronger out-of-distribution generalization, and improved interpretability.
arXiv Detail & Related papers (2024-10-30T16:11:05Z)
Coding for Intelligence from the Perspective of Category [66.14012258680992]
Coding targets compressing and reconstructing data, and intelligence. Recent trends demonstrate the potential homogeneity of these two fields. We propose a novel problem of Coding for Intelligence from the category theory view.
arXiv Detail & Related papers (2024-07-01T07:05:44Z)
OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning [49.12350554270196]
We show how modularity can be leveraged to derive a compositional data augmentation framework inspired by imagination. Our method, denoted Object-centric Compositional Neural Module Network (OC-NMN), decomposes visual generative reasoning tasks into a series of primitives applied to objects without using a domain-specific language.
arXiv Detail & Related papers (2023-10-28T20:12:58Z)
A Recursive Bateson-Inspired Model for the Generation of Semantic Formal Concepts from Spatial Sensory Data [77.34726150561087]
This paper presents a new symbolic-only method for the generation of hierarchical concept structures from complex sensory data. The approach is based on Bateson's notion of difference as the key to the genesis of an idea or a concept. The model is able to produce fairly rich yet human-readable conceptual representations without training.
arXiv Detail & Related papers (2023-07-16T15:59:13Z)
Learning Differentiable Logic Programs for Abstract Visual Reasoning [22.167393386879294]
Differentiable forward reasoning has been developed to integrate reasoning with gradient-based machine learning paradigms.<n>NEUMANN is a graph-based differentiable forward reasoner, passing messages in a memory-efficient manner and handling structured programs with functors.<n>We demonstrate that NEUMANN solves visual reasoning tasks efficiently, outperforming neural, symbolic, and neuro-symbolic baselines.
arXiv Detail & Related papers (2023-07-03T11:02:40Z)
Active Predictive Coding: A Unified Neural Framework for Learning Hierarchical World Models for Perception and Planning [1.3535770763481902]
We propose a new framework for predictive coding called active predictive coding. It can learn hierarchical world models and solve two radically different open problems in AI.
arXiv Detail & Related papers (2022-10-23T05:44:22Z)
A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics [131.93113552146195]
We present a new dataset, Handwritten arithmetic with INTegers (HINT), to examine machines' capability of learning generalizable concepts. In HINT, machines are tasked with learning how concepts are perceived from raw signals such as images. We undertake extensive experiments with various sequence-to-sequence models, including RNNs, Transformers, and GPT-3.
arXiv Detail & Related papers (2021-03-02T01:32:54Z)
Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions. We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
Compositional Generalization by Learning Analytical Expressions [87.15737632096378]
A memory-augmented neural model is connected with analytical expressions to achieve compositional generalization. Experiments on the well-known benchmark SCAN demonstrate that our model seizes a great ability of compositional generalization.
arXiv Detail & Related papers (2020-06-18T15:50:57Z)
Revisit Systematic Generalization via Meaningful Learning [15.90288956294373]
Recent studies argue that neural networks appear inherently ineffective in such cognitive capacity. We reassess the compositional skills of sequence-to-sequence models conditioned on the semantic links between new and old concepts.
arXiv Detail & Related papers (2020-03-14T15:27:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.