Rational Inverse Reasoning
- URL: http://arxiv.org/abs/2508.08983v1
- Date: Tue, 12 Aug 2025 14:49:44 GMT
- Title: Rational Inverse Reasoning
- Authors: Ben Zandonati, Tomás Lozano-Pérez, Leslie Pack Kaelbling
- Abstract summary: We introduce Rational Inverse Reasoning (RIR), a framework for inferring latent programs through a hierarchical generative model of behavior. RIR infers the intended task structure and generalizes to novel settings, outperforming state-of-the-art vision-language model baselines.
- Score: 33.835770809482085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans can observe a single, imperfect demonstration and immediately generalize to very different problem settings. Robots, in contrast, often require hundreds of examples and still struggle to generalize beyond the training conditions. We argue that this limitation arises from the inability to recover the latent explanations that underpin intelligent behavior, and that these explanations can take the form of structured programs consisting of high-level goals, sub-task decomposition, and execution constraints. In this work, we introduce Rational Inverse Reasoning (RIR), a framework for inferring these latent programs through a hierarchical generative model of behavior. RIR frames few-shot imitation as Bayesian program induction: a vision-language model iteratively proposes structured symbolic task hypotheses, while a planner-in-the-loop inference scheme scores each by the likelihood of the observed demonstration under that hypothesis. This loop yields a posterior over concise, executable programs. We evaluate RIR on a suite of continuous manipulation tasks designed to test one-shot and few-shot generalization across variations in object pose, count, geometry, and layout. With as little as one demonstration, RIR infers the intended task structure and generalizes to novel settings, outperforming state-of-the-art vision-language model baselines.
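The propose-and-score loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the hypothesis names, the lookup table standing in for the planner, the uniform prior, the squared-distance likelihood, and the `beta` temperature are all hypothetical. The sketch only shows how scoring each hypothesis by the likelihood of the demonstration, then normalizing, gives a posterior over candidate programs.

```python
import math


def posterior_over_hypotheses(hypotheses, demo, plan, log_prior, beta=1.0):
    """Return one normalized posterior weight per hypothesis.

    plan(h) is a hypothetical planner returning the trajectory implied by h.
    The likelihood is ASSUMED to be exp(-beta * squared distance to the demo);
    trajectories are assumed to have the same length as the demonstration.
    """
    log_scores = []
    for h in hypotheses:
        planned = plan(h)
        cost = sum(
            (px - dx) ** 2 + (py - dy) ** 2
            for (px, py), (dx, dy) in zip(planned, demo)
        )
        # log posterior up to a constant: log prior + log likelihood
        log_scores.append(log_prior(h) - beta * cost)
    top = max(log_scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


# Toy stand-ins: a table plays the role of the planner, and the two
# hypothesis names are invented for this example.
TOY_PLANS = {
    "move_right": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    "move_up": [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)],
}
demo = [(0.0, 0.1), (1.0, -0.1), (2.0, 0.0)]  # a slightly noisy demonstration
hypotheses = list(TOY_PLANS)
posterior = posterior_over_hypotheses(
    hypotheses, demo, TOY_PLANS.__getitem__, lambda h: 0.0  # uniform prior
)
best = hypotheses[posterior.index(max(posterior))]
print(best, posterior)
```

In this toy setting the noisy demonstration lies much closer to the "move_right" trajectory, so that hypothesis receives nearly all of the posterior mass. In the framework described above, the proposals would come from a vision-language model and the scoring from a planner rather than from a fixed table.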
Related papers
- On the Out-of-Distribution Generalization of Reasoning in Multimodal LLMs for Simple Visual Planning Tasks [56.98385132295952]
We evaluate how well chain-of-thought approaches generalize on a simple planning task. We find that reasoning traces which combine multiple text formats yield the best (and non-trivial) OOD generalization. Purely text-based models consistently outperform those utilizing image-based inputs.
arXiv Detail & Related papers (2026-02-17T09:51:40Z)
- Provable Benefit of Curriculum in Transformer Tree-Reasoning Post-Training [76.12556589212666]
We show that curriculum post-training avoids the exponential complexity bottleneck. Under outcome-only reward signals, reinforcement learning finetuning achieves high accuracy with sample complexity. We establish guarantees for test-time scaling, where curriculum-aware querying reduces both reward oracle calls and sampling cost from exponential to order.
arXiv Detail & Related papers (2025-11-10T18:29:54Z)
- A Study of Rule Omission in Raven's Progressive Matrices [0.0]
Analogical reasoning lies at the core of human cognition and remains a fundamental challenge for artificial intelligence. This study investigates the generalization capacity of modern AI systems under conditions of incomplete training. Experiments reveal that although transformers demonstrate strong performance on familiar rules, their accuracy declines sharply when faced with novel or omitted rules.
arXiv Detail & Related papers (2025-10-03T15:53:28Z)
- Step-Aware Policy Optimization for Reasoning in Diffusion Large Language Models [57.42778606399764]
Diffusion language models (dLLMs) offer a promising, non-autoregressive paradigm for text generation. Current reinforcement learning approaches often rely on sparse, outcome-based rewards. We argue that this stems from a fundamental mismatch with the natural structure of reasoning.
arXiv Detail & Related papers (2025-10-02T00:34:15Z)
- BOOST: Bootstrapping Strategy-Driven Reasoning Programs for Program-Guided Fact-Checking [16.655011153015202]
BOOST is a bootstrapping approach for automated few-shot reasoning program generation. It iteratively refines explicit, data-driven guidelines as meta-rules for guiding demonstration creation. It enables a seamless transition from zero-shot to few-shot program-guided learning, enhancing interpretability and effectiveness.
arXiv Detail & Related papers (2025-04-03T10:38:45Z)
- On the Diagram of Thought [12.304069891580658]
Current large language models (LLMs) demonstrate impressive capabilities but struggle with complex, multi-step reasoning tasks. We introduce the Diagram of Thought (DoT) as a framework wherein a single auto-regressive LLM internally constructs and navigates a Directed Acyclic Graph (DAG). We formalize the reasoning DAG as a diagram within a suitable topos and prove that the final step, aggregating validated information, corresponds semantically to computing the colimit of the relevant sub-diagram.
arXiv Detail & Related papers (2024-09-16T07:01:41Z)
- Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers [54.83459025465947]
Even the largest models struggle with compositional reasoning, generalization, fine-grained spatial and temporal reasoning, and counting.
Visual reasoning with large language models (LLMs) as controllers can, in principle, address these limitations by decomposing the task and solving subtasks by orchestrating a set of (visual) tools.
We present a framework that mitigates these issues by introducing spatially and temporally abstract routines and by leveraging a small number of labeled examples to automatically generate in-context examples.
arXiv Detail & Related papers (2024-01-03T20:48:47Z)
- De-fine: Decomposing and Refining Visual Programs with Auto-Feedback [75.62712247421146]
De-fine is a training-free framework that decomposes complex tasks into simpler subtasks and refines programs through auto-feedback.
Our experiments across various visual tasks show that De-fine creates more robust programs.
arXiv Detail & Related papers (2023-11-21T06:24:09Z)
- Rationale-Augmented Ensembles in Language Models [53.45015291520658]
We reconsider rationale-augmented prompting for few-shot in-context learning.
We identify rationale sampling in the output space as the key component to robustly improve performance.
We demonstrate that rationale-augmented ensembles achieve more accurate and interpretable results than existing prompting approaches.
arXiv Detail & Related papers (2022-07-02T06:20:57Z)
- Abstraction-Refinement for Hierarchical Probabilistic Models [8.959154445409057]
We exploit a hierarchical structure with repetitive parts to verify Markov decision processes.
In this paper, we focus on a local case, in which the subroutines have a limited effect on the overall system state.
The key ideas to accelerate analysis of such programs are (1) to treat the behavior of the subroutine as uncertain and only remove this uncertainty by a detailed analysis if needed, and (2) to abstract similar subroutines into a parametric template, and then analyse this template.
arXiv Detail & Related papers (2022-06-06T14:44:36Z)
- DAReN: A Collaborative Approach Towards Reasoning And Disentangling [27.50150027974947]
We propose an end-to-end joint representation-reasoning learning framework, which leverages a weak form of inductive bias to improve both tasks together.
We accomplish this using a novel learning framework, the Disentangling based Abstract Reasoning Network (DAReN), built on the principles of GM-RPM.
arXiv Detail & Related papers (2021-09-27T16:10:30Z)