Related papers: State Combinatorial Generalization In Decision Making With Conditional Diffusion Models

URL: http://arxiv.org/abs/2501.13241v1
Date: Wed, 22 Jan 2025 21:48:40 GMT
Title: State Combinatorial Generalization In Decision Making With Conditional Diffusion Models
Authors: Xintong Duan, Yutong He, Fahim Tajwar, Wen-Tse Chen, Ruslan Salakhutdinov, Jeff Schneider
Abstract summary: We show how existing value-based reinforcement learning algorithms struggle due to unreliable value predictions in unseen states. We argue that this problem cannot be addressed with exploration alone, but requires more expressive and generalizable models. We show that conditioned diffusion models outperform traditional RL techniques and highlight the broad applicability of our problem formulation.
Score: 48.91240871813614
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Many real-world decision-making problems are combinatorial in nature, where states (e.g., surrounding traffic of a self-driving car) can be seen as a combination of basic elements (e.g., pedestrians, trees, and other cars). Due to combinatorial complexity, observing all combinations of basic elements in the training set is infeasible, which leads to an essential yet understudied problem of zero-shot generalization to states that are unseen combinations of previously seen elements. In this work, we first formalize this problem and then demonstrate how existing value-based reinforcement learning (RL) algorithms struggle due to unreliable value predictions in unseen states. We argue that this problem cannot be addressed with exploration alone, but requires more expressive and generalizable models. We demonstrate that behavior cloning with a conditioned diffusion model trained on expert trajectory generalizes better to states formed by new combinations of seen elements than traditional RL methods. Through experiments in maze, driving, and multiagent environments, we show that conditioned diffusion models outperform traditional RL techniques and highlight the broad applicability of our problem formulation.

Related papers

From Atomic to Composite: Reinforcement Learning Enables Generalization in Complementary Reasoning [83.94543243783285]
We study Complementary Reasoning, a complex task that requires integrating internal parametric knowledge with external contextual information. We find that RL acts as a reasoning synthesizer rather than a probability amplifier.
arXiv Detail & Related papers (2025-12-01T18:27:25Z)
Calibrated Multimodal Representation Learning with Missing Modalities [100.55774771852468]
Multimodal representation learning harmonizes distinct modalities by aligning them into a unified latent space. Recent research generalizes traditional cross-modal alignment to produce enhanced multimodal synergy but requires all modalities to be present for a common instance. We provide theoretical insights into this issue from an anchor shift perspective. We propose CalMRL for multimodal representation learning to calibrate incomplete alignments caused by missing modalities.
arXiv Detail & Related papers (2025-11-15T05:01:43Z)
Reasoning with Sampling: Your Base Model is Smarter Than You Think [52.639108524651846]
We propose a simple iterative sampling algorithm leveraging the base models' own likelihoods. We show that our algorithm offers substantial boosts in reasoning that nearly match and even outperform those from RL. Our method does not require training, curated datasets, or a verifier.
arXiv Detail & Related papers (2025-10-16T17:18:11Z)
Self-supervised Analogical Learning using Language Models [59.64260218737556]
We propose SAL, a self-supervised analogical learning framework. SAL mimics the human analogy process and trains models to explicitly transfer high-quality symbolic solutions. We show that the resulting models outperform base language models on a wide range of reasoning benchmarks.
arXiv Detail & Related papers (2025-02-03T02:31:26Z)
Complexity Control Facilitates Reasoning-Based Compositional Generalization in Transformers [10.206921909332006]
This study investigates the internal mechanisms underlying Transformers' behavior in compositional tasks. We find that complexity control strategies influence whether the model learns primitive-level rules that generalize out-of-distribution (reasoning-based solutions) or relies solely on memorized mappings (memory-based solutions).
arXiv Detail & Related papers (2025-01-15T02:54:52Z)
Competition Dynamics Shape Algorithmic Phases of In-Context Learning [10.974593590868533]
In-Context Learning (ICL) has significantly expanded the general-purpose nature of large language models. We propose a synthetic sequence modeling task that involves learning to simulate a finite mixture of Markov chains. We show we can explain a model's behavior by decomposing it into four broad algorithms that combine a fuzzy retrieval vs. inference approach with either unigram or bigram statistics.
arXiv Detail & Related papers (2024-12-01T23:35:53Z)
The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse [25.002218722102505]
Model merging aims to efficiently combine the weights of multiple expert models, each trained on a specific task, into a single multi-task model. This work explores the more challenging scenario of "non-local" merging. Standard merging techniques often fail to generalize effectively in this non-local setting. We propose a multi-task technique to re-scale and shift the output activations of the merged model for each task, aligning its output statistics with those of the corresponding task-specific expert models.
arXiv Detail & Related papers (2024-10-16T17:41:59Z)
Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation. We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
Explainable data-driven modeling via mixture of experts: towards effective blending of grey and black-box models [6.331947318187792]
We propose a comprehensive framework based on a "mixture of experts" rationale. This approach enables the data-based fusion of diverse local models. We penalize abrupt variations in the expert's combination to enhance interpretability.
arXiv Detail & Related papers (2024-01-30T15:53:07Z)
Controllable and Compositional Generation with Latent-Space Energy-Based Models [60.87740144816278]
Controllable generation is one of the key requirements for successful adoption of deep generative models in real-world applications. In this work, we use energy-based models (EBMs) to handle compositional generation over a set of attributes. By composing energy functions with logical operators, this work is the first to achieve such compositionality in generating photo-realistic images of resolution 1024x1024.
arXiv Detail & Related papers (2021-10-21T03:31:45Z)
Reinforcement Learning as One Big Sequence Modeling Problem [84.84564880157149]
Reinforcement learning (RL) is typically concerned with estimating single-step policies or single-step models. We view RL as a sequence modeling problem, with the goal being to predict a sequence of actions that leads to a sequence of high rewards.
arXiv Detail & Related papers (2021-06-03T17:58:51Z)
Model-Invariant State Abstractions for Model-Based Reinforcement Learning [54.616645151708994]
We introduce a new type of state abstraction called model-invariance. This allows for generalization to novel combinations of unseen values of state variables. We prove that an optimal policy can be learned over this model-invariance state abstraction.
arXiv Detail & Related papers (2021-02-19T10:37:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.