SAINT: Attention-Based Modeling of Sub-Action Dependencies in Multi-Action Policies
- URL: http://arxiv.org/abs/2505.12109v1
- Date: Sat, 17 May 2025 18:34:31 GMT
- Title: SAINT: Attention-Based Modeling of Sub-Action Dependencies in Multi-Action Policies
- Authors: Matthew Landers, Taylor W. Killian, Thomas Hartvigsen, Afsaneh Doryab
- Abstract summary: Sub-Action Interaction Network (SAINT) is a novel policy architecture that represents multi-component actions as unordered sets and models their dependencies via self-attention conditioned on the global state. In 15 distinct environments across three task domains, including environments with nearly 17 million joint actions, SAINT consistently outperforms strong baselines.
- Score: 13.673494183777716
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The combinatorial structure of many real-world action spaces leads to exponential growth in the number of possible actions, limiting the effectiveness of conventional reinforcement learning algorithms. Recent approaches for combinatorial action spaces impose factorized or sequential structures over sub-actions, failing to capture complex joint behavior. We introduce the Sub-Action Interaction Network using Transformers (SAINT), a novel policy architecture that represents multi-component actions as unordered sets and models their dependencies via self-attention conditioned on the global state. SAINT is permutation-invariant, sample-efficient, and compatible with standard policy optimization algorithms. In 15 distinct combinatorial environments across three task domains, including environments with nearly 17 million joint actions, SAINT consistently outperforms strong baselines.
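The abstract's core idea — treating sub-actions as an unordered set and modeling their dependencies with state-conditioned self-attention — can be illustrated with a minimal sketch. This is not the authors' implementation; the dimensions, the single attention head, and the way the state embedding is added to every slot are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # embedding dimension (assumed)
K = 4  # number of sub-action slots (assumed)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Random illustrative projection weights (shared across slots).
Wq, Wk, Wv = (rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(3))
W_state = rng.standard_normal((D, D)) / np.sqrt(D)

def saint_layer(slot_emb, state_emb):
    """One state-conditioned self-attention step over sub-action slots.

    slot_emb:  (K, D) unordered set of sub-action embeddings
    state_emb: (D,)   global state embedding, injected into every slot
    returns:   (K, D) per-slot features encoding sub-action dependencies
    """
    h = slot_emb + state_emb @ W_state          # condition on global state
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    attn = softmax(q @ k.T / np.sqrt(D))        # (K, K) dependency weights
    return attn @ v

slots = rng.standard_normal((K, D))
state = rng.standard_normal(D)
out = saint_layer(slots, state)

# Because all weights are shared across slots, the layer is permutation
# equivariant: permuting the input slots permutes the outputs, so per-slot
# action heads downstream do not depend on any sub-action ordering.
perm = rng.permutation(K)
assert np.allclose(saint_layer(slots[perm], state), out[perm])
```

The permutation check at the end is the property the abstract emphasizes: unlike sequential factorizations, no ordering is imposed on sub-actions.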
Related papers
- Primary-Fine Decoupling for Action Generation in Robotic Imitation [91.2899765310853]
Multi-modal distribution in robotic manipulation action sequences poses critical challenges for imitation learning. We propose Primary-Fine Decoupling for Action Generation (PF-DAG), a two-stage framework that decouples coarse action consistency from fine-grained variations. PF-DAG outperforms state-of-the-art baselines across 56 tasks from Adroit, DexArt, and MetaWorld benchmarks.
arXiv Detail & Related papers (2026-02-25T08:36:45Z) - Improving and Accelerating Offline RL in Large Discrete Action Spaces with Structured Policy Initialization [11.646124619395486]
Reinforcement learning in discrete action spaces requires searching over exponentially many joint actions to simultaneously select multiple sub-actions that form coherent combinations. Existing approaches either simplify policy learning by assuming independence across sub-actions, or attempt to learn action structure and control jointly. We introduce Structured Policy Initialization (SPIN), a two-stage framework that first pre-trains an Action Structure Model (ASM) to capture the manifold of valid actions, then freezes this representation and trains lightweight policy heads for control.
arXiv Detail & Related papers (2026-01-07T22:57:21Z) - Flexible Multitask Learning with Factorized Diffusion Policy [59.526246520933135]
Multitask learning poses significant challenges due to the highly multimodal and diverse nature of robot action distributions. Existing monolithic models often underfit the action distribution and lack the flexibility required for efficient adaptation. We introduce a novel modular diffusion policy framework that factorizes complex action distributions into a composition of specialized diffusion models.
arXiv Detail & Related papers (2025-12-26T07:11:47Z) - Q-function Decomposition with Intervention Semantics with Factored Action Spaces [51.01244229483353]
We consider Q-functions defined over a lower dimensional projected subspace of the original action space, and study the condition for the unbiasedness of decomposed Q-functions. This leads to a general scheme which we call action decomposed reinforcement learning that uses the projected Q-functions to approximate the Q-function in standard model-free reinforcement learning algorithms.
arXiv Detail & Related papers (2025-04-30T05:26:51Z) - An Efficient Approach for Cooperative Multi-Agent Learning Problems [0.8287206589886881]
We propose a central framework for learning a policy that models the simultaneous behavior of multiple agents. Our approach addresses the coordination problem via a sequential abstraction, which overcomes the scalability problems typical to centralized methods. Our experimental results demonstrate that the proposed approach successfully coordinates agents across a variety of Multi-Agent Learning environments.
arXiv Detail & Related papers (2025-04-07T09:03:35Z) - Multi Activity Sequence Alignment via Implicit Clustering [50.3168866743067]
We propose a novel framework that overcomes limitations using sequence alignment via implicit clustering. Specifically, our key idea is to perform implicit clip-level clustering while aligning frames in sequences. Our experiments show that our proposed method outperforms state-of-the-art results.
arXiv Detail & Related papers (2025-03-16T14:28:46Z) - Reinforcement learning with combinatorial actions for coupled restless bandits [62.89013331120493]
We propose SEQUOIA, an RL algorithm that directly optimizes for long-term reward over the feasible action space. We empirically validate SEQUOIA on four novel restless bandit problems with constraints: multiple interventions, path constraints, bipartite matching, and capacity constraints.
arXiv Detail & Related papers (2025-03-01T21:25:21Z) - BraVE: Offline Reinforcement Learning for Discrete Combinatorial Action Spaces [12.904199719046968]
We propose a value-based method to evaluate a linear number of joint actions while preserving dependency structure. BraVE outperforms prior offline RL methods by up to $20\times$ in environments with over four million actions.
arXiv Detail & Related papers (2024-10-28T15:49:46Z) - Composable Part-Based Manipulation [61.48634521323737]
We propose composable part-based manipulation (CPM) to improve learning and generalization of robotic manipulation skills.
CPM comprises a collection of composable diffusion models, where each model captures a different inter-object correspondence.
We validate our approach in both simulated and real-world scenarios, demonstrating its effectiveness in achieving robust and generalized manipulation capabilities.
arXiv Detail & Related papers (2024-05-09T16:04:14Z) - Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning [16.844525262228103]
In cooperative multi-agent reinforcement learning, a potentially large number of agents jointly optimize a global reward function, which leads to an exponential blow-up of the action space in the number of agents.
As a minimal requirement, we assume access to an argmax oracle that allows us to efficiently compute the greedy policy for any Q-function in the model class.
We propose efficient algorithms for this setting whose compute and query complexity is polynomial in all relevant problem parameters.
arXiv Detail & Related papers (2023-02-08T23:42:49Z) - Rethinking Trajectory Prediction via "Team Game" [118.59480535826094]
We present a novel formulation for multi-agent trajectory prediction, which explicitly introduces the concept of interactive group consensus.
On two multi-agent settings, i.e. team sports and pedestrians, the proposed framework consistently achieves superior performance compared to existing methods.
arXiv Detail & Related papers (2022-10-17T07:16:44Z) - Modeling Multi-Label Action Dependencies for Temporal Action Localization [53.53490517832068]
Real-world videos contain many complex actions with inherent relationships between action classes.
We propose an attention-based architecture that models these action relationships for the task of temporal action localization in untrimmed videos.
We show improved performance over state-of-the-art methods on multi-label action localization benchmarks.
arXiv Detail & Related papers (2021-03-04T13:37:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.