Subequivariant Graph Reinforcement Learning in 3D Environments
- URL: http://arxiv.org/abs/2305.18951v1
- Date: Tue, 30 May 2023 11:34:57 GMT
- Title: Subequivariant Graph Reinforcement Learning in 3D Environments
- Authors: Runfa Chen, Jiaqi Han, Fuchun Sun, Wenbing Huang
- Abstract summary: We propose a novel setup for morphology-agnostic RL, dubbed Subequivariant Graph RL in 3D environments.
Specifically, we first introduce a new set of more practical yet challenging benchmarks in 3D space.
To optimize the policy over the enlarged state-action space, we propose to inject geometric symmetry.
- Score: 34.875774768800966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning a shared policy that guides the locomotion of different agents is of
core interest in Reinforcement Learning (RL), which leads to the study of
morphology-agnostic RL. However, existing benchmarks are highly restrictive in
the choice of starting point and target point, constraining the movement of the
agents within 2D space. In this work, we propose a novel setup for
morphology-agnostic RL, dubbed Subequivariant Graph RL in 3D environments
(3D-SGRL). Specifically, we first introduce a new set of more practical yet
challenging benchmarks in 3D space that allow the agent full degrees of
freedom to explore in arbitrary directions starting from arbitrary
configurations. Moreover, to optimize the policy over the enlarged state-action
space, we propose to inject geometric symmetry, i.e., subequivariance, into the
modeling of the policy and Q-function such that the policy can generalize to
all directions, improving exploration efficiency. This goal is achieved by a
novel SubEquivariant Transformer (SET) that permits expressive message
exchange. Finally, we evaluate the proposed method on the proposed benchmarks,
where our method consistently and significantly outperforms existing approaches
on single-task, multi-task, and zero-shot generalization scenarios. Extensive
ablations are also conducted to verify our design. Code and videos are
available on our project page: https://alpc91.github.io/SGRL/.
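The injection of geometric symmetry described above can be made concrete with a small sketch. The layer below is a hypothetical, minimal rendition of a subequivariant transformation, not the authors' SET implementation: it appends the gravity direction g to each node's stack of geometric vectors, computes rotation-invariant inner products from the augmented stack, and mixes the vector channels with weights produced from those invariants together with the scalar features. All class and argument names are assumptions for illustration.

```python
# Hypothetical sketch of a subequivariant layer (not the authors' SET code).
import torch
import torch.nn as nn

class SubequivariantLinear(nn.Module):
    def __init__(self, m_in: int, m_out: int, h_dim: int, hidden: int = 64):
        super().__init__()
        self.m_in, self.m_out = m_in, m_out
        # Maps invariant scalars to a mixing matrix over vector channels.
        self.mlp = nn.Sequential(
            nn.Linear((m_in + 1) ** 2 + h_dim, hidden),
            nn.SiLU(),
            nn.Linear(hidden, (m_in + 1) * m_out),
        )

    def forward(self, Z: torch.Tensor, h: torch.Tensor, g: torch.Tensor):
        # Z: (N, 3, m_in) geometric vectors per node (e.g., position, velocity)
        # h: (N, h_dim)   scalar node features
        # g: (3,)         gravity direction, the symmetry-breaking input
        N = Z.shape[0]
        g_col = g.view(1, 3, 1).expand(N, 3, 1)
        Z_aug = torch.cat([Z, g_col], dim=-1)  # (N, 3, m_in + 1)
        # Gram matrix of the vector channels: unchanged by rotations fixing g.
        inv = torch.einsum('nci,ncj->nij', Z_aug, Z_aug).flatten(1)
        M = self.mlp(torch.cat([inv, h], dim=-1)).view(N, self.m_in + 1, self.m_out)
        # Output rotates with the input under rotations about the gravity axis.
        return Z_aug @ M  # (N, 3, m_out)
```

For a rotation R about the gravity axis, g is fixed and Z maps to RZ, so the invariant inputs are unchanged while the output becomes R(Z_aug M); appending g is precisely what relaxes full O(3) equivariance to this subgroup, matching the subequivariance the abstract describes.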
Related papers
- ODRL: A Benchmark for Off-Dynamics Reinforcement Learning [59.72217833812439]
We introduce ODRL, the first benchmark tailored for evaluating off-dynamics RL methods.
ODRL contains four experimental settings where the source and target domains can be either online or offline.
We conduct extensive benchmarking experiments, which show that no method has universal advantages across varied dynamics shifts.
arXiv Detail & Related papers (2024-10-28T05:29:38Z)
- MVGS: Multi-view-regulated Gaussian Splatting for Novel View Synthesis [22.80370814838661]
Recent works in volume rendering, e.g., NeRF and 3D Gaussian Splatting (3DGS), significantly advance the rendering quality and efficiency.
We propose a new 3DGS optimization method embodying four key novel contributions.
arXiv Detail & Related papers (2024-10-02T23:48:31Z)
- S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial for enhancing holistic cognitive intelligence in the operating room (OR).
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S^2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z)
- Rethinking Decision Transformer via Hierarchical Reinforcement Learning [54.3596066989024]
Decision Transformer (DT) is an innovative algorithm leveraging recent advances of the transformer architecture in reinforcement learning (RL).
We introduce a general sequence modeling framework for studying sequential decision making through the lens of Hierarchical RL.
We show DT emerges as a special case of this framework with certain choices of high-level and low-level policies, and discuss the potential failure of these choices.
arXiv Detail & Related papers (2023-11-01T03:32:13Z)
- Multi-Objective Decision Transformers for Offline Reinforcement Learning [7.386356540208436]
Offline RL is structured to derive policies from static trajectory data without requiring real-time environment interactions.
We reformulate offline RL as a multi-objective optimization problem, where prediction is extended to states and returns.
Our experiments on D4RL benchmark locomotion tasks reveal that our propositions allow for more effective utilization of the attention mechanism in the transformer model.
arXiv Detail & Related papers (2023-08-31T00:47:58Z)
- Reparameterized Policy Learning for Multimodal Trajectory Optimization [61.13228961771765]
We investigate the challenge of parametrizing policies for reinforcement learning in high-dimensional continuous action spaces.
We propose a principled framework that models the continuous RL policy as a generative model of optimal trajectories.
We present a practical model-based RL method, which leverages the multimodal policy parameterization and learned world model.
arXiv Detail & Related papers (2023-07-20T09:05:46Z)
- Geometric-aware Pretraining for Vision-centric 3D Object Detection [77.7979088689944]
We propose a novel geometric-aware pretraining framework called GAPretrain.
GAPretrain serves as a plug-and-play solution that can be flexibly applied to multiple state-of-the-art detectors.
We achieve 46.2 mAP and 55.5 NDS on the nuScenes val set using the BEVFormer method, with a gain of 2.7 and 2.1 points, respectively.
arXiv Detail & Related papers (2023-04-06T14:33:05Z)
- A Game-Theoretic Perspective of Generalization in Reinforcement Learning [9.402272029807316]
Generalization in reinforcement learning (RL) is important for the real-world deployment of RL algorithms.
We propose a game-theoretic framework for the generalization in reinforcement learning, named GiRL.
arXiv Detail & Related papers (2022-08-07T06:17:15Z)
- Learning Off-Policy with Online Planning [18.63424441772675]
We investigate a novel instantiation of H-step lookahead with a learned model and a terminal value function; a minimal sketch of this pattern appears after this list.
We show the flexibility of LOOP to incorporate safety constraints during deployment with a set of navigation environments.
arXiv Detail & Related papers (2020-08-23T16:18:44Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experimental results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
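As noted in the Learning Off-Policy with Online Planning entry above, H-step lookahead with a learned model and a terminal value function can be sketched compactly. The snippet below is a hypothetical illustration of that general pattern, not the LOOP implementation; model, policy, and value_fn are assumed callables standing in for the learned dynamics, a proposal policy, and the terminal value estimate.

```python
# Hypothetical sketch of H-step lookahead with a terminal value function.
import numpy as np

def h_step_lookahead(state, model, policy, value_fn,
                     horizon=5, n_candidates=64, gamma=0.99):
    """Score sampled H-step rollouts under a learned model, bootstrapping
    beyond the horizon with a terminal value estimate; return the first
    action of the best-scoring rollout."""
    best_action, best_return = None, -np.inf
    for _ in range(n_candidates):
        s, ret, discount, first_action = state, 0.0, 1.0, None
        for t in range(horizon):
            a = policy(s)              # proposal action, sampled from a learned actor
            if t == 0:
                first_action = a
            s, r = model(s, a)         # learned dynamics: (next_state, reward)
            ret += discount * r
            discount *= gamma
        ret += discount * value_fn(s)  # terminal value summarizes the tail
        if ret > best_return:
            best_return, best_action = ret, first_action
    return best_action
```

The terminal value function is what lets a short planning horizon approximate a long one: rewards beyond step H are summarized by value_fn rather than simulated.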
This list is automatically generated from the titles and abstracts of the papers on this site.