Highway Graph to Accelerate Reinforcement Learning
- URL: http://arxiv.org/abs/2405.11727v2
- Date: Tue, 07 Jan 2025 15:26:14 GMT
- Title: Highway Graph to Accelerate Reinforcement Learning
- Authors: Zidu Yin, Zhen Zhang, Dong Gong, Stefano V. Albrecht, Javen Q. Shi,
- Abstract summary: Reinforcement Learning algorithms often struggle with low training efficiency.<n>We introduce the highway graph to model state transitions.<n>Our method learns significantly faster than established and state-of-the-art RL algorithms.
- Score: 18.849312069946993
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement Learning (RL) algorithms often struggle with low training efficiency. A common approach to address this challenge is integrating model-based planning algorithms, such as Monte Carlo Tree Search (MCTS) or Value Iteration (VI), into the environmental model. However, VI requires iterating over a large tensor which updates the value of the preceding state based on the succeeding state through value propagation, resulting in computationally intensive operations. To enhance the RL training efficiency, we propose improving the efficiency of the value learning process. In deterministic environments with discrete state and action spaces, we observe that on the sampled empirical state-transition graph, a non-branching sequence of transitions-termed a highway-can take the agent to another state without deviation through intermediate states. On these non-branching highways, the value-updating process can be streamlined into a single-step operation, eliminating the need for step-by-step updates. Building on this observation, we introduce the highway graph to model state transitions. The highway graph compresses the transition model into a compact representation, where edges can encapsulate multiple state transitions, enabling value propagation across multiple time steps in a single iteration. By integrating the highway graph into RL, the training process is significantly accelerated, particularly in the early stages of training. Experiments across four categories of environments demonstrate that our method learns significantly faster than established and state-of-the-art RL algorithms (often by a factor of 10 to 150) while maintaining equal or superior expected returns. Furthermore, a deep neural network-based agent trained using the highway graph exhibits improved generalization capabilities and reduced storage costs. Code is publicly available at https://github.com/coodest/highwayRL.
Related papers
- MacLight: Multi-scene Aggregation Convolutional Learning for Traffic Signal Control [8.342432309172757]
Reinforcement learning methods have proposed promising traffic signal control policy that can be trained on large road networks.
Current SOTA methods model road networks as topological graph structures, incorporate graph attention into deep Q-learning, and merge local and global embeddings to improve policy.
We propose Multi-Scene Aggregation Convolutional Learning for traffic signal control (MacLight), which offers faster training speeds and more stable performance.
arXiv Detail & Related papers (2024-12-20T09:26:41Z) - Preventing Local Pitfalls in Vector Quantization via Optimal Transport [77.15924044466976]
We introduce OptVQ, a novel vector quantization method that employs the Sinkhorn algorithm to optimize the optimal transport problem.
Our experiments on image reconstruction tasks demonstrate that OptVQ achieves 100% codebook utilization and surpasses current state-of-the-art VQNs in reconstruction quality.
arXiv Detail & Related papers (2024-12-19T18:58:14Z) - DeMo: Decoupled Momentum Optimization [6.169574689318864]
Training large neural networks typically requires sharing between accelerators through specialized high-speed interconnects.
We introduce bfDecoupled textbfMomentum (DeMo), a fused magnitude and data parallel algorithm that reduces inter-accelerator communication requirements.
Empirical results show that models trained with DeMo match or exceed the performance of equivalent models trained with AdamW.
arXiv Detail & Related papers (2024-11-29T17:31:47Z) - Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z) - Efficient and Effective Implicit Dynamic Graph Neural Network [42.49148111696576]
We present Implicit Dynamic Graph Neural Network (IDGNN) a novel implicit neural network for dynamic graphs.
A key characteristic of IDGNN is that it demonstrably is well-posed, i.e., it is theoretically guaranteed to have a fixed-point representation.
arXiv Detail & Related papers (2024-06-25T19:07:21Z) - Action-Quantized Offline Reinforcement Learning for Robotic Skill
Learning [68.16998247593209]
offline reinforcement learning (RL) paradigm provides recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z) - X-RLflow: Graph Reinforcement Learning for Neural Network Subgraphs
Transformation [0.0]
graph superoptimisation systems perform a sequence of subgraph substitution to neural networks to find the optimal computation graph structure.
We show that our approach can outperform state-of-the-art superoptimisation systems over a range of deep learning models and achieve by up to 40% on those that are based on transformer-style architectures.
arXiv Detail & Related papers (2023-04-28T09:06:18Z) - ADLight: A Universal Approach of Traffic Signal Control with Augmented
Data Using Reinforcement Learning [3.3458830284045065]
We propose a novel reinforcement learning approach with augmented data (ADLight) to train a universal model for intersections with different structures.
A new data augmentation method named textitmovement shuffle is developed to improve the generalization performance.
The results show that the performance of our approach is close to the models trained in a single environment directly.
arXiv Detail & Related papers (2022-10-24T16:21:48Z) - Value-Consistent Representation Learning for Data-Efficient
Reinforcement Learning [105.70602423944148]
We propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making.
Instead of aligning this imagined state with a real state returned by the environment, VCR applies a $Q$-value head on both states and obtains two distributions of action values.
It has been demonstrated that our methods achieve new state-of-the-art performance for search-free RL algorithms.
arXiv Detail & Related papers (2022-06-25T03:02:25Z) - Condensing Graphs via One-Step Gradient Matching [50.07587238142548]
We propose a one-step gradient matching scheme, which performs gradient matching for only one single step without training the network weights.
Our theoretical analysis shows this strategy can generate synthetic graphs that lead to lower classification loss on real graphs.
In particular, we are able to reduce the dataset size by 90% while approximating up to 98% of the original performance.
arXiv Detail & Related papers (2022-06-15T18:20:01Z) - Joint inference and input optimization in equilibrium networks [68.63726855991052]
deep equilibrium model is a class of models that foregoes traditional network depth and instead computes the output of a network by finding the fixed point of a single nonlinear layer.
We show that there is a natural synergy between these two settings.
We demonstrate this strategy on various tasks such as training generative models while optimizing over latent codes, training models for inverse problems like denoising and inpainting, adversarial training and gradient based meta-learning.
arXiv Detail & Related papers (2021-11-25T19:59:33Z) - Efficient Neural Network Training via Forward and Backward Propagation
Sparsification [26.301103403328312]
We propose an efficient sparse training method with completely sparse forward and backward passes.
Our algorithm is much more effective in accelerating the training process, up to an order of magnitude faster.
arXiv Detail & Related papers (2021-11-10T13:49:47Z) - Graph Signal Restoration Using Nested Deep Algorithm Unrolling [85.53158261016331]
Graph signal processing is a ubiquitous task in many applications such as sensor, social transportation brain networks, point cloud processing, and graph networks.
We propose two restoration methods based on convexindependent deep ADMM (ADMM)
parameters in the proposed restoration methods are trainable in an end-to-end manner.
arXiv Detail & Related papers (2021-06-30T08:57:01Z) - A Deep Value-network Based Approach for Multi-Driver Order Dispatching [55.36656442934531]
We propose a deep reinforcement learning based solution for order dispatching.
We conduct large scale online A/B tests on DiDi's ride-dispatching platform.
Results show that CVNet consistently outperforms other recently proposed dispatching methods.
arXiv Detail & Related papers (2021-06-08T16:27:04Z) - PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for
Reinforcement Learning [84.30765628008207]
We propose a novel method, dubbed PlayVirtual, which augments cycle-consistent virtual trajectories to enhance the data efficiency for RL feature representation learning.
Our method outperforms the current state-of-the-art methods by a large margin on both benchmarks.
arXiv Detail & Related papers (2021-06-08T07:37:37Z) - Accurate, Efficient and Scalable Training of Graph Neural Networks [9.569918335816963]
Graph Neural Networks (GNNs) are powerful deep learning models to generate node embeddings on graphs.
It is still challenging to perform training in an efficient and scalable way.
We propose a novel parallel training framework that reduces training workload by orders of magnitude compared with state-of-the-art minibatch methods.
arXiv Detail & Related papers (2020-10-05T22:06:23Z) - Optimizing Memory Placement using Evolutionary Graph Reinforcement
Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network.
Our model requires a much less number of communication rounds and still a number of communication rounds in theory.
Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.