RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research
- URL: http://arxiv.org/abs/2303.13117v1
- Date: Thu, 23 Mar 2023 09:07:30 GMT
- Title: RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research
- Authors: Ching Pui Wan, Tung Li, Jason Min Wang
- Abstract summary: We introduce RLOR, a flexible framework for Deep Reinforcement Learning for Operation Research.
We analyze the end-to-end autoregressive models for vehicle routing problems and show that these models can benefit from the recent advances in reinforcement learning.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning has been applied in operation research and has shown
promise in solving large combinatorial optimization problems. However, existing
works focus on developing neural network architectures for certain problems.
These works lack the flexibility to incorporate recent advances in
reinforcement learning, as well as the flexibility of customizing model
architectures for operation research problems. In this work, we analyze the
end-to-end autoregressive models for vehicle routing problems and show that
these models can benefit from the recent advances in reinforcement learning
with a careful re-implementation of the model architecture. In particular, we
re-implemented the Attention Model and trained it with Proximal Policy
Optimization (PPO) in CleanRL, showing at least an 8-fold speedup in training
time. We hereby introduce RLOR, a flexible framework for Deep Reinforcement
Learning for Operation Research. We believe that a flexible framework is key to
developing deep reinforcement learning models for operation research problems.
The code of our work is publicly available at https://github.com/cpwan/RLOR.
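For readers who want a concrete picture of the pipeline the abstract describes, below is a minimal sketch, in plain PyTorch, of an autoregressive attention policy for a routing problem together with the PPO clipped objective. It is an illustration only: the actual architecture, masking scheme, and CleanRL training loop live in the linked repository, and every name and dimension here is an assumption.

```python
import torch
import torch.nn as nn

class AttentionPolicy(nn.Module):
    """Autoregressively scores the next node to visit for a routing problem."""
    def __init__(self, d=128):
        super().__init__()
        self.embed = nn.Linear(2, d)        # 2-D node coordinates -> embeddings
        self.query = nn.Linear(d, d)        # graph context -> attention query
        self.value_head = nn.Linear(d, 1)   # critic head used by PPO

    def forward(self, coords, visited):
        # coords: (B, N, 2) node positions; visited: (B, N) boolean mask
        h = self.embed(coords)                                # (B, N, d)
        q = self.query(h.mean(dim=1, keepdim=True))           # (B, 1, d)
        logits = (q @ h.transpose(1, 2)).squeeze(1)           # (B, N) scores
        logits = logits.masked_fill(visited, float("-inf"))   # forbid revisits
        value = self.value_head(h.mean(dim=1)).squeeze(-1)    # (B,) state value
        return logits, value

def ppo_clip_loss(logp_new, logp_old, advantages, clip=0.2):
    """Standard PPO clipped surrogate objective (to be minimized)."""
    ratio = torch.exp(logp_new - logp_old)
    clipped = torch.clamp(ratio, 1 - clip, 1 + clip)
    return -torch.min(ratio * advantages, clipped * advantages).mean()

policy = AttentionPolicy()
coords = torch.rand(4, 10, 2)                  # 4 instances, 10 nodes each
visited = torch.zeros(4, 10, dtype=torch.bool)
logits, value = policy(coords, visited)
action = torch.distributions.Categorical(logits=logits).sample()
```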
Related papers
- Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs [58.18140409409302]
Large Language Models (LLMs) have made substantial strides in structured tasks through Reinforcement Learning (RL).
Applying RL in broader domains like chatbots and content generation presents unique challenges.
We show a case study of reproducing existing reward model ensemble research using embedding-based reward models.
arXiv Detail & Related papers (2025-02-04T19:37:35Z)
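As a rough sketch of what "embedding-based reward models" can look like in practice: small heads trained on frozen, precomputed LLM embeddings, ensembled by averaging. The dimensions, head design, and aggregation rule below are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class RewardHead(nn.Module):
    """Lightweight reward model on top of frozen LLM embeddings."""
    def __init__(self, d_emb=1024):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d_emb, 256), nn.ReLU(), nn.Linear(256, 1))
    def forward(self, emb):                  # emb: (B, d_emb), precomputed
        return self.mlp(emb).squeeze(-1)     # scalar reward per response

heads = [RewardHead() for _ in range(5)]     # the ensemble
emb = torch.randn(8, 1024)                   # stand-in for cached embeddings
rewards = torch.stack([h(emb) for h in heads]).mean(dim=0)  # ensemble mean
```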
- NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals [58.83169560132308]
We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of very large neural networks.
NNsight is an open-source system that extends PyTorch to introduce deferred remote execution.
NDIF is a scalable inference service that executes NNsight requests, allowing users to share GPU resources and pretrained models.
arXiv Detail & Related papers (2024-07-18T17:59:01Z)
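NNsight's real interface is its own; the toy snippet below only illustrates the underlying capability it democratizes, reading a model's internal activations, using vanilla PyTorch forward hooks on a placeholder model.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
captured = {}

def save_activation(name):
    def hook(module, inputs, output):
        captured[name] = output.detach()    # record this layer's output
    return hook

model[1].register_forward_hook(save_activation("relu"))
_ = model(torch.randn(2, 16))
print(captured["relu"].shape)               # torch.Size([2, 32])
```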
- Safe Deep Model-Based Reinforcement Learning with Lyapunov Functions [2.50194939587674]
We propose a new Model-based RL framework to enable efficient policy learning with unknown dynamics.
We introduce and explore a novel method for adding safety constraints for model-based RL during training and policy learning.
arXiv Detail & Related papers (2024-05-25T11:21:12Z)
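One common shape such a safety constraint can take (sketched here as an assumption, not the paper's exact method) is a Lyapunov filter: a candidate action passes only if a learned Lyapunov function V is predicted to decrease under the learned dynamics model.

```python
import torch

def lyapunov_safe(action, state, dynamics, V, margin=1e-3):
    """True if the action is predicted to decrease V by at least `margin`."""
    next_state = dynamics(state, action)     # one step of the learned model
    return V(next_state) <= V(state) - margin

# Toy stand-ins for the learned components:
dynamics = lambda s, a: s + 0.1 * a
V = lambda s: (s ** 2).sum()
s, a = torch.tensor([1.0, -0.5]), torch.tensor([-1.0, 0.5])
print(lyapunov_safe(a, s, dynamics, V))      # True: V drops along the rollout
```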
- Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment [69.33930972652594]
We propose a novel structural pruning approach to jointly learn the weights and structurally prune architectures of CNN models.
The core element of our method is a Reinforcement Learning (RL) agent whose actions determine the pruning ratios of the CNN model's layers.
We conduct the joint training and pruning by iteratively training the model's weights and the agent's policy.
arXiv Detail & Related papers (2024-03-28T15:22:29Z)
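To make the action space concrete: a sketch of the pruning step an agent's action could trigger, masking the fraction of output channels with the smallest L1 norms. The agent itself and the alternating train/prune loop are omitted; all names are illustrative.

```python
import torch

def prune_by_ratio(weight, ratio):
    """Mask the `ratio` fraction of output channels with smallest L1 norm."""
    norms = weight.abs().flatten(1).sum(dim=1)   # one score per out-channel
    k = int(ratio * weight.shape[0])
    if k == 0:
        return weight
    idx = norms.argsort()[:k]                    # the weakest channels
    mask = torch.ones(weight.shape[0], device=weight.device)
    mask[idx] = 0.0
    return weight * mask.view(-1, *([1] * (weight.dim() - 1)))

w = torch.randn(8, 4, 3, 3)                      # one conv layer's weights
w_pruned = prune_by_ratio(w, ratio=0.25)         # the agent chose 25% here
```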
- ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse [59.500060790983994]
This paper introduces ZhiJian, a comprehensive and user-friendly toolbox for model reuse, utilizing the PyTorch backend.
ZhiJian presents a novel paradigm that unifies diverse perspectives on model reuse, encompassing target architecture construction with a PTM, tuning the target model with a PTM, and PTM-based inference.
arXiv Detail & Related papers (2023-08-17T19:12:13Z)
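ZhiJian's own API is not reproduced here; the generic PyTorch lines below merely illustrate the first two reuse modes the summary lists, building a target model around a pre-trained model (PTM) and tuning only the new parts.

```python
import torch.nn as nn
from torchvision.models import resnet18

backbone = resnet18(weights="IMAGENET1K_V1")   # the pre-trained model (PTM)
for p in backbone.parameters():
    p.requires_grad = False                    # reuse its features as-is
backbone.fc = nn.Linear(512, 10)               # new task head: 10 classes
# Downstream training then updates only backbone.fc.parameters().
```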
- A Neuromorphic Architecture for Reinforcement Learning from Real-Valued Observations [0.34410212782758043]
Reinforcement Learning (RL) provides a powerful framework for decision-making in complex environments.
This paper presents a novel Spiking Neural Network (SNN) architecture for solving RL problems with real-valued observations.
arXiv Detail & Related papers (2023-07-06T12:33:34Z)
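A standard way to hand real-valued observations to a spiking network is rate coding, sketched below; the paper's specific encoder and SNN architecture may differ.

```python
import torch

def rate_encode(obs, steps=50):
    """obs in [0, 1], shape (D,) -> binary spike trains, shape (steps, D)."""
    return (torch.rand(steps, obs.shape[0]) < obs).float()

obs = torch.tensor([0.1, 0.9, 0.5])     # normalized observation
spikes = rate_encode(obs)
print(spikes.mean(dim=0))               # empirical rates ~ [0.1, 0.9, 0.5]
```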
- Reinforcement Learning for Topic Models [3.42658286826597]
We apply reinforcement learning techniques to topic modeling by replacing the variational autoencoder in ProdLDA with a continuous action space reinforcement learning policy.
We introduce several modifications: we modernize the neural network architecture, weight the ELBO loss, use contextual embeddings, and monitor the learning process by computing topic diversity and coherence.
arXiv Detail & Related papers (2023-05-08T16:41:08Z)
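Of the monitoring metrics mentioned, topic diversity is easy to make concrete: the fraction of unique words among the top-k words of all topics. The k below is a conventional choice, not necessarily the paper's.

```python
def topic_diversity(topics, k=25):
    """`topics`: list of word lists, each sorted by topic-word weight."""
    top_words = [w for topic in topics for w in topic[:k]]
    return len(set(top_words)) / len(top_words)

topics = [["game", "team", "score"], ["market", "stock", "score"]]
print(topic_diversity(topics, k=3))     # 5 unique / 6 total = 0.833...
```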
- RLFlow: Optimising Neural Network Subgraph Transformation with World Models [0.0]
We propose a model-based agent which learns to optimise the architecture of neural networks by performing a sequence of subgraph transformations to reduce model runtime.
We show our approach can match the performance of the state of the art on common convolutional networks and outperform it by up to 5% on transformer-style architectures.
arXiv Detail & Related papers (2022-05-03T11:52:54Z)
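A toy version of the search loop such an agent performs: score candidate subgraph rewrites with a learned runtime predictor (the "world model") and keep the best. Everything below, the graph object, the rewrites, and the predictor, is a placeholder.

```python
def optimise(graph, rewrites, predicted_runtime, steps=10):
    """Greedy world-model-guided search over subgraph rewrites."""
    for _ in range(steps):
        candidates = [rewrite(graph) for rewrite in rewrites]
        best = min(candidates, key=predicted_runtime)
        if predicted_runtime(best) >= predicted_runtime(graph):
            break                        # no rewrite is predicted to help
        graph = best                     # commit the most promising rewrite
    return graph

# Toy usage: "graphs" are ints, rewrites tweak them, runtime is the value.
print(optimise(10, [lambda g: g - 1, lambda g: g + 2], lambda g: g))  # -> 0
```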
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
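BAIT builds on per-example Fisher (gradient) embeddings. For a softmax classifier these can be formed from the last-layer gradient, as sketched below; the batch-selection objective on top of them is omitted, and all shapes are assumed.

```python
import torch

def grad_embedding(features, logits):
    """features: (B, d) penultimate activations; logits: (B, C)."""
    probs = logits.softmax(dim=-1)                       # (B, C)
    onehot = torch.nn.functional.one_hot(
        probs.argmax(dim=-1), logits.shape[-1]).float()  # predicted labels
    g = (probs - onehot).unsqueeze(-1) * features.unsqueeze(1)  # (B, C, d)
    return g.flatten(1)                                  # (B, C*d) embeddings

emb = grad_embedding(torch.randn(4, 16), torch.randn(4, 3))
print(emb.shape)                                         # torch.Size([4, 48])
```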
- A Design Space Study for LISTA and Beyond [79.76740811464597]
In recent years, great success has been witnessed in building problem-specific deep networks from unrolling iterative algorithms.
This paper revisits the role of unrolling as a design approach for deep networks, asking to what extent the resulting special architectures are superior and whether we can find better ones.
Using LISTA for sparse recovery as a representative example, we conduct the first thorough design space study for the unrolled models.
arXiv Detail & Related papers (2021-04-08T23:01:52Z)
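For context, one unrolled LISTA iteration for sparse recovery takes the standard form x_{k+1} = soft_threshold(W1 y + W2 x_k, theta), with W1, W2, and theta learned per layer. A sketch of that cell:

```python
import torch
import torch.nn as nn

class LISTACell(nn.Module):
    """One unrolled LISTA iteration for sparse recovery from y ~ A x."""
    def __init__(self, m, n):
        super().__init__()
        self.W1 = nn.Linear(m, n, bias=False)   # plays the role of A^T / L
        self.W2 = nn.Linear(n, n, bias=False)   # plays I - A^T A / L
        self.theta = nn.Parameter(torch.tensor(0.1))

    def forward(self, y, x):
        z = self.W1(y) + self.W2(x)
        return torch.sign(z) * torch.clamp(z.abs() - self.theta, min=0)

cell = LISTACell(m=20, n=50)
y, x0 = torch.randn(8, 20), torch.zeros(8, 50)
x1 = cell(y, x0)                                # one unrolled iteration
```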
- Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models [40.08137765886609]
We show that our model, called a graph structured surrogate model (GSSM), outperforms state-of-the-art methods in predicting environment dynamics.
Our approach is able to obtain high returns, while allowing fast execution during deployment by avoiding test time policy gradient optimization.
arXiv Detail & Related papers (2021-02-16T17:21:55Z)
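GSSM itself is graph-structured; the snippet below only shows the generic surrogate-dynamics interface such a model fills, predicting the next state from state and action so that planning can avoid the real environment. A plain MLP stands in for the graph model, and all sizes are arbitrary.

```python
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """Learned surrogate for environment dynamics: (s, a) -> s'."""
    def __init__(self, s_dim=8, a_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(s_dim + a_dim, 64), nn.ReLU(), nn.Linear(64, s_dim))
    def forward(self, s, a):
        return s + self.net(torch.cat([s, a], dim=-1))  # predict a state delta

model = DynamicsModel()
s, a = torch.randn(1, 8), torch.randn(1, 2)
s_next = model(s, a)      # imagined transition, usable for planning
```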