Parallel bootstrap-based on-policy deep reinforcement learning for
continuous flow control applications
- URL: http://arxiv.org/abs/2304.12330v3
- Date: Thu, 13 Jul 2023 12:33:44 GMT
- Title: Parallel bootstrap-based on-policy deep reinforcement learning for
continuous flow control applications
- Authors: J. Viquerat and E. Hachem
- Abstract summary: Parallel environments during the learning process represent an essential ingredient to attain efficient control in a reasonable time.
We propose a parallelism pattern relying on partial-trajectory buffers terminated by a return bootstrapping step.
This approach is illustrated on a CPU-intensive continuous flow control problem from the literature.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The coupling of deep reinforcement learning to numerical flow control
problems has recently received considerable attention, leading to
groundbreaking results and opening new perspectives for the domain. Due to the
usually high computational cost of fluid dynamics solvers, the use of parallel
environments during the learning process represents an essential ingredient to
attain efficient control in a reasonable time. Yet, most of the deep
reinforcement learning literature for flow control relies on on-policy
algorithms, for which the massively parallel transition collection may break
theoretical assumptions and lead to suboptimal control models. To overcome this
issue, we propose a parallelism pattern relying on partial-trajectory buffers
terminated by a return bootstrapping step, allowing a flexible use of parallel
environments while preserving the on-policiness of the updates. This approach
is illustrated on a CPU-intensive continuous flow control problem from the
literature.
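To make the return bootstrapping step concrete, here is a minimal sketch (assuming a scalar-reward buffer and an external critic estimate `last_value`); it illustrates the general idea, not the authors' implementation:

```python
import numpy as np

def bootstrapped_returns(rewards, last_value, gamma=0.99):
    """Discounted returns for a partial trajectory whose unfinished tail is
    approximated by a critic estimate V(s_T) (the bootstrapping step)."""
    returns = np.zeros(len(rewards))
    running = last_value           # stands in for the unseen rest of the episode
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# Each parallel environment fills a short buffer; the still-running episode is
# closed off with the critic bootstrap so the update can proceed on-policy.
rewards = [0.1, -0.2, 0.3, 0.0, 0.5]   # partial trajectory collected so far
print(bootstrapped_returns(rewards, last_value=1.7))
```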
Related papers
- Differentiable Discrete Event Simulation for Queuing Network Control [7.965453961211742]
Queueing network control poses distinct challenges, including high stochasticity, large state and action spaces, and lack of stability.
We propose a scalable framework for policy optimization based on differentiable discrete event simulation.
Our methods can flexibly handle realistic scenarios, including systems operating in non-stationary environments.
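As a toy stand-in for differentiable discrete event simulation (not the paper's framework), the sketch below pushes pathwise gradients through the Lindley recursion of a single-server queue to differentiate the mean waiting time with respect to an assumed service rate `mu`:

```python
import torch

torch.manual_seed(0)
mu = torch.tensor(1.5, requires_grad=True)   # service rate (decision variable)

# Reparameterized exponential draws keep the sample path differentiable in mu.
s = -torch.log(torch.rand(2000).clamp_min(1e-12)) / mu  # service times ~ Exp(mu)
a = -torch.log(torch.rand(2000).clamp_min(1e-12))       # interarrivals ~ Exp(1)

w, waits = torch.tensor(0.0), []
for k in range(2000):
    w = torch.relu(w + s[k] - a[k])          # Lindley recursion, a.e. differentiable
    waits.append(w)

mean_wait = torch.stack(waits).mean()
mean_wait.backward()                         # pathwise gradient d mean-wait / d mu
print(f"mean wait {mean_wait.item():.3f}, grad wrt mu {mu.grad.item():.3f}")
```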
arXiv Detail & Related papers (2024-09-05T17:53:54Z)
- Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution [51.83951489847344]
In robotics applications, smooth control signals are commonly preferred to reduce system wear and improve energy efficiency.
In this work, we aim to bridge this performance gap by growing discrete action spaces from coarse to fine control resolution.
Our work indicates that an adaptive control resolution in combination with value decomposition yields simple critic-only algorithms with surprisingly strong performance on continuous control tasks.
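A minimal sketch of the coarse-to-fine idea, assuming per-dimension action grids and random stand-in Q-values; the decoupled argmax shown here is what keeps growing resolutions tractable:

```python
import numpy as np

rng = np.random.default_rng(0)
dims = 3                                    # action dimensions

# Coarse-to-fine schedule: bang-bang control first, finer resolution later.
for n_atoms in (2, 3, 5, 9):
    grids = [np.linspace(-1.0, 1.0, n_atoms) for _ in range(dims)]
    # With a value-decomposed (decoupled) critic, each dimension is argmaxed
    # independently: the greedy step costs n_atoms * dims, not n_atoms ** dims.
    q_per_dim = [rng.normal(size=n_atoms) for _ in range(dims)]  # stand-in Q-values
    greedy = [g[np.argmax(q)] for g, q in zip(grids, q_per_dim)]
    print(f"{n_atoms} atoms/dim -> greedy action {np.round(greedy, 2)}")
```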
arXiv Detail & Related papers (2024-04-05T17:58:37Z) - Model-based deep reinforcement learning for accelerated learning from flow simulations [0.0]
We demonstrate the benefits of model-based reinforcement learning for flow control applications.
Specifically, we optimize the policy by alternating between trajectories sampled from flow simulations and trajectories sampled from an ensemble of environment models.
The model-based learning reduces the overall training time by up to 85% for the fluidic pinball test case.
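The alternation pattern could look roughly like the sketch below; `cfd_step`, the linear surrogate ensemble, and the fixed toy controller are illustrative placeholders for the flow solver, the learned environment models, and the policy update:

```python
import numpy as np

rng = np.random.default_rng(0)

def cfd_step(x, u):
    """Stand-in for one expensive flow-solver step (1-D toy dynamics)."""
    return 0.9 * x + 0.5 * u + 0.01 * rng.normal()

class LinearModel:
    """One member of the surrogate environment-model ensemble."""
    def __init__(self):
        self.theta = 0.1 * rng.normal(size=2)
    def fit(self, X, y):
        self.theta, *_ = np.linalg.lstsq(np.asarray(X), np.asarray(y), rcond=None)
    def step(self, x, u):
        return float(self.theta @ [x, u])

ensemble = [LinearModel() for _ in range(3)]
X, y = [], []
x = 1.0
for it in range(30):
    use_sim = it % 5 == 0                  # alternate: real simulation vs. model
    model = ensemble[rng.integers(len(ensemble))]
    for _ in range(10):
        u = -0.5 * x                       # fixed toy controller for brevity
        x_next = cfd_step(x, u) if use_sim else model.step(x, u)
        if use_sim:
            X.append([x, u]); y.append(x_next)
        x = x_next
    if use_sim:
        for m in ensemble:                 # refit the ensemble on simulator data
            m.fit(X, y)
print(f"final state {x:.4f}")
```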
arXiv Detail & Related papers (2024-02-26T13:01:45Z) - FlowPG: Action-constrained Policy Gradient with Normalizing Flows [14.98383953401637]
Action-constrained reinforcement learning (ACRL) is a popular approach for solving safety-critical and resource-allocation related decision-making problems.
A major challenge in ACRL is to ensure that the agent takes a valid action satisfying the constraints at each step.
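For the special case of a box constraint, a fixed invertible squashing map already guarantees validity; the sketch below uses it as a trivial stand-in for the learned normalizing flows that handle general constraint sets:

```python
import numpy as np

def to_valid(z, low, high):
    """Invertible squashing map R^d -> (low, high)^d: a trivial stand-in for a
    learned normalizing flow whose image is the valid action set."""
    return low + (high - low) * 0.5 * (np.tanh(z) + 1.0)

def to_latent(a, low, high):
    """Inverse map, so densities can be pushed through a change of variables."""
    s = 2.0 * (a - low) / (high - low) - 1.0
    return np.arctanh(np.clip(s, -1 + 1e-9, 1 - 1e-9))

z = np.random.default_rng(0).normal(size=4)   # unconstrained policy output
a = to_valid(z, low=-0.2, high=0.2)           # always inside the box constraint
print(a, np.allclose(z, to_latent(a, -0.2, 0.2)))
```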
arXiv Detail & Related papers (2024-02-07T11:11:46Z) - Diffusion Generative Flow Samplers: Improving learning signals through
partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
arXiv Detail & Related papers (2023-10-04T09:39:05Z) - AccFlow: Backward Accumulation for Long-Range Optical Flow [70.4251045372285]
This paper proposes a novel recurrent framework called AccFlow for long-range optical flow estimation.
We demonstrate the superiority of backward accumulation over conventional forward accumulation.
Experiments validate the effectiveness of AccFlow in handling long-range optical flow estimation.
arXiv Detail & Related papers (2023-08-25T01:51:26Z) - Deep Q-network using reservoir computing with multi-layered readout [0.0]
Recurrent neural network (RNN) based reinforcement learning (RL) is used for learning context-dependent tasks.
An approach introducing reservoir computing with replay memory has been proposed, which trains an agent without backpropagation through time (BPTT).
This paper shows that the performance of this method improves by using a multi-layered neural network for the readout layer.
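A minimal echo-state-network sketch of this setup, assuming a fixed random reservoir and a hypothetical two-layer readout; only the readout weights would be trained, which is why no BPTT is needed:

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_res, n_act = 4, 200, 2

# Fixed random reservoir: these weights are never trained.
W_in = rng.normal(scale=0.5, size=(n_res, n_in))
W_res = rng.normal(size=(n_res, n_res))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))  # spectral radius < 1

# Multi-layered readout: the only trainable part, here a 2-layer MLP.
W1 = rng.normal(scale=0.1, size=(64, n_res)); b1 = np.zeros(64)
W2 = rng.normal(scale=0.1, size=(n_act, 64)); b2 = np.zeros(n_act)

def reservoir_step(state, obs):
    return np.tanh(W_res @ state + W_in @ obs)

def q_values(state):
    h = np.tanh(W1 @ state + b1)           # the extra hidden layer is the point
    return W2 @ h + b2

state = np.zeros(n_res)
for t in range(5):                          # roll an observation sequence through
    state = reservoir_step(state, rng.normal(size=n_in))
print("Q(s, .) =", q_values(state))         # fit W1, b1, W2, b2 to DQN targets
```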
arXiv Detail & Related papers (2022-03-03T00:32:55Z) - Comparative analysis of machine learning methods for active flow control [60.53767050487434]
Genetic Programming (GP) and Reinforcement Learning (RL) are gaining popularity in flow control.
This work presents a comparative analysis of the two, benchmarking some of their most representative algorithms against global optimization techniques.
arXiv Detail & Related papers (2022-02-23T18:11:19Z) - Deep Learning Approximation of Diffeomorphisms via Linear-Control
Systems [91.3755431537592]
We consider a control system of the form $\dot{x} = \sum_{i=1}^{l} F_i(x)\,u_i$, with linear dependence in the controls.
We use the corresponding flow to approximate the action of a diffeomorphism on a compact ensemble of points.
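A minimal sketch of such a flow, approximated by explicit Euler steps on an ensemble of points; the two control fields and the piecewise-constant controls are illustrative choices:

```python
import numpy as np

def flow(points, fields, controls, dt=0.01, steps=100):
    """Explicit Euler integration of dx/dt = sum_i F_i(x) u_i(t),
    applied to a whole ensemble of points at once."""
    x = points.copy()
    for k in range(steps):
        v = sum(u[k] * F(x) for F, u in zip(fields, controls))
        x = x + dt * v
    return x

# Two toy control fields in the plane: a rotation and a constant drift.
F1 = lambda x: np.stack([-x[:, 1], x[:, 0]], axis=1)   # rotational field
F2 = lambda x: np.tile([1.0, 0.0], (len(x), 1))        # constant drift
u1 = np.ones(100); u2 = 0.5 * np.ones(100)             # piecewise-constant controls

ensemble = np.random.default_rng(0).normal(size=(50, 2))  # compact set of points
print(flow(ensemble, [F1, F2], [u1, u2])[:3])             # images under the flow
```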
arXiv Detail & Related papers (2021-10-24T08:57:46Z) - An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d).
This nested system of two flows provides stability and effectiveness of training and provably solves the gradient vanishing-explosion problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z)
- Single-step deep reinforcement learning for open-loop control of laminar and turbulent flows [0.0]
This research gauges the ability of deep reinforcement learning (DRL) techniques to assist the optimization and control of fluid mechanical systems.
It combines a novel, "degenerate" version of the proximal policy optimization (PPO) algorithm that trains a neural network to optimize the system only once per learning episode.
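A stripped-down stand-in for the single-update-per-episode idea (a plain score-function update of a Gaussian open-loop controller, not the authors' degenerate PPO; the quadratic `reward` is a placeholder for one CFD evaluation):

```python
import numpy as np

rng = np.random.default_rng(2)

# Degenerate one-step setting: the "state" is a constant dummy input, and the
# policy is a Gaussian over open-loop control values.
mean, std = np.zeros(2), 0.5

def reward(a):
    """Stand-in for one CFD evaluation of a candidate control."""
    return -np.sum((a - np.array([0.3, -0.7])) ** 2)

lr = 0.05
for epoch in range(200):
    actions = mean + std * rng.normal(size=(16, 2))   # 16 one-step episodes
    r = np.array([reward(a) for a in actions])
    adv = (r - r.mean()) / (r.std() + 1e-8)           # normalized advantages
    # One score-function update per batch of one-step episodes.
    mean += lr * np.mean(adv[:, None] * (actions - mean) / std ** 2, axis=0)
print("learned control:", mean.round(2), "(target [0.3, -0.7])")
```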
arXiv Detail & Related papers (2020-06-04T16:11:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.