Multi-Timescale Control and Communications with Deep Reinforcement
Learning -- Part I: Communication-Aware Vehicle Control
- URL: http://arxiv.org/abs/2311.11281v1
- Date: Sun, 19 Nov 2023 09:51:58 GMT
- Title: Multi-Timescale Control and Communications with Deep Reinforcement
Learning -- Part I: Communication-Aware Vehicle Control
- Authors: Tong Liu, Lei Lei, Kan Zheng, Xuemin (Sherman) Shen
- Abstract summary: We propose a joint optimization framework of multi-timescale control and communications based on Deep Reinforcement Learning (DRL).
In this paper (Part I), we first decompose the problem into a communication-aware DRL-based PC sub-problem and a control-aware DRL-based RRA sub-problem.
To improve the PC performance under random observation delay, the PC state space is augmented with the observation delay and PC action history.
It is proved that the optimal policy for the augmented state MDP is optimal for the original PC problem with observation delay.
- Score: 15.390800228536536
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An intelligent decision-making system enabled by Vehicle-to-Everything (V2X)
communications is essential to achieve safe and efficient autonomous driving
(AD), where two types of decisions have to be made at different timescales,
i.e., vehicle control and radio resource allocation (RRA) decisions. The
interplay between RRA and vehicle control necessitates their collaborative
design. In this two-part paper (Part I and Part II), taking platoon control
(PC) as an example use case, we propose a joint optimization framework of
multi-timescale control and communications (MTCC) based on Deep Reinforcement
Learning (DRL). In this paper (Part I), we first decompose the problem into a
communication-aware DRL-based PC sub-problem and a control-aware DRL-based RRA
sub-problem. Then, we focus on the PC sub-problem assuming an RRA policy is
given, and propose the MTCC-PC algorithm to learn an efficient PC policy. To
improve the PC performance under random observation delay, the PC state space
is augmented with the observation delay and PC action history. Moreover, the
reward function with respect to the augmented state is defined to construct an
augmented state Markov Decision Process (MDP). It is proved that the optimal
policy for the augmented state MDP is optimal for the original PC problem with
observation delay. Different from most existing works on communication-aware
control, the MTCC-PC algorithm is trained in a delayed environment generated by
the fine-grained embedded simulation of C-V2X communications rather than by a
simple stochastic delay model. Finally, experiments are performed to compare
the performance of MTCC-PC with those of the baseline DRL algorithms.
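The abstract's core idea, restoring the Markov property under random observation delay by augmenting the state with the delay value and the action history, can be sketched as follows. This is an illustrative sketch only; the class and variable names are assumptions, not the authors' implementation.

```python
from collections import deque
import numpy as np

class DelayAugmentedState:
    """Builds an augmented PC state of the form
    (delayed observation, observation delay, recent action history).
    Hypothetical sketch of the state augmentation described in the
    abstract, not the MTCC-PC code."""

    def __init__(self, max_delay, action_dim):
        # Buffer of the last `max_delay` control actions; the actions
        # taken since the delayed observation was generated are what the
        # augmented state must carry to remain Markovian.
        self.action_history = deque([np.zeros(action_dim)] * max_delay,
                                    maxlen=max_delay)

    def augment(self, delayed_obs, delay, last_action):
        self.action_history.append(np.asarray(last_action, dtype=float))
        return np.concatenate([np.asarray(delayed_obs, dtype=float),
                               np.array([float(delay)]),
                               np.concatenate(list(self.action_history))])
```

A DRL agent would then be trained on this augmented vector instead of the raw (delayed) observation, which is what allows the optimality result for the augmented-state MDP to carry over.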
Related papers
- Event-Triggered Reinforcement Learning Based Joint Resource Allocation for Ultra-Reliable Low-Latency V2X Communications [10.914558012458425]
6G-enabled vehicular networks face the challenge of ensuring ultra-reliable low-latency communication (URLLC) for delivering safety-critical information in a timely manner.
Traditional resource allocation schemes for vehicle-to-everything (V2X) communication systems rely on decoding-based algorithms.
arXiv Detail & Related papers (2024-07-18T23:55:07Z)
- Deployable Reinforcement Learning with Variable Control Rate [14.838483990647697]
We propose a variant of Reinforcement Learning (RL) with variable control rate.
In this approach, the policy decides the action the agent should take as well as the duration of the time step associated with that action.
We show the efficacy of the proposed approach, SEAC, through a proof-of-concept simulation driving an agent with Newtonian kinematics.
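The variable-control-rate idea, a policy choosing both an action and how long to hold it, can be sketched as a rollout loop. The `policy` and `env_step` interfaces here are hypothetical, not the SEAC implementation.

```python
def variable_rate_rollout(policy, env_step, state, horizon):
    """Roll out a policy that outputs both an action and the duration
    to hold it, illustrating variable-control-rate RL. Hypothetical
    interfaces; not the paper's code."""
    t, trajectory = 0.0, []
    while t < horizon:
        action, hold_time = policy(state)            # agent picks the action AND the time-step length
        state = env_step(state, action, hold_time)   # environment advances by hold_time seconds
        trajectory.append((t, action, hold_time))
        t += hold_time
    return trajectory
```

Because the agent controls `hold_time`, it can act infrequently when the dynamics are benign and more often when they are not, reducing compute without a fixed control rate.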
arXiv Detail & Related papers (2024-01-17T15:40:11Z)
- Multi-Timescale Control and Communications with Deep Reinforcement Learning -- Part II: Control-Aware Radio Resource Allocation [15.390800228536536]
We decomposed the multi-timescale control and communications problem in a C-V2X system.
We proposed the MTCC-PC algorithm to learn an optimal PC policy given an RRA policy.
In this paper (Part II), we first focus on the RRA sub-problem in MTCC assuming a PC policy is given, and propose the MTCC-RRA algorithm to learn the RRA policy.
arXiv Detail & Related papers (2023-11-19T09:50:21Z)
- Learning to Sail Dynamic Networks: The MARLIN Reinforcement Learning Framework for Congestion Control in Tactical Environments [53.08686495706487]
This paper proposes an RL framework that leverages an accurate and parallelizable emulation environment to reenact the conditions of a tactical network.
We evaluate our RL learning framework by training a MARLIN agent in conditions replicating a bottleneck link transition between a Satellite Communication (SATCOM) and a UHF Wide Band radio link.
arXiv Detail & Related papers (2023-06-27T16:15:15Z)
- Distributed-Training-and-Execution Multi-Agent Reinforcement Learning for Power Control in HetNet [48.96004919910818]
We propose a multi-agent deep reinforcement learning (MADRL) based power control scheme for the HetNet.
To promote cooperation among agents, we develop a penalty-based Q learning (PQL) algorithm for MADRL systems.
In this way, an agent's policy can be learned by other agents more easily, resulting in a more efficient collaboration process.
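The penalty idea can be illustrated with a tabular Q-learning step in which a penalty is subtracted from the reward. This is a generic sketch of the penalty-based Q-learning (PQL) concept; the paper's actual penalty design for promoting inter-agent cooperation is not reproduced here.

```python
import numpy as np

def pql_update(Q, s, a, r, s_next, penalty, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step with a penalty subtracted from the
    reward, a generic illustration of penalty-based Q-learning (PQL);
    the paper's exact penalty term is an assumption not shown here."""
    td_target = (r - penalty) + gamma * np.max(Q[s_next])  # penalized TD target
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```

The penalty shapes each agent's value estimates so that its resulting policy is easier for the other agents to anticipate and learn around.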
arXiv Detail & Related papers (2022-12-15T17:01:56Z)
- Fair and Efficient Distributed Edge Learning with Hybrid Multipath TCP [62.81300791178381]
The bottleneck of distributed edge learning (DEL) over wireless networks has shifted from computing to communication.
Existing TCP-based data networking schemes for DEL are application-agnostic and fail to deliver adjustments according to application-layer requirements.
We develop a hybrid multipath TCP (MP TCP) scheme by combining model-based and deep reinforcement learning (DRL) based MP TCP for DEL.
arXiv Detail & Related papers (2022-11-03T09:08:30Z)
- Development of a CAV-based Intersection Control System and Corridor Level Impact Assessment [0.696125353550498]
This paper presents a signal-free intersection control system for CAVs that combines a pixel reservation algorithm with a Deep Reinforcement Learning (DRL) decision-making logic.
The proposed model reduces delay by 50%, 29%, and 23% in moderate, high, and extreme volume regimes, respectively, compared to another CAV-based control system.
arXiv Detail & Related papers (2022-08-21T21:56:20Z)
- Deep Reinforcement Learning Aided Platoon Control Relying on V2X Information [78.18186960475974]
The impact of Vehicle-to-Everything (V2X) communications on platoon control performance is investigated.
Our objective is to find the specific set of information that should be shared among the vehicles for the construction of the most appropriate state space.
More meritorious information is given higher priority in transmission, since including it in the state space is more likely to offset the negative effect of the higher state dimension.
arXiv Detail & Related papers (2022-03-28T02:11:54Z)
- AI-aided Traffic Control Scheme for M2M Communications in the Internet of Vehicles [61.21359293642559]
The dynamics of traffic and the heterogeneous requirements of different IoV applications are not considered in most existing studies.
We consider a hybrid traffic control scheme and use the proximal policy optimization (PPO) method to tackle it.
arXiv Detail & Related papers (2022-03-05T10:54:05Z)
- Path Design and Resource Management for NOMA enhanced Indoor Intelligent Robots [58.980293789967575]
A communication-enabled indoor intelligent robot (IR) service framework is proposed.
A Lego modeling method is proposed that can deterministically describe the indoor layout and channel state.
The investigated radio map is invoked as a virtual environment to train the reinforcement learning agent.
arXiv Detail & Related papers (2020-11-23T21:45:01Z)
- Combining Reinforcement Learning with Model Predictive Control for On-Ramp Merging [10.480121529429631]
Two broad classes of techniques have been proposed to solve motion planning problems in autonomous driving: Model Predictive Control (MPC) and Reinforcement Learning (RL).
We first establish the strengths and weaknesses of state-of-the-art MPC and RL-based techniques through simulations.
We subsequently present an algorithm which blends the model-free RL agent with the MPC solution and show that it provides better trade-offs between all metrics -- passenger comfort, efficiency, crash rate and robustness.
arXiv Detail & Related papers (2020-11-17T07:42:11Z)
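One simple way to blend a model-free RL agent with an MPC solution is a convex combination of their actions. This is a hypothetical sketch of the blending idea; the paper's exact blending rule may differ.

```python
import numpy as np

def blended_action(a_rl, a_mpc, weight):
    """Convex combination of an RL action and an MPC action, a
    hypothetical illustration of an RL/MPC hybrid controller."""
    w = float(np.clip(weight, 0.0, 1.0))  # keep the blend weight in [0, 1]
    return w * np.asarray(a_rl, dtype=float) + (1.0 - w) * np.asarray(a_mpc, dtype=float)
```

Sliding `weight` toward the MPC action trades some of the RL agent's learned efficiency for the model-based controller's safety guarantees, which is the kind of comfort/efficiency/crash-rate trade-off the paper evaluates.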
This list is automatically generated from the titles and abstracts of the papers in this site.