Scalable Multi-modal Model Predictive Control via Duality-based Interaction Predictions
- URL: http://arxiv.org/abs/2402.01116v4
- Date: Sun, 2 Jun 2024 23:01:08 GMT
- Title: Scalable Multi-modal Model Predictive Control via Duality-based Interaction Predictions
- Authors: Hansung Kim, Siddharth H. Nair, Francesco Borrelli
- Abstract summary: RAID-Net is a novel attention-based Recurrent Neural Network that predicts relevant interactions along the Model Predictive Control (MPC) prediction horizon.
Our approach is demonstrated in a simulated traffic intersection with interactive surrounding vehicles, showcasing a 12x speed-up in solving the motion planning problem.
- Score: 8.256630421682951
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a hierarchical architecture designed for scalable real-time Model Predictive Control (MPC) in complex, multi-modal traffic scenarios. This architecture comprises two key components: 1) RAID-Net, a novel attention-based Recurrent Neural Network that predicts relevant interactions along the MPC prediction horizon between the autonomous vehicle and the surrounding vehicles using Lagrangian duality, and 2) a reduced Stochastic MPC problem that eliminates irrelevant collision avoidance constraints, enhancing computational efficiency. Our approach is demonstrated in a simulated traffic intersection with interactive surrounding vehicles, showcasing a 12x speed-up in solving the motion planning problem. A video demonstrating the proposed architecture in multiple complex traffic scenarios can be found here: https://youtu.be/-pRiOnPb9_c. GitHub: https://github.com/MPC-Berkeley/hmpc_raidnet
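The constraint-elimination idea in the abstract can be illustrated with a toy sketch. This is not the paper's formulation: it assumes a hypothetical 1-D problem, minimize (x - x_ref)^2 subject to x <= b_i, where each bound b_i stands in for one collision avoidance constraint. The dual multipliers are computed exactly from the KKT conditions here, whereas the abstract describes RAID-Net as predicting the relevant interactions. The function names `solve_bounded` and `screen` are made up for illustration.

```python
# Toy illustration only (assumed problem, not the paper's Stochastic MPC):
#   minimize (x - x_ref)^2  subject to  x <= b_i  for every bound b_i.
# Each bound is a hypothetical stand-in for one collision avoidance constraint.

def solve_bounded(x_ref, bounds):
    """Return (x_opt, duals) for the 1-D problem above via its KKT conditions."""
    duals = [0.0] * len(bounds)
    if not bounds:
        return x_ref, duals
    tightest = min(range(len(bounds)), key=lambda i: bounds[i])
    if x_ref <= bounds[tightest]:
        # No constraint is active, so every multiplier is zero.
        return x_ref, duals
    x_opt = bounds[tightest]
    # Stationarity: 2 * (x_opt - x_ref) + sum(duals) = 0
    duals[tightest] = 2.0 * (x_ref - x_opt)
    return x_opt, duals


def screen(duals, tol=1e-9):
    """Indices of constraints with a nonzero multiplier (the relevant ones)."""
    return [i for i, lam in enumerate(duals) if lam > tol]


bounds = [5.0, 3.0, 8.0, 12.0]
x_full, duals = solve_bounded(4.0, bounds)

# Keep only constraints with nonzero duals and re-solve the reduced problem.
keep = screen(duals)
x_reduced, _ = solve_bounded(4.0, [bounds[i] for i in keep])

print(x_full, duals, keep, x_reduced)
```

In this sketch, dropping the constraints whose multipliers are zero leaves the optimizer unchanged, which is the property that makes a reduced problem attractive when the zero multipliers can be predicted before solving.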
Related papers
- Towards Intelligent Transportation with Pedestrians and Vehicles In-the-Loop: A Surveillance Video-Assisted Federated Digital Twin Framework [62.47416496137193]
We propose a surveillance video assisted federated digital twin (SV-FDT) framework to empower ITSs with pedestrians and vehicles in-the-loop.
The architecture consists of three layers: (i) the end layer, which collects traffic surveillance videos from multiple sources; (ii) the edge layer, responsible for semantic segmentation-based visual understanding, twin agent-based interaction modeling, and local digital twin system (LDTS) creation in local regions; and (iii) the cloud layer, which integrates LDTSs across different regions to construct a global DT model in realtime.
arXiv Detail & Related papers (2025-03-06T07:36:06Z)
- A Coalition Game for On-demand Multi-modal 3D Automated Delivery System [4.378407481656902]
We introduce a multi-modal autonomous delivery optimization framework as a coalition game for a fleet of UAVs and ADRs operating in two overlaying networks.
The framework addresses last-mile delivery in urban environments, including high-density areas, road-based routing, and real-world operational challenges.
arXiv Detail & Related papers (2024-12-23T03:50:29Z)
- End-to-end Driving in High-Interaction Traffic Scenarios with Reinforcement Learning [24.578178308010912]
We propose Ramble, an end-to-end model-based RL algorithm for driving in high-interaction traffic scenarios.
By learning a dynamics model of the environment, Ramble can foresee upcoming traffic events and make more informed, strategic decisions.
Ramble achieves state-of-the-art performance regarding route completion rate and driving score on the CARLA Leaderboard 2.0, showcasing its effectiveness in managing complex and dynamic traffic situations.
arXiv Detail & Related papers (2024-10-03T06:45:59Z)
- DeepInteraction++: Multi-Modality Interaction for Autonomous Driving [80.8837864849534]
We introduce a novel modality interaction strategy that allows individual per-modality representations to be learned and maintained throughout.
DeepInteraction++ is a multi-modal interaction framework characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder.
Experiments demonstrate the superior performance of the proposed framework on both 3D object detection and end-to-end autonomous driving tasks.
arXiv Detail & Related papers (2024-08-09T14:04:21Z)
- Generalized Multi-Objective Reinforcement Learning with Envelope Updates in URLLC-enabled Vehicular Networks [12.323383132739195]
We develop a novel multi-objective reinforcement learning framework to jointly optimize wireless network selection and autonomous driving policies.
The proposed framework is designed to maximize the traffic flow and minimize collisions by controlling the vehicle's motion dynamics.
The proposed policies enable autonomous vehicles to adopt safe driving behaviors with improved connectivity.
arXiv Detail & Related papers (2024-05-18T16:31:32Z)
- AccidentBlip: Agent of Accident Warning based on MA-former [24.81148840857782]
AccidentBlip is a vision-only framework that employs our self-designed Motion Accident Transformer (MA-former) to process each frame of video.
AccidentBlip achieves strong performance in both accident detection and prediction tasks on the DeepAccident dataset.
It also outperforms current SOTA methods in V2V and V2X scenarios, demonstrating a superior capability to understand complex real-world environments.
arXiv Detail & Related papers (2024-04-18T12:54:25Z)
- Pixel State Value Network for Combined Prediction and Planning in Interactive Environments [9.117828575880303]
This work proposes a deep learning methodology to combine prediction and planning.
A conditional GAN with the U-Net architecture is trained to predict two high-resolution image sequences.
Results demonstrate intuitive behavior in complex situations, such as lane changes amidst conflicting objectives.
arXiv Detail & Related papers (2023-10-11T17:57:13Z)
- Deep Interactive Motion Prediction and Planning: Playing Games with Motion Prediction Models [162.21629604674388]
This work presents a game-theoretic Model Predictive Controller (MPC) that uses a novel interactive multi-agent neural network policy as part of its predictive model.
Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information.
arXiv Detail & Related papers (2022-04-05T17:58:18Z)
- A Driving Behavior Recognition Model with Bi-LSTM and Multi-Scale CNN [59.57221522897815]
We propose a neural network model based on trajectory information for driving behavior recognition.
We evaluate the proposed model on the public BLVD dataset, achieving satisfactory performance.
arXiv Detail & Related papers (2021-03-01T06:47:29Z)
- Spatio-Temporal Look-Ahead Trajectory Prediction using Memory Neural Network [6.065344547161387]
This paper attempts to solve the problem of Spatio-temporal look-ahead trajectory prediction using a novel recurrent neural network called the Memory Neuron Network.
The proposed model is computationally less intensive and has a simple architecture as compared to other deep learning models that utilize LSTMs and GRUs.
arXiv Detail & Related papers (2021-02-24T05:02:19Z)
- Multi-intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline [85.9210953301628]
Control of traffic signals is fundamental and critical to alleviate traffic congestion in urban areas.
Because of the high complexity of modelling the problem, experimental settings of current works are often inconsistent.
We propose a novel and strong baseline model based on deep reinforcement learning with the encoder-decoder structure.
arXiv Detail & Related papers (2021-01-24T03:55:39Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
- Towards Automated Neural Interaction Discovery for Click-Through Rate Prediction [64.03526633651218]
Click-Through Rate (CTR) prediction is one of the most important machine learning tasks in recommender systems.
We propose an automated interaction architecture discovering framework for CTR prediction named AutoCTR.
arXiv Detail & Related papers (2020-06-29T04:33:01Z)