CARNet: A Dynamic Autoencoder for Learning Latent Dynamics in Autonomous
Driving Tasks
- URL: http://arxiv.org/abs/2205.08712v1
- Date: Wed, 18 May 2022 04:15:42 GMT
- Title: CARNet: A Dynamic Autoencoder for Learning Latent Dynamics in Autonomous
Driving Tasks
- Authors: Andrey Pak, Hemanth Manjunatha, Dimitar Filev, Panagiotis Tsiotras
- Abstract summary: An autonomous driving system should effectively use the information collected from the various sensors in order to form an abstract description of the world.
Deep learning models, such as autoencoders, can be used for that purpose, as they can learn compact latent representations from a stream of incoming data.
This work proposes CARNet, a Combined dynAmic autoencodeR NETwork architecture that utilizes an autoencoder combined with a recurrent neural network to learn the current latent representation.
- Score: 11.489187712465325
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Autonomous driving has received a lot of attention in the automotive industry
and is often seen as the future of transportation. Passenger vehicles equipped
with a wide array of sensors (e.g., cameras, front-facing radars, LiDARs, and
IMUs) capable of continuous perception of the environment are becoming
increasingly prevalent. These sensors provide a stream of high-dimensional,
temporally correlated data that is essential for reliable autonomous driving.
An autonomous driving system should effectively use the information collected
from the various sensors in order to form an abstract description of the world
and maintain situational awareness. Deep learning models, such as autoencoders,
can be used for that purpose, as they can learn compact latent representations
from a stream of incoming data. However, most autoencoder models process the
data independently, without assuming any temporal interdependencies. Thus,
there is a need for deep learning models that explicitly consider the temporal
dependence of the data in their architecture. This work proposes CARNet, a
Combined dynAmic autoencodeR NETwork architecture that utilizes an autoencoder
combined with a recurrent neural network to learn the current latent
representation and, in addition, also predict future latent representations in
the context of autonomous driving. We demonstrate the efficacy of the proposed
model in both imitation and reinforcement learning settings using both
simulated and real datasets. Our results show that the proposed model
outperforms the baseline state-of-the-art model, while having significantly
fewer trainable parameters.
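The core idea described in the abstract — an autoencoder whose latent codes feed a recurrent network that both tracks the current latent state and predicts future latents — can be sketched as follows. This is a minimal illustrative sketch in NumPy, not the paper's actual architecture: the linear encoder/decoder, layer sizes, and the simple tanh recurrent cell are all assumptions standing in for the convolutional and recurrent components the paper uses.

```python
# Minimal sketch of the CARNet idea: an autoencoder whose latent codes are
# passed through a recurrent cell that also predicts the *next* latent.
# All sizes and the single-layer linear encoder/decoder are illustrative
# assumptions, not the paper's architecture.
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, LATENT_DIM = 64, 8  # hypothetical observation and latent sizes

# Linear "autoencoder" weights (stand-ins for a conv encoder/decoder).
W_enc = rng.normal(scale=0.1, size=(LATENT_DIM, OBS_DIM))
W_dec = rng.normal(scale=0.1, size=(OBS_DIM, LATENT_DIM))

# Simple recurrent cell: h_t = tanh(W_h h_{t-1} + W_z z_t),
# with a linear head predicting the next latent from the hidden state.
W_h = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM))
W_z = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM))
W_pred = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM))

def encode(x):
    return W_enc @ x

def decode(z):
    return W_dec @ z

def rollout(frames):
    """Encode each frame, update the recurrent state, and emit a
    prediction of the next frame's latent at every step."""
    h = np.zeros(LATENT_DIM)
    recons, next_latents = [], []
    for x in frames:
        z = encode(x)
        recons.append(decode(z))         # reconstruction branch
        h = np.tanh(W_h @ h + W_z @ z)   # temporal branch
        next_latents.append(W_pred @ h)  # predicted latent for t+1
    return recons, next_latents

frames = [rng.normal(size=OBS_DIM) for _ in range(5)]
recons, next_latents = rollout(frames)
```

Training such a model would combine a reconstruction loss on `recons` with a prediction loss comparing each entry of `next_latents` to the encoding of the following frame, which is how temporal dependence enters the objective.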
Related papers
- Guiding Attention in End-to-End Driving Models [49.762868784033785]
Vision-based end-to-end driving models trained by imitation learning can lead to affordable solutions for autonomous driving.
We study how to guide the attention of these models to improve their driving quality by adding a loss term during training.
In contrast to previous work, our method does not require the salient semantic maps to be available at test time.
arXiv Detail & Related papers (2024-04-30T23:18:51Z) - Trajeglish: Traffic Modeling as Next-Token Prediction [67.28197954427638]
A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs.
We apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios.
Our model tops the Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%.
arXiv Detail & Related papers (2023-12-07T18:53:27Z) - VREM-FL: Mobility-Aware Computation-Scheduling Co-Design for Vehicular Federated Learning [2.6322811557798746]
Vehicular radio environment map federated learning (VREM-FL) is proposed.
It combines mobility of vehicles with 5G radio environment maps.
VREM-FL can be tuned to trade training time for radio resource usage.
arXiv Detail & Related papers (2023-11-30T17:38:54Z) - KARNet: Kalman Filter Augmented Recurrent Neural Network for Learning
World Models in Autonomous Driving Tasks [11.489187712465325]
We present a Kalman filter augmented recurrent neural network architecture to learn the latent representation of the traffic flow using front camera images only.
Results show that incorporating an explicit model of the vehicle (states estimated using Kalman filtering) in the end-to-end learning significantly increases performance.
arXiv Detail & Related papers (2023-05-24T02:27:34Z) - Context-Aware Timewise VAEs for Real-Time Vehicle Trajectory Prediction [4.640835690336652]
We present ContextVAE, a context-aware approach for multi-modal vehicle trajectory prediction.
Our approach takes into account both the social features exhibited by agents on the scene and the physical environment constraints.
In all tested datasets, ContextVAE models are fast to train and provide high-quality multi-modal predictions in real-time.
arXiv Detail & Related papers (2023-02-21T18:42:24Z) - Generative AI-empowered Simulation for Autonomous Driving in Vehicular
Mixed Reality Metaverses [130.15554653948897]
In the vehicular mixed reality (MR) Metaverse, the distance between physical and virtual entities can be overcome.
Large-scale traffic and driving simulation via realistic data collection and fusion from the physical world is difficult and costly.
We propose an autonomous driving architecture, where generative AI is leveraged to synthesize unlimited conditioned traffic and driving data in simulations.
arXiv Detail & Related papers (2023-02-16T16:54:10Z) - COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked
Vehicles [54.61668577827041]
We introduce COOPERNAUT, an end-to-end learning model that uses cross-vehicle perception for vision-based cooperative driving.
Our experiments on AutoCastSim suggest that our cooperative perception driving models lead to a 40% improvement in average success rate.
arXiv Detail & Related papers (2022-05-04T17:55:12Z) - Predicting Take-over Time for Autonomous Driving with Real-World Data:
Robust Data Augmentation, Models, and Evaluation [11.007092387379076]
We develop and train take-over time (TOT) models that operate on mid and high-level features produced by computer vision algorithms operating on different driver-facing camera views.
We show that a TOT model supported by augmented data can be used to produce continuous estimates of take-over times without delay.
arXiv Detail & Related papers (2021-07-27T16:39:50Z) - Improving Generalization of Transfer Learning Across Domains Using
Spatio-Temporal Features in Autonomous Driving [45.655433907239804]
Vehicle simulation can be used to learn in the virtual world, and the acquired skills can be transferred to handle real-world scenarios.
Visual elements in the scene are intuitively crucial for human decision making during driving.
We propose a CNN+LSTM transfer learning framework to extract the spatio-temporal features representing vehicle dynamics from scenes.
arXiv Detail & Related papers (2021-03-15T03:26:06Z) - IntentNet: Learning to Predict Intention from Raw Sensor Data [86.74403297781039]
In this paper, we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor as well as dynamic maps of the environment.
Our multi-task model achieves better accuracy than the respective separate modules while saving computation, which is critical to reducing reaction time in self-driving applications.
arXiv Detail & Related papers (2021-01-20T00:31:52Z) - Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data
Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.