Multi-task UNet architecture for end-to-end autonomous driving
- URL: http://arxiv.org/abs/2112.08967v2
- Date: Fri, 8 Sep 2023 07:19:08 GMT
- Title: Multi-task UNet architecture for end-to-end autonomous driving
- Authors: Der-Hau Lee and Jinn-Liang Liu
- Abstract summary: We propose an end-to-end driving model that integrates a multi-task UNet (MTUNet) architecture and control algorithms in a pipeline of data flow from a front camera through this model to driving decisions.
It provides quantitative measures to evaluate the holistic, dynamic, and real-time performance of end-to-end driving systems and thus the safety and interpretability of MTUNet.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an end-to-end driving model that integrates a multi-task UNet
(MTUNet) architecture and control algorithms in a pipeline of data flow from a
front camera through this model to driving decisions. It provides quantitative
measures to evaluate the holistic, dynamic, and real-time performance of
end-to-end driving systems and thus the safety and interpretability of MTUNet.
The architecture consists of one segmentation, one regression, and two
classification tasks for lane segmentation, path prediction, and vehicle
controls. We present three variants of the architecture having different
complexities, compare them on different tasks in four static measures for both
single and multiple tasks, and then identify the best one by two additional
dynamic measures in real-time simulation. Our results show that, on curvy roads
and for the same task, the proposed supervised learning model performs
comparably to a reinforcement learning model that is modular rather than
end-to-end.
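As a rough illustration of the architecture the abstract describes, the following PyTorch sketch wires a shared UNet-style encoder to one lane-segmentation decoder, one path-regression head, and two vehicle-control classification heads. All layer sizes, head dimensions, and module names are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a multi-task UNet: a shared encoder-decoder backbone with
# one segmentation head, one path-regression head, and two classification heads.
# Module names and head dimensions are illustrative, not taken from the paper.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
    )

class MultiTaskUNet(nn.Module):
    def __init__(self, n_lane_classes=2, n_path_points=10, n_ctrl_classes=3):
        super().__init__()
        # Shared contracting path (encoder).
        self.enc1, self.enc2 = conv_block(3, 32), conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        # Expanding path (decoder) for lane segmentation.
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.seg_head = nn.Conv2d(32, n_lane_classes, 1)
        # Regression and classification heads on the shared bottleneck features.
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.path_head = nn.Linear(128, 2 * n_path_points)   # regression: (x, y) waypoints
        self.ctrl_head_a = nn.Linear(128, n_ctrl_classes)    # classification head 1
        self.ctrl_head_b = nn.Linear(128, n_ctrl_classes)    # classification head 2

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        feat = self.gap(b).flatten(1)
        return {
            "lane_seg": self.seg_head(d1),
            "path": self.path_head(feat),
            "ctrl_a": self.ctrl_head_a(feat),
            "ctrl_b": self.ctrl_head_b(feat),
        }
```

In such a setup the four outputs would typically be trained jointly with a weighted sum of a segmentation loss, a regression loss, and two classification losses.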
Related papers
- M3Net: Multimodal Multi-task Learning for 3D Detection, Segmentation, and Occupancy Prediction in Autonomous Driving [48.17490295484055]
M3Net is a novel network that simultaneously tackles detection, segmentation, and 3D occupancy prediction for autonomous driving.
M3Net achieves state-of-the-art multi-task learning performance on the nuScenes benchmarks.
arXiv Detail & Related papers (2025-03-23T15:08:09Z)
- End-to-End Predictive Planner for Autonomous Driving with Consistency Models [5.966385886363771]
Trajectory prediction and planning are fundamental components for autonomous vehicles to navigate safely and efficiently in dynamic environments.
Traditionally, these components have often been treated as separate modules, limiting the ability to perform interactive planning.
We present a novel unified and data-driven framework that integrates prediction and planning with a single consistency model.
arXiv Detail & Related papers (2025-02-12T00:26:01Z)
- SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection [73.49799596304418]
This paper introduces a new task called Multi-Modal Datasets and Multi-Task Object Detection (M2Det) for remote sensing.
It is designed to accurately detect horizontal or oriented objects from any sensor modality.
This task poses challenges due to 1) the trade-offs involved in managing multi-modal modelling and 2) the complexities of multi-task optimization.
arXiv Detail & Related papers (2024-12-30T02:47:51Z)
- StreamMOTP: Streaming and Unified Framework for Joint 3D Multi-Object Tracking and Trajectory Prediction [22.29257945966914]
We propose a streaming and unified framework for joint 3D Multi-Object Tracking and trajectory Prediction (StreamMOTP).
We construct the model in a streaming manner and exploit a memory bank to preserve and leverage the long-term latent features for tracked objects more effectively.
We also improve the quality and consistency of predicted trajectories with a dual-stream predictor.
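A minimal sketch of the memory-bank idea mentioned above, i.e., keeping a long-term latent feature per tracked object across streaming frames; the EMA update rule, eviction policy, and names here are assumptions rather than details from the paper.

```python
# Hypothetical sketch of a per-track memory bank that preserves long-term latent
# features; the EMA update and eviction policy are assumptions, not StreamMOTP's design.
from collections import OrderedDict
import torch

class TrackMemoryBank:
    def __init__(self, momentum=0.9, max_tracks=256):
        self.momentum = momentum
        self.max_tracks = max_tracks
        self.bank = OrderedDict()  # track_id -> long-term latent feature

    def update(self, track_id: int, feat: torch.Tensor):
        # Blend the new per-frame feature into the stored long-term feature.
        if track_id in self.bank:
            self.bank[track_id] = self.momentum * self.bank[track_id] + (1 - self.momentum) * feat
            self.bank.move_to_end(track_id)
        else:
            if len(self.bank) >= self.max_tracks:
                self.bank.popitem(last=False)  # evict the least recently updated track
            self.bank[track_id] = feat.clone()

    def get(self, track_id: int):
        return self.bank.get(track_id)
```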
arXiv Detail & Related papers (2024-06-28T11:35:35Z)
- MS-Net: A Multi-Path Sparse Model for Motion Prediction in Multi-Scenes [1.4451387915783602]
Multi-Scenes Network (aka MS-Net) is a multi-path sparse model trained by an evolutionary process.
MS-Net selectively activates a subset of its parameters during the inference stage to produce prediction results for each scene.
Our experiment results show that MS-Net outperforms existing state-of-the-art methods on well-established pedestrian motion prediction datasets.
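A hedged sketch of what scene-conditioned sparse activation can look like: a gate picks one expert path per sample, and only that path's parameters are evaluated at inference. The gating scheme and layer sizes are illustrative assumptions, not MS-Net's actual design.

```python
# Illustrative sketch of scene-conditioned sparse path selection: only the chosen
# expert path runs at inference, so only a subset of parameters is active.
import torch
import torch.nn as nn

class SparseMultiPathPredictor(nn.Module):
    def __init__(self, in_dim=64, hidden=128, out_dim=24, n_paths=4):
        super().__init__()
        self.gate = nn.Linear(in_dim, n_paths)  # scores one path per scene context
        self.paths = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))
            for _ in range(n_paths)
        )

    def forward(self, scene_feat: torch.Tensor) -> torch.Tensor:
        # Pick a single path per sample and evaluate only that path.
        idx = self.gate(scene_feat).argmax(dim=-1)
        return torch.stack([self.paths[int(i)](f) for i, f in zip(idx, scene_feat)])
```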
arXiv Detail & Related papers (2024-03-01T08:32:12Z)
- Trajeglish: Traffic Modeling as Next-Token Prediction [67.28197954427638]
A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs.
We apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios.
Our model tops the Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%.
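The following sketch illustrates the general recipe of treating traffic modeling as next-token prediction: agent motion is assumed to be discretized into a vocabulary of motion tokens, and a causal Transformer predicts the next token. Vocabulary size, tokenization, and architecture here are assumptions, not the paper's specification.

```python
# Rough sketch of traffic modeling as next-token prediction over discrete motion tokens.
import torch
import torch.nn as nn

class MotionTokenLM(nn.Module):
    def __init__(self, vocab_size=384, d_model=128, n_layers=4, n_heads=4, max_len=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq) integer motion tokens; returns next-token logits.
        b, t = tokens.shape
        pos = torch.arange(t, device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(pos)
        # Causal mask so each step only attends to past tokens.
        causal = torch.triu(torch.full((t, t), float("-inf"), device=tokens.device), diagonal=1)
        x = self.encoder(x, mask=causal)
        return self.head(x)
```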
arXiv Detail & Related papers (2023-12-07T18:53:27Z)
- An Empirical Study of Multimodal Model Merging [148.48412442848795]
Model merging is a technique that fuses multiple models trained on different tasks to generate a multi-task solution.
We conduct our study for a novel goal where we can merge vision, language, and cross-modal transformers of a modality-specific architecture.
We propose two metrics that assess the distance between weights to be merged and can serve as an indicator of the merging outcomes.
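As a generic stand-in for the idea of weight-distance metrics and merging, the sketch below computes a simple L2 distance between two checkpoints' shared parameters and performs naive parameter interpolation; these are not the specific metrics proposed in the paper.

```python
# Generic sketch of (1) a simple distance between two models' weights and
# (2) naive parameter averaging as a merge. Illustrative stand-ins only.
import torch

def weight_distance(state_a: dict, state_b: dict) -> float:
    # Average per-parameter L2 distance over tensors the two models share.
    keys = [k for k in state_a if k in state_b and state_a[k].shape == state_b[k].shape]
    dists = [torch.linalg.norm(state_a[k].float() - state_b[k].float()).item() for k in keys]
    return sum(dists) / max(len(dists), 1)

def average_merge(state_a: dict, state_b: dict, alpha: float = 0.5) -> dict:
    # Interpolate shared parameters; keep parameters unique to model A unchanged.
    merged = {k: v.clone() for k, v in state_a.items()}
    for k in merged:
        if k in state_b and state_b[k].shape == merged[k].shape:
            merged[k] = alpha * merged[k].float() + (1 - alpha) * state_b[k].float()
    return merged
```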
arXiv Detail & Related papers (2023-04-28T15:43:21Z)
- Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving [100.3848723827869]
We present an effective multi-task framework, VE-Prompt, which introduces visual exemplars via task-specific prompting.
Specifically, we generate visual exemplars based on bounding boxes and color-based markers, which provide accurate visual appearances of target categories.
We bridge transformer-based encoders and convolutional layers for efficient and accurate unified perception in autonomous driving.
arXiv Detail & Related papers (2023-03-03T08:54:06Z)
- Controllable Dynamic Multi-Task Architectures [92.74372912009127]
We propose a controllable multi-task network that dynamically adjusts its architecture and weights to match the desired task preference as well as the resource constraints.
We propose a disentangled training of two hypernetworks, by exploiting task affinity and a novel branching regularized loss, to take input preferences and accordingly predict tree-structured models with adapted weights.
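A toy sketch of the hypernetwork idea: a task-preference vector is mapped to the weights of a small prediction head. The tree-structured branching and the regularized training described in the summary are omitted, and all names and dimensions are assumptions.

```python
# Minimal sketch of a hypernetwork: a preference vector over tasks is mapped to
# the weights of a small linear head. Names and dimensions are illustrative.
import torch
import torch.nn as nn

class HeadHyperNet(nn.Module):
    def __init__(self, n_tasks=3, feat_dim=64, out_dim=10):
        super().__init__()
        self.feat_dim, self.out_dim = feat_dim, out_dim
        # Predict a weight matrix and bias for a linear head, conditioned on preferences.
        self.w_gen = nn.Linear(n_tasks, feat_dim * out_dim)
        self.b_gen = nn.Linear(n_tasks, out_dim)

    def forward(self, preference: torch.Tensor, features: torch.Tensor) -> torch.Tensor:
        # preference: (n_tasks,) weights; features: (batch, feat_dim)
        w = self.w_gen(preference).view(self.out_dim, self.feat_dim)
        b = self.b_gen(preference)
        return features @ w.t() + b
```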
arXiv Detail & Related papers (2022-03-28T17:56:40Z)
- Short-term passenger flow prediction for multi-traffic modes: A residual network and Transformer based multi-task learning method [21.13073816634534]
Res-Transformer is a learning model for short-term passenger flow prediction of multi-traffic modes.
The model is evaluated on two large-scale real-world datasets from Beijing, China.
This paper gives critical insights into short-term passenger flow prediction for multi-traffic modes.
arXiv Detail & Related papers (2022-02-27T01:09:19Z)
- Multi-path Neural Networks for On-device Multi-domain Visual Classification [55.281139434736254]
This paper proposes a novel approach to automatically learn a multi-path network for multi-domain visual classification on mobile devices.
The proposed multi-path network is learned from neural architecture search by applying one reinforcement learning controller for each domain to select the best path in the super-network created from a MobileNetV3-like search space.
The determined multi-path model selectively shares parameters across domains in shared nodes while keeping domain-specific parameters within non-shared nodes in individual domain paths.
arXiv Detail & Related papers (2020-10-10T05:13:49Z)
- A Unified Object Motion and Affinity Model for Online Multi-Object Tracking [127.5229859255719]
We propose a novel MOT framework that unifies object motion and affinity models into a single network, named UMA.
UMA integrates single object tracking and metric learning into a unified triplet network by means of multi-task learning.
We equip our model with a task-specific attention module, which is used to boost task-aware feature learning.
arXiv Detail & Related papers (2020-03-25T09:36:43Z)
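As a loose illustration of a task-specific attention module that re-weights shared features for each task, here is a squeeze-and-excitation-style block; it is a generic interpretation, not UMA's actual module.

```python
# Generic sketch of task-specific attention: each task re-weights the shared
# feature channels before its own head. Illustrative design, not UMA's exact block.
import torch
import torch.nn as nn

class TaskAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, shared_feat: torch.Tensor) -> torch.Tensor:
        # shared_feat: (batch, channels, H, W); returns task-aware re-weighted features.
        scale = self.fc(shared_feat).unsqueeze(-1).unsqueeze(-1)
        return shared_feat * scale
```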
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.