AurigaNet: A Real-Time Multi-Task Network for Enhanced Urban Driving Perception
- URL: http://arxiv.org/abs/2602.10660v1
- Date: Wed, 11 Feb 2026 09:04:29 GMT
- Title: AurigaNet: A Real-Time Multi-Task Network for Enhanced Urban Driving Perception
- Authors: Kiarash Ghasemzadeh, Sedigheh Dehghani,
- Abstract summary: Self-driving cars hold significant potential to reduce traffic accidents, alleviate congestion, and enhance urban mobility.<n>However, developing reliable AI systems for autonomous vehicles remains a substantial challenge.<n>We present AurigaNet, an advanced multi-task network architecture designed to push the boundaries of autonomous driving perception.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-driving cars hold significant potential to reduce traffic accidents, alleviate congestion, and enhance urban mobility. However, developing reliable AI systems for autonomous vehicles remains a substantial challenge. Over the past decade, multi-task learning has emerged as a powerful approach to address complex problems in driving perception. Multi-task networks offer several advantages, including increased computational efficiency, real-time processing capabilities, optimized resource utilization, and improved generalization. In this study, we present AurigaNet, an advanced multi-task network architecture designed to push the boundaries of autonomous driving perception. AurigaNet integrates three critical tasks: object detection, lane detection, and drivable area instance segmentation. The system is trained and evaluated using the BDD100K dataset, renowned for its diversity in driving conditions. Key innovations of AurigaNet include its end-to-end instance segmentation capability, which significantly enhances both accuracy and efficiency in path estimation for autonomous vehicles. Experimental results demonstrate that AurigaNet achieves an 85.2% IoU in drivable area segmentation, outperforming its closest competitor by 0.7%. In lane detection, AurigaNet achieves a remarkable 60.8% IoU, surpassing other models by more than 30%. Furthermore, the network achieves an mAP@0.5:0.95 of 47.6% in traffic object detection, exceeding the next leading model by 2.9%. Additionally, we validate the practical feasibility of AurigaNet by deploying it on embedded devices such as the Jetson Orin NX, where it demonstrates competitive real-time performance. These results underscore AurigaNet's potential as a robust and efficient solution for autonomous driving perception systems. The code can be found here https://github.com/KiaRational/AurigaNet.
Related papers
- Advancing Autonomous Vehicle Intelligence: Deep Learning and Multimodal LLM for Traffic Sign Recognition and Robust Lane Detection [11.743721109110792]
This paper introduces an integrated approach combining advanced deep learning techniques and Multimodal Large Language Models (MLLMs) for comprehensive road perception.<n>For traffic sign recognition, we evaluate ResNet-50, Yv8, and RT-DETR, achieving state-of-the-art performance of 99.8% with ResNet-50, 98.0% accuracy with YOLOv8, and achieved 96.6% accuracy in RT-DETR.<n>For lane detection, we propose a CNN-based segmentation method enhanced by curve fitting, which delivers high accuracy under favorable conditions.
arXiv Detail & Related papers (2025-03-08T19:12:36Z) - Penalty-Based Imitation Learning With Cross Semantics Generation Sensor
Fusion for Autonomous Driving [1.2749527861829049]
In this paper, we provide a penalty-based imitation learning approach to integrate multiple modalities of information.
We observe a remarkable increase in the driving score by more than 12% when compared to the state-of-the-art (SOTA) model, InterFuser.
Our model achieves this performance enhancement while achieving a 7-fold increase in inference speed and reducing the model size by approximately 30%.
arXiv Detail & Related papers (2023-03-21T14:29:52Z) - YOLOPv2: Better, Faster, Stronger for Panoptic Driving Perception [1.6683976936678229]
Multi-tasking learning approaches have achieved promising results in solving panoptic driving perception problems.
This paper proposed an effective and efficient multi-task learning network to simultaneously perform the task of traffic object detection, drivable road area segmentation and lane detection.
Our model achieved the new state-of-the-art (SOTA) performance in terms of accuracy and speed on the challenging BDD100K dataset.
arXiv Detail & Related papers (2022-08-24T11:00:27Z) - Learning energy-efficient driving behaviors by imitating experts [75.12960180185105]
This paper examines the role of imitation learning in bridging the gap between control strategies and realistic limitations in communication and sensing.
We show that imitation learning can succeed in deriving policies that, if adopted by 5% of vehicles, may boost the energy-efficiency of networks with varying traffic conditions by 15% using only local observations.
arXiv Detail & Related papers (2022-06-28T17:08:31Z) - COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked
Vehicles [54.61668577827041]
We introduce COOPERNAUT, an end-to-end learning model that uses cross-vehicle perception for vision-based cooperative driving.
Our experiments on AutoCastSim suggest that our cooperative perception driving models lead to a 40% improvement in average success rate.
arXiv Detail & Related papers (2022-05-04T17:55:12Z) - HybridNets: End-to-End Perception Network [1.4287758028119788]
This paper systematically studies an end-to-end perception network for multi-tasking.
We have developed an end-to-end perception network to perform multi-tasking, including traffic object detection, drivable area segmentation and lane detection simultaneously, called HybridNets.
arXiv Detail & Related papers (2022-03-17T02:29:12Z) - An Efficient Deep Learning Approach Using Improved Generative
Adversarial Networks for Incomplete Information Completion of Self-driving [2.8504921333436832]
We propose an efficient deep learning approach to repair incomplete vehicle point cloud accurately and efficiently in autonomous driving.
The improved PF-Net can achieve the speedups of over 19x with almost the same accuracy when compared to the original PF-Net.
arXiv Detail & Related papers (2021-09-01T08:06:23Z) - A Deep Value-network Based Approach for Multi-Driver Order Dispatching [55.36656442934531]
We propose a deep reinforcement learning based solution for order dispatching.
We conduct large scale online A/B tests on DiDi's ride-dispatching platform.
Results show that CVNet consistently outperforms other recently proposed dispatching methods.
arXiv Detail & Related papers (2021-06-08T16:27:04Z) - Efficient and Robust LiDAR-Based End-to-End Navigation [132.52661670308606]
We present an efficient and robust LiDAR-based end-to-end navigation framework.
We propose Fast-LiDARNet that is based on sparse convolution kernel optimization and hardware-aware model design.
We then propose Hybrid Evidential Fusion that directly estimates the uncertainty of the prediction from only a single forward pass.
arXiv Detail & Related papers (2021-05-20T17:52:37Z) - Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device [53.323878851563414]
We propose a compiler-aware unified framework incorporating network enhancement and pruning search with the reinforcement learning techniques.
Specifically, a generator Recurrent Neural Network (RNN) is employed to provide the unified scheme for both network enhancement and pruning search automatically.
The proposed framework achieves real-time 3D object detection on mobile devices with competitive detection performance.
arXiv Detail & Related papers (2020-12-26T19:41:15Z) - Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data
Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.