An Analytical Framework to Enhance Autonomous Vehicle Perception for Smart Cities
- URL: http://arxiv.org/abs/2510.13230v1
- Date: Wed, 15 Oct 2025 07:34:22 GMT
- Title: An Analytical Framework to Enhance Autonomous Vehicle Perception for Smart Cities
- Authors: Jalal Khan, Manzoor Khan, Sherzod Turaev, Sumbal Malik, Hesham El-Sayed, Farman Ullah
- Abstract summary: There is a need to develop a model that accurately perceives multiple objects on the road and predicts the driver's perception to control the car's movements. This article proposes a novel utility-based analytical model that enables the perception systems of AVs to understand the driving environment.
- Score: 1.9923531555025622
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Perception of the driving environment plays a vital role in autonomous driving and is being actively explored. The research community and relevant stakeholders need Deep Learning (DL) models and AI-enabled solutions that enhance autonomous vehicles (AVs) for smart mobility. In particular, there is a need for a model that accurately perceives multiple objects on the road and predicts the driver's perception to control the car's movements. This article proposes a novel utility-based analytical model that enables the perception systems of AVs to understand the driving environment. The work consists of three modules: acquiring a custom dataset containing distinctive objects (e.g., motorcyclists and rickshaws); a DL-based model (YOLOv8s) for object detection; and a module that measures the utility of the perception service from the performance values of trained model instances. The perception model is validated on the object detection task, and the process is benchmarked against the performance metrics of state-of-the-art deep learning models on the nuScenes dataset. The experimental results show the three best-performing YOLOv8s instances by mAP@0.5: SGD-based (0.832), Adam-based (0.810), and AdamW-based (0.822). However, the AdamW-based model (car: 0.921, motorcyclist: 0.899, truck: 0.793, etc.) still outperforms the SGD-based model (car: 0.915, motorcyclist: 0.892, truck: 0.781, etc.) because it has better class-level performance values, as confirmed by the proposed perception model. We validate that the proposed function is capable of finding the right perception for AVs. These results encourage using the proposed perception model to evaluate the utility of learning models and to determine the appropriate perception for AVs.
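The utility-measurement module described in the abstract can be illustrated with a minimal sketch. The paper's exact utility function is not reproduced here; the mean of class-level mAP@0.5 values below is an assumed stand-in, and the per-class numbers are the partial values quoted in the abstract ("etc." classes omitted):

```python
# Hypothetical sketch of utility-based model selection: score each trained
# YOLOv8s instance by the mean of its class-level mAP@0.5 values.
# The averaging function is an assumption, not the paper's exact utility.

def perception_utility(class_map: dict[str, float]) -> float:
    """Utility of a model instance as the mean class-level mAP@0.5."""
    return sum(class_map.values()) / len(class_map)

# Class-level mAP@0.5 values quoted in the abstract (partial lists).
instances = {
    "SGD":   {"car": 0.915, "motorcyclist": 0.892, "truck": 0.781},
    "AdamW": {"car": 0.921, "motorcyclist": 0.899, "truck": 0.793},
}

# Pick the instance with the highest utility score.
best = max(instances, key=lambda name: perception_utility(instances[name]))
print(best)  # AdamW: higher on every listed class, so higher mean
```

Under this assumed utility, AdamW wins because it dominates SGD on each listed class, matching the abstract's class-level argument even though SGD has the higher overall mAP@0.5.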
Related papers
- Comparative Analysis of Deep Learning Models for Perception in Autonomous Vehicles [0.0]
We compare the performance of DL models, including YOLO-NAS and YOLOv8, for a detection-based perception task. Our analysis reveals that the YOLOv8s model saves 75% of training time compared to the YOLO-NAS model.
arXiv Detail & Related papers (2025-12-25T13:33:23Z) - HAD-Gen: Human-like and Diverse Driving Behavior Modeling for Controllable Scenario Generation [13.299893784290733]
HAD-Gen is a framework for realistic traffic scenario generation that simulates diverse human-like driving behaviors. The proposed framework achieves a 90.96% goal-reaching rate, an off-road rate of 2.08%, and a collision rate of 6.91% in the generalization test.
arXiv Detail & Related papers (2025-03-19T09:38:45Z) - Knowledge Distillation Neural Network for Predicting Car-following Behaviour of Human-driven and Autonomous Vehicles [2.099922236065961]
This study investigates the car-following behaviours of three vehicle pairs: HDV-AV, AV-HDV and HDV-HDV in mixed traffic.
We introduce a data-driven Knowledge Distillation Neural Network (KDNN) model for predicting car-following behaviour in terms of speed.
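Knowledge distillation for a regression target such as speed is commonly implemented as a weighted blend of a ground-truth loss and a teacher-matching loss. The sketch below is a generic illustration of that pattern, not the paper's KDNN; the MSE form and the mixing weight `alpha` are assumptions:

```python
# Generic knowledge-distillation loss for a regression task (e.g. speed
# prediction): blend the student's error against ground truth with its
# error against a larger teacher model's prediction.

def kd_regression_loss(student_pred: float, teacher_pred: float,
                       target: float, alpha: float = 0.5) -> float:
    """alpha weights the ground-truth term; (1 - alpha) the teacher term."""
    mse_truth = (student_pred - target) ** 2
    mse_teacher = (student_pred - teacher_pred) ** 2
    return alpha * mse_truth + (1.0 - alpha) * mse_teacher

# Student predicts 9.5 m/s, teacher 10.2 m/s, ground truth 10.0 m/s.
loss = kd_regression_loss(9.5, 10.2, 10.0, alpha=0.5)
```

Setting `alpha=1.0` recovers plain supervised training; lowering it pulls the student toward the teacher's (typically smoother) predictions.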
arXiv Detail & Related papers (2024-11-08T14:57:59Z) - MetaFollower: Adaptable Personalized Autonomous Car Following [63.90050686330677]
We propose an adaptable personalized car-following framework - MetaFollower.
We first utilize Model-Agnostic Meta-Learning (MAML) to extract common driving knowledge from various CF events.
We additionally combine Long Short-Term Memory (LSTM) and Intelligent Driver Model (IDM) to reflect temporal heterogeneity with high interpretability.
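The Intelligent Driver Model (IDM) mentioned above has a standard closed form for the ego vehicle's acceleration. A minimal sketch follows, using common default parameter values (the paper likely calibrates these per driver):

```python
import math

# Standard Intelligent Driver Model (IDM); parameter defaults below are
# common textbook values, not the paper's calibrated ones.
def idm_acceleration(v: float, dv: float, s: float,
                     v0: float = 30.0, T: float = 1.5, a_max: float = 1.0,
                     b: float = 2.0, s0: float = 2.0, delta: int = 4) -> float:
    """v: ego speed (m/s), dv: approach rate v - v_lead, s: gap to leader (m)."""
    # Desired dynamic gap: minimum gap + time headway + braking interaction term.
    s_star = s0 + v * T + (v * dv) / (2.0 * math.sqrt(a_max * b))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / s) ** 2)

# On a free road (very large gap) acceleration reduces to a_max * (1 - (v/v0)^4).
a_free = idm_acceleration(v=20.0, dv=0.0, s=1e9)
# Closing fast on a nearby leader yields strong braking (negative acceleration).
a_brake = idm_acceleration(v=20.0, dv=5.0, s=10.0)
```

The interpretability the summary highlights comes from these named parameters: desired speed `v0`, time headway `T`, and comfortable braking `b` all have direct behavioral meanings.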
arXiv Detail & Related papers (2024-06-23T15:30:40Z) - Guiding Attention in End-to-End Driving Models [49.762868784033785]
Vision-based end-to-end driving models trained by imitation learning can lead to affordable solutions for autonomous driving.
We study how to guide the attention of these models to improve their driving quality by adding a loss term during training.
In contrast to previous work, our method does not require these salient semantic maps to be available during testing time.
arXiv Detail & Related papers (2024-04-30T23:18:51Z) - Foundation Models for Structural Health Monitoring [14.36493796970864]
We propose for the first time the use of Transformer neural networks, with a Masked Auto-Encoder architecture, as Foundation Models for Structural Health Monitoring. We demonstrate the ability of these models to learn generalizable representations from multiple large datasets through self-supervised pre-training. We showcase the effectiveness of our foundation models using data from three operational viaducts.
arXiv Detail & Related papers (2024-04-03T13:32:44Z) - AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z) - Trajeglish: Traffic Modeling as Next-Token Prediction [67.28197954427638]
A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs.
We apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios.
Our model tops the Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%.
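Casting traffic modeling as next-token prediction first requires discretizing continuous motion into a token vocabulary. The toy tokenizer below illustrates the general idea only; the bin count, displacement range, and uniform quantization scheme are illustrative assumptions, not Trajeglish's actual tokenization:

```python
# Toy motion tokenizer: quantize per-step (dx, dy) displacements into a
# discrete vocabulary so a sequence model can treat driving as next-token
# prediction. Bin count and range are illustrative assumptions.

N_BINS = 13          # quantization bins per axis
STEP_RANGE = 3.0     # displacements clipped to [-3, 3] metres per step

def tokenize_step(dx: float, dy: float) -> int:
    """Map one (dx, dy) displacement to a single token id in [0, N_BINS**2)."""
    def bin_of(x: float) -> int:
        x = max(-STEP_RANGE, min(STEP_RANGE, x))
        return round((x + STEP_RANGE) / (2 * STEP_RANGE) * (N_BINS - 1))
    return bin_of(dx) * N_BINS + bin_of(dy)

def detokenize(token: int) -> tuple[float, float]:
    """Recover the bin-center displacement for a token id."""
    bx, by = divmod(token, N_BINS)
    to_val = lambda b: b / (N_BINS - 1) * 2 * STEP_RANGE - STEP_RANGE
    return to_val(bx), to_val(by)

token = tokenize_step(1.0, -0.5)  # displacements on bin centers round-trip exactly
```

A language-model-style decoder can then be trained on such token sequences, one token per agent per timestep.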
arXiv Detail & Related papers (2023-12-07T18:53:27Z) - Learning An Active Inference Model of Driver Perception and Control: Application to Vehicle Car-Following [9.837204436270811]
We introduce a general estimation methodology for learning a model of human perception and control in a sensorimotor control task. We consider a model structure specification consistent with active inference, a theory of human perception and behavior from cognitive science.
arXiv Detail & Related papers (2023-03-27T13:39:26Z) - SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving [94.11868795445798]
We release a Large-Scale Object Detection benchmark for Autonomous driving, named as SODA10M, containing 10 million unlabeled images and 20K images labeled with 6 representative object categories.
To improve diversity, the images are collected at one frame every ten seconds across 32 different cities under different weather conditions, periods, and location scenes.
We provide extensive experiments and deep analyses of existing supervised state-of-the-art detection models, popular self-supervised and semi-supervised approaches, and some insights about how to develop future models.
arXiv Detail & Related papers (2021-06-21T13:55:57Z) - Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z) - Building Trust in Autonomous Vehicles: Role of Virtual Reality Driving Simulators in HMI Design [8.39368916644651]
We propose a methodology to validate the user experience in AVs based on continuous, objective information gathered from physiological signals.
We applied this methodology to the design of a head-up display interface delivering visual cues about the vehicle's sensory and planning systems.
arXiv Detail & Related papers (2020-07-27T08:42:07Z) - An LSTM-Based Autonomous Driving Model Using Waymo Open Dataset [7.151393153761375]
This paper introduces an approach to learning a long short-term memory (LSTM)-based model for imitating the behavior of a self-driving model.
The experimental results show that our model outperforms several models in driving action prediction.
arXiv Detail & Related papers (2020-02-14T05:28:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (or any of the information above) and is not responsible for any consequences of its use.