Optimizing LaneSegNet for Real-Time Lane Topology Prediction in Autonomous Vehicles
- URL: http://arxiv.org/abs/2406.15946v2
- Date: Tue, 30 Jul 2024 20:15:10 GMT
- Title: Optimizing LaneSegNet for Real-Time Lane Topology Prediction in Autonomous Vehicles
- Authors: William Stevens, Vishal Urs, Karthik Selvaraj, Gabriel Torres, Gaurish Lakhanpal,
- Abstract summary: LaneSegNet is a new approach to lane topology prediction which integrates topological information with lane-line data.
This study explores optimizations to the LaneSegNet architecture through feature extractor modification and transformer encoder-decoder stack modification.
Our implementation, trained on a single NVIDIA Tesla A100 GPU, found that a 2:4 encoder-to-decoder stack ratio reduced training time by 22.3% with only a 7.1% drop in mean average precision.
A 4:8 ratio increased training time by only 11.1% but improved mean average precision by a significant 23.7%.
- Score: 0.41942958779358663
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the increasing prevalence of autonomous vehicles, it is essential for computer vision algorithms to accurately assess road features in real-time. This study explores the LaneSegNet architecture, a new approach to lane topology prediction which integrates topological information with lane-line data to provide a more contextual understanding of road environments. The LaneSegNet architecture includes a feature extractor, lane encoder, lane decoder, and prediction head, leveraging components from ResNet-50, BEVFormer, and various attention mechanisms. We experimented with optimizations to the LaneSegNet architecture through feature extractor modification and transformer encoder-decoder stack modification. We found that modifying the encoder and decoder stacks offered an interesting tradeoff between training time and prediction accuracy, with certain combinations showing promising results. Our implementation, trained on a single NVIDIA Tesla A100 GPU, found that a 2:4 ratio reduced training time by 22.3% with only a 7.1% drop in mean average precision, while a 4:8 ratio increased training time by only 11.1% but improved mean average precision by a significant 23.7%. These results indicate that strategic hyperparameter tuning can yield substantial improvements depending on the resources of the user. This study provides valuable insights for optimizing LaneSegNet according to available computation power, making it more accessible for users with limited resources and increasing the capabilities for users with more powerful resources.
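The abstract's central experiment varies the depth of the transformer encoder and decoder stacks (e.g. 2:4 versus 4:8). A minimal sketch of that idea, using PyTorch's generic `nn.Transformer` as a stand-in for the paper's actual lane encoder/decoder (the function name, feature dimensions, and query counts below are illustrative assumptions, not the authors' code):

```python
import torch
import torch.nn as nn

def build_lane_transformer(num_encoder_layers: int, num_decoder_layers: int,
                           d_model: int = 256, nhead: int = 8) -> nn.Transformer:
    """Build a transformer with an asymmetric encoder:decoder stack.

    The paper reports that a 2:4 ratio trains faster (at some mAP cost),
    while a 4:8 ratio trades extra training time for higher mAP.
    """
    return nn.Transformer(
        d_model=d_model,
        nhead=nhead,
        num_encoder_layers=num_encoder_layers,
        num_decoder_layers=num_decoder_layers,
        batch_first=True,
    )

# 2:4 configuration: cheaper to train.
fast = build_lane_transformer(2, 4)
# 4:8 configuration: more expensive, higher accuracy in the paper's runs.
accurate = build_lane_transformer(4, 8)

# Toy inputs standing in for flattened BEV features and learned lane queries
# (batch of 1, 100 BEV tokens, 20 lane queries; shapes are illustrative only).
bev_feats = torch.randn(1, 100, 256)
lane_queries = torch.randn(1, 20, 256)
out = fast(bev_feats, lane_queries)
print(out.shape)  # one output embedding per lane query
```

This only demonstrates how the stack-depth hyperparameters enter the model; the real LaneSegNet pipeline additionally involves the ResNet-50 feature extractor, BEVFormer-style view transformation, and a prediction head.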
Related papers
- Prediction of Lane Change Intentions of Human Drivers using an LSTM, a CNN and a Transformer [0.33748750222488655]
Lane changes of preceding vehicles have a great impact on the motion planning of automated vehicles. In this paper, the structures of an LSTM, a CNN, and a Transformer network are described and implemented to predict the intention of human drivers to perform a lane change. The accuracy of the method spanned from 82.79% to 96.73% for different input configurations and showed overall good performance, considering also precision and recall.
arXiv Detail & Related papers (2025-07-11T07:26:33Z) - Data Scaling Laws for End-to-End Autonomous Driving [83.85463296830743]
We evaluate the performance of a simple end-to-end driving architecture on internal driving datasets ranging in size from 16 to 8192 hours.
Specifically, we investigate how much additional training data is needed to achieve a target performance gain.
arXiv Detail & Related papers (2025-04-06T03:23:48Z) - Graph Transformers for Large Graphs [57.19338459218758]
This work advances representation learning on single large-scale graphs with a focus on identifying model characteristics and critical design constraints.
A key innovation of this work lies in the creation of a fast neighborhood sampling technique coupled with a local attention mechanism.
We report a 3x speedup and 16.8% performance gain on ogbn-products and snap-patents, while we also scale LargeGT on ogbn-100M with a 5.9% performance improvement.
arXiv Detail & Related papers (2023-12-18T11:19:23Z) - Collaborative Learning with a Drone Orchestrator [79.75113006257872]
A swarm of intelligent wireless devices train a shared neural network model with the help of a drone.
The proposed framework achieves a significant speedup in training, leading to an average 24% and 87% saving in the drone hovering time.
arXiv Detail & Related papers (2023-03-03T23:46:25Z) - EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications [68.35683849098105]
We introduce split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
arXiv Detail & Related papers (2022-06-21T17:59:56Z) - The AI Mechanic: Acoustic Vehicle Characterization Neural Networks [1.8275108630751837]
We introduce the AI mechanic, an acoustic vehicle characterization deep learning system, using sound captured from mobile devices.
We build a convolutional neural network that predicts and cascades vehicle attributes to enhance fault detection.
Our cascading architecture additionally achieved 93.6% validation and 86.8% test set accuracy on misfire fault prediction, demonstrating margins of 16.4% / 7.8% and 4.2% / 1.5% improvement over naïve and parallel baselines.
arXiv Detail & Related papers (2022-05-19T16:29:26Z) - Predicting highway lane-changing maneuvers: A benchmark analysis of machine and ensemble learning algorithms [0.0]
We compare different machine and ensemble learning classification techniques to the rule-based model.
We predict two types of discretionary lane-change maneuvers: Overtaking (from slow to fast lane) and fold-down.
While the rule-based model provides limited prediction accuracy, especially in the case of fold-down maneuvers, the data-based algorithms, free of modeling bias, allow significant prediction improvements.
arXiv Detail & Related papers (2022-04-20T22:55:59Z) - HybridNets: End-to-End Perception Network [1.4287758028119788]
This paper systematically studies an end-to-end perception network for multi-tasking.
We have developed an end-to-end perception network, called HybridNets, that performs multiple tasks simultaneously: traffic object detection, drivable area segmentation, and lane detection.
arXiv Detail & Related papers (2022-03-17T02:29:12Z) - Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks: specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
arXiv Detail & Related papers (2021-06-07T11:37:03Z) - Predicting the Time Until a Vehicle Changes the Lane Using LSTM-based Recurrent Neural Networks [0.5399800035598186]
This paper deals with the development of a system that accurately predicts the time to the next lane change of surrounding vehicles on highways.
An evaluation based on a large real-world data set shows that our approach is able to make reliable predictions, even in the most challenging situations.
arXiv Detail & Related papers (2021-02-02T11:04:22Z) - Res-GCNN: A Lightweight Residual Graph Convolutional Neural Networks for Human Trajectory Forecasting [0.0]
We propose a Residual Graph Convolutional Neural Network (Res-GCNN), which models the interactive behaviors of pedestrians.
Results show an improvement over the state of art by 13.3% on the Final Displacement Error (FDE) which reaches 0.65 meter.
The code will be made publicly available on GitHub.
arXiv Detail & Related papers (2020-11-18T11:18:16Z) - APQ: Joint Search for Network Architecture, Pruning and Quantization Policy [49.3037538647714]
We present APQ for efficient deep learning inference on resource-constrained hardware.
Unlike previous methods that separately search the neural architecture, pruning policy, and quantization policy, we optimize them in a joint manner.
With the same accuracy, APQ reduces the latency/energy by 2x/1.3x over MobileNetV2+HAQ.
arXiv Detail & Related papers (2020-06-15T16:09:17Z) - FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 makes up a family of state-of-the-art compact neural networks that outperform both automatically and manually-designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.