Object Detection using Oriented Window Learning Vision Transformer: Roadway Assets Recognition
- URL: http://arxiv.org/abs/2406.10712v1
- Date: Sat, 15 Jun 2024 18:49:42 GMT
- Title: Object Detection using Oriented Window Learning Vision Transformer: Roadway Assets Recognition
- Authors: Taqwa Alhadidi, Ahmed Jaber, Shadi Jaradat, Huthaifa I. Ashqar, Mohammed Elhenawy
- Abstract summary: The Oriented Window Learning Vision Transformer (OWL-ViT) offers a novel approach by adapting window orientations to the geometry and existence of objects.
This study leverages OWL-ViT within a one-shot learning framework to recognize transportation infrastructure components, such as traffic signs, poles, pavement, and cracks.
- Score: 4.465427147188149
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Object detection is a critical component of transportation systems, particularly for applications such as autonomous driving, traffic monitoring, and infrastructure maintenance. Traditional object detection methods often struggle with limited data and variability in object appearance. The Oriented Window Learning Vision Transformer (OWL-ViT) offers a novel approach by adapting window orientations to the geometry and existence of objects, making it highly suitable for detecting diverse roadway assets. This study presents a novel method for roadway asset detection that leverages OWL-ViT within a one-shot learning framework to recognize transportation infrastructure components, such as traffic signs, poles, pavement, and cracks. We conducted a series of experiments to evaluate the model's performance in terms of detection consistency, semantic flexibility, visual context adaptability, resolution robustness, and the impact of non-maximum suppression. The results demonstrate the high efficiency and reliability of OWL-ViT across various scenarios, underscoring its potential to enhance the safety and efficiency of intelligent transportation systems.
Related papers
- Deep Active Perception for Object Detection using Navigation Proposals [39.52573252842573]
We propose a generic supervised active perception pipeline for object detection.
It can be trained using existing off-the-shelf object detectors, while also leveraging advances in simulation environments.
The proposed method was evaluated on synthetic datasets, constructed within the Webots robotics simulator.
arXiv Detail & Related papers (2023-12-15T20:55:52Z)
- Efficient Vision Transformer for Accurate Traffic Sign Detection [0.0]
This research paper addresses the challenges associated with traffic sign detection in self-driving vehicles and driver assistance systems.
It introduces the application of the Transformer model, particularly the Vision Transformer variants, to tackle this task.
To enhance the efficiency of the Transformer model, the research proposes a novel strategy that integrates a locality inductive bias and a transformer module.
arXiv Detail & Related papers (2023-11-02T17:44:32Z)
- DARTH: Holistic Test-time Adaptation for Multiple Object Tracking [87.72019733473562]
Multiple object tracking (MOT) is a fundamental component of perception systems for autonomous driving.
Despite the safety-critical nature of driving systems, no solution for adapting MOT to domain shift under test-time conditions has previously been proposed.
We introduce DARTH, a holistic test-time adaptation framework for MOT.
arXiv Detail & Related papers (2023-10-03T10:10:42Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatially quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- Effects of Real-Life Traffic Sign Alteration on YOLOv7 - an Object Recognition Model [1.6334452280183571]
This study investigates the influence of altered traffic signs on the accuracy and effectiveness of object recognition.
It employs a publicly available dataset to introduce alterations in shape, color, content, visibility, angles, and background.
The study demonstrates a notable decline in detection and classification accuracy when confronted with traffic signs in unusual conditions.
arXiv Detail & Related papers (2023-05-09T14:51:29Z)
- Learning energy-efficient driving behaviors by imitating experts [75.12960180185105]
This paper examines the role of imitation learning in bridging the gap between control strategies and realistic limitations in communication and sensing.
We show that imitation learning can derive policies that, if adopted by 5% of vehicles, may boost the energy efficiency of networks with varying traffic conditions by 15%, using only local observations.
arXiv Detail & Related papers (2022-06-28T17:08:31Z)
- Dynamic and Static Object Detection Considering Fusion Regions and Point-wise Features [7.41540085468436]
This paper proposes a new approach to detect static and dynamic objects in front of an autonomous vehicle.
Our approach can also estimate other characteristics of the detected objects, such as their position, velocity, and heading.
To demonstrate our proposal's performance, we assess it on a benchmark dataset and on real-world data obtained from an autonomous platform.
arXiv Detail & Related papers (2021-07-27T09:42:18Z)
- Transferable Deep Reinforcement Learning Framework for Autonomous Vehicles with Joint Radar-Data Communications [69.24726496448713]
We propose an intelligent optimization framework based on the Markov Decision Process (MDP) to help the AV make optimal decisions.
We then develop an effective learning algorithm leveraging recent advances of deep reinforcement learning techniques to find the optimal policy for the AV.
We show that the proposed transferable deep reinforcement learning framework reduces the AV's obstacle miss-detection probability by up to 67% compared to other conventional deep reinforcement learning approaches.
arXiv Detail & Related papers (2021-05-28T08:45:37Z)
- Cycle and Semantic Consistent Adversarial Domain Adaptation for Reducing Simulation-to-Real Domain Shift in LiDAR Bird's Eye View [110.83289076967895]
We present a BEV domain adaptation method based on CycleGAN that uses prior semantic classification in order to preserve the information of small objects of interest during the domain adaptation process.
The quality of the generated BEVs has been evaluated using a state-of-the-art 3D object detection framework on the KITTI 3D Object Detection Benchmark.
arXiv Detail & Related papers (2021-04-22T12:47:37Z)
- Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [59.60483620730437]
We propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention.
Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
arXiv Detail & Related papers (2021-04-19T11:48:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.