Evaluating Vision-Language Models for Zero-Shot Detection, Classification, and Association of Motorcycles, Passengers, and Helmets
- URL: http://arxiv.org/abs/2408.02244v1
- Date: Mon, 5 Aug 2024 05:30:36 GMT
- Title: Evaluating Vision-Language Models for Zero-Shot Detection, Classification, and Association of Motorcycles, Passengers, and Helmets
- Authors: Lucas Choi, Ross Greer,
- Abstract summary: This study evaluates the efficacy of an advanced vision-language foundation model, OWLv2, in detecting and classifying various helmet-wearing statuses of motorcycle occupants using video data.
We employ a cascaded model approach for detection and classification tasks, integrating OWLv2 and CNN models.
The results highlight the potential of zero-shot learning to address challenges arising from incomplete and biased training datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motorcycle accidents pose significant risks, particularly when riders and passengers do not wear helmets. This study evaluates the efficacy of an advanced vision-language foundation model, OWLv2, in detecting and classifying various helmet-wearing statuses of motorcycle occupants using video data. We extend the dataset provided by the CVPR AI City Challenge and employ a cascaded model approach for detection and classification tasks, integrating OWLv2 and CNN models. The results highlight the potential of zero-shot learning to address challenges arising from incomplete and biased training datasets, demonstrating the usage of such models in detecting motorcycles, helmet usage, and occupant positions under varied conditions. We have achieved an average precision of 0.5324 for helmet detection and provided precision-recall curves detailing the detection and classification performance. Despite limitations such as low-resolution data and poor visibility, our research shows promising advancements in automated vehicle safety and traffic safety enforcement systems.
Related papers
- Cross-Camera Distracted Driver Classification through Feature Disentanglement and Contrastive Learning [13.613407983544427]
We introduce a robust model designed to withstand changes in camera position within the vehicle.
Our Driver Behavior Monitoring Network (DBMNet) relies on a lightweight backbone and integrates a disentanglement module.
Experiments conducted on the daytime and nighttime subsets of the 100-Driver dataset validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-11-20T10:27:12Z) - AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z) - On using Machine Learning Algorithms for Motorcycle Collision Detection [0.0]
Impact simulations show that the risk of severe injury or death in the event of a motorcycle-to-car impact can be greatly reduced if the motorcycle is equipped with passive safety measures such as airbags and seat belts.
For the challenge of reliably detecting impending collisions, this paper presents an investigation towards the applicability of machine learning algorithms.
arXiv Detail & Related papers (2024-03-14T15:32:25Z) - DRUformer: Enhancing the driving scene Important object detection with
driving relationship self-understanding [50.81809690183755]
Traffic accidents frequently lead to fatal injuries, contributing to over 50 million deaths until 2023.
Previous research primarily assessed the importance of individual participants, treating them as independent entities.
We introduce Driving scene Relationship self-Understanding transformer (DRUformer) to enhance the important object detection task.
arXiv Detail & Related papers (2023-11-11T07:26:47Z) - Unsupervised Domain Adaptation for Self-Driving from Past Traversal
Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z) - Real-Time Helmet Violation Detection Using YOLOv5 and Ensemble Learning [4.397520291340696]
This paper presents the development and evaluation of a real-time YOLOv5 Deep Learning (DL) model for detecting riders and passengers on motorbikes.
We trained the model on 100 videos recorded at 10 fps, each for 20 seconds.
The proposed model was tested on 100 test videos and produced an mAP score of 0.5267, ranking 11th on the AI City Track 5 public leaderboard.
arXiv Detail & Related papers (2023-04-14T14:15:56Z) - Real-Time Helmet Violation Detection in AI City Challenge 2023 with
Genetic Algorithm-Enhanced YOLOv5 [6.081363026350582]
This research focuses on real-time surveillance systems as a means for tackling the issue of non-compliance with helmet regulations.
Previous attempts at real-time helmet violation detection have been hindered by their limited ability to operate in real-time.
This paper introduces a novel real-time helmet violation detection system that utilizes the YOLOv5 single-stage object detection model.
arXiv Detail & Related papers (2023-04-13T22:04:30Z) - Real-time Multi-Class Helmet Violation Detection Using Few-Shot Data
Sampling Technique and YOLOv8 [11.116729994007686]
This study proposes a robust real-time helmet violation detection system.
Our proposed method won 7th place in the 2023 AI City Challenge, Track 5, with an mAP score of 0.5861.
arXiv Detail & Related papers (2023-04-13T21:13:55Z) - DeepAccident: A Motion and Accident Prediction Benchmark for V2X
Autonomous Driving [76.29141888408265]
We propose a large-scale dataset containing diverse accident scenarios that frequently occur in real-world driving.
The proposed DeepAccident dataset includes 57K annotated frames and 285K annotated samples, approximately 7 times more than the large-scale nuScenes dataset.
arXiv Detail & Related papers (2023-04-03T17:37:00Z) - Camera-Radar Perception for Autonomous Vehicles and ADAS: Concepts,
Datasets and Metrics [77.34726150561087]
This work aims to carry out a study on the current scenario of camera and radar-based perception for ADAS and autonomous vehicles.
Concepts and characteristics related to both sensors, as well as to their fusion, are presented.
We give an overview of the Deep Learning-based detection and segmentation tasks, and the main datasets, metrics, challenges, and open questions in vehicle perception.
arXiv Detail & Related papers (2023-03-08T00:48:32Z) - A model for traffic incident prediction using emergency braking data [77.34726150561087]
We address the fundamental problem of data scarcity in road traffic accident prediction by training our model on emergency braking events instead of accidents.
We present a prototype implementing a traffic incident prediction model for Germany based on emergency braking data from Mercedes-Benz vehicles.
arXiv Detail & Related papers (2021-02-12T18:17:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.