Leveraging Multimodal-LLMs Assisted by Instance Segmentation for Intelligent Traffic Monitoring
- URL: http://arxiv.org/abs/2502.11304v1
- Date: Sun, 16 Feb 2025 23:03:26 GMT
- Title: Leveraging Multimodal-LLMs Assisted by Instance Segmentation for Intelligent Traffic Monitoring
- Authors: Murat Arda Onsu, Poonam Lohan, Burak Kantarci, Aisha Syed, Matthew Andrews, Sean Kennedy,
- Abstract summary: This research leverages the LLaVA visual grounding multimodal large language model (LLM) for traffic monitoring tasks on the real-time Quanser Interactive Lab simulation platform.
Cameras placed at multiple urban locations collect real-time images from the simulation, which are fed into the LLaVA model with queries for analysis.
The system achieves 84.3% accuracy in recognizing vehicle locations and 76.4% in determining steering direction, outperforming traditional models.
- Score: 6.648291808015463
- Abstract: A robust and efficient traffic monitoring system is essential for smart cities and Intelligent Transportation Systems (ITS), using sensors and cameras to track vehicle movements, optimize traffic flow, reduce congestion, enhance road safety, and enable real-time adaptive traffic control. Traffic monitoring models must comprehensively understand dynamic urban conditions and provide an intuitive user interface for effective management. This research leverages the LLaVA visual grounding multimodal large language model (LLM) for traffic monitoring tasks on the real-time Quanser Interactive Lab simulation platform, covering scenarios like intersections, congestion, and collisions. Cameras placed at multiple urban locations collect real-time images from the simulation, which are fed into the LLaVA model with queries for analysis. An instance segmentation model integrated into the cameras highlights key elements such as vehicles and pedestrians, enhancing training and throughput. The system achieves 84.3% accuracy in recognizing vehicle locations and 76.4% in determining steering direction, outperforming traditional models.
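The abstract describes highlighting vehicles and pedestrians with instance segmentation before each image is sent to LLaVA with a query. The following is only a minimal sketch of that idea, not the authors' implementation: it assumes the masks are already available as boolean grids, and the names `highlight_instances` and `build_query`, the overlay color, and the prompt wording are all hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
Mask = List[List[bool]]


def highlight_instances(image: Image, masks: List[Mask],
                        colors: List[Pixel], alpha: float = 0.5) -> Image:
    """Alpha-blend one color per instance mask onto a copy of the image."""
    out = [list(row) for row in image]  # copy rows so the input stays unchanged
    for mask, color in zip(masks, colors):
        for y, mask_row in enumerate(mask):
            for x, inside in enumerate(mask_row):
                if inside:
                    out[y][x] = tuple(
                        round((1 - alpha) * p + alpha * c)
                        for p, c in zip(out[y][x], color)
                    )
    return out


def build_query(task: str) -> str:
    """Hypothetical prompt template; the paper's actual queries are not given here."""
    return f"The highlighted regions mark vehicles and pedestrians. {task}"


# Toy 2x4 gray frame with one assumed vehicle mask covering the middle columns.
frame = [[(100, 100, 100)] * 4 for _ in range(2)]
vehicle_mask = [[False, True, True, False], [False, True, True, False]]
highlighted = highlight_instances(frame, [vehicle_mask], [(200, 0, 0)])
prompt = build_query("Where is the vehicle located?")
```

In a real pipeline the highlighted frame and the prompt would be passed to the multimodal model together; that step is omitted here because the abstract does not specify the interface.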
Related papers
- AIoT-based smart traffic management system [0.0]
This paper presents a novel AI-based smart traffic management system designed to optimize traffic flow and reduce congestion in urban environments.
By analysing live footage from existing CCTV cameras, this approach eliminates the need for additional hardware.
The AI model processes live video feeds to accurately count vehicles and assess traffic density, allowing for adaptive signal control.
arXiv Detail & Related papers (2025-02-04T11:38:42Z)
- Traffic Co-Simulation Framework Empowered by Infrastructure Camera Sensing and Reinforcement Learning [4.336971448707467]
Multi-agent reinforcement learning (MARL) is particularly effective for learning control strategies for traffic lights in a network using iterative simulations.
This study proposes a co-simulation framework integrating CARLA and SUMO, which combines high-fidelity 3D modeling with large-scale traffic flow simulation.
Experiments in the test-bed demonstrate the effectiveness of the proposed MARL approach in enhancing traffic conditions using real-time camera-based detection.
arXiv Detail & Related papers (2024-12-05T07:01:56Z)
- Traffic control using intelligent timing of traffic lights with reinforcement learning technique and real-time processing of surveillance camera images [0.0]
The optimal timing of traffic lights is determined and applied according to several parameters.
Deep learning methods were used in vehicle detection using the YOLOv9-C model.
The use of transfer learning along with retraining the model on images of Iranian cars has increased the accuracy of the model.
arXiv Detail & Related papers (2024-05-22T00:04:32Z)
- A Holistic Framework Towards Vision-based Traffic Signal Control with Microscopic Simulation [53.39174966020085]
Traffic signal control (TSC) is crucial for reducing traffic congestion, which leads to smoother traffic flow, reduced idling time, and lower CO2 emissions.
In this study, we explore the computer vision approach for TSC that modulates on-road traffic flows through visual observation.
We introduce a holistic traffic simulation framework called TrafficDojo towards vision-based TSC and its benchmarking.
arXiv Detail & Related papers (2024-03-11T16:42:29Z)
- Real-Time Vehicle Detection and Urban Traffic Behavior Analysis Based on UAV Traffic Videos on Mobile Devices [14.30857727025523]
This paper combines drone technology, iOS development, and deep learning techniques to bring traffic video acquisition, object detection, object tracking, and traffic behavior analysis together on mobile devices.
Vehicle detection can reach a precision of 98.27% and a recall of 87.93%, and real-time processing is stable at 30 frames per second.
arXiv Detail & Related papers (2024-02-26T02:09:36Z)
- TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction [149.5716746789134]
We show data-driven traffic simulation can be formulated as a world model.
We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving.
Experiments on the open motion dataset show TrafficBots can simulate realistic multi-agent behaviors.
arXiv Detail & Related papers (2023-03-07T18:28:41Z)
- Tackling Real-World Autonomous Driving using Deep Reinforcement Learning [63.3756530844707]
In this work, we propose a model-free Deep Reinforcement Learning Planner that trains a neural network to predict acceleration and steering angle.
In order to deploy the system on board the real self-driving car, we also develop a module represented by a tiny neural network.
arXiv Detail & Related papers (2022-07-05T16:33:20Z)
- Traffic-Net: 3D Traffic Monitoring Using a Single Camera [1.1602089225841632]
We provide a practical platform for real-time traffic monitoring using a single CCTV traffic camera.
We adapt a custom YOLOv5 deep neural network model for vehicle/pedestrian detection and an enhanced SORT tracking algorithm.
We also develop a hierarchical traffic modelling solution based on short- and long-term temporal video data streams.
arXiv Detail & Related papers (2021-09-19T16:59:01Z)
- End-to-End Intersection Handling using Multi-Agent Deep Reinforcement Learning [63.56464608571663]
Navigating through intersections is one of the main challenging tasks for an autonomous vehicle.
In this work, we focus on the implementation of a system able to navigate through intersections where only traffic signs are provided.
We propose a multi-agent system that uses a continuous, model-free Deep Reinforcement Learning algorithm to train a neural network that predicts both the acceleration and the steering angle at each time step.
arXiv Detail & Related papers (2021-04-28T07:54:40Z)
- Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [59.60483620730437]
We propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention.
Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
arXiv Detail & Related papers (2021-04-19T11:48:13Z)
- TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.