Related papers: Interpretable Dynamic Graph Neural Networks for Small Occluded Object Detection and Tracking

Interpretable Dynamic Graph Neural Networks for Small Occluded Object Detection and Tracking

URL: http://arxiv.org/abs/2411.17251v7
Date: Mon, 28 Apr 2025 15:19:14 GMT
Title: Interpretable Dynamic Graph Neural Networks for Small Occluded Object Detection and Tracking
Authors: Shahriar Soudeep, Md Abrar Jahin, M. F. Mridha,
Abstract summary: This paper introduces DGNN-YOLO, a novel framework that integrates dynamic graph neural networks (DGNNs) with YOLO11 to address limitations.<n>Unlike standard GNNs, DGNNs are chosen for their superior ability to dynamically update graph structures in real-time.<n>This framework constructs and regularly updates its graph representations, capturing objects as nodes and their interactions as edges.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The detection and tracking of small, occluded objects such as pedestrians, cyclists, and motorbikes pose significant challenges for traffic surveillance systems because of their erratic movement, frequent occlusion, and poor visibility in dynamic urban environments. Traditional methods like YOLO11, while proficient in spatial feature extraction for precise detection, often struggle with these small and dynamically moving objects, particularly in handling real-time data updates and resource efficiency. This paper introduces DGNN-YOLO, a novel framework that integrates dynamic graph neural networks (DGNNs) with YOLO11 to address these limitations. Unlike standard GNNs, DGNNs are chosen for their superior ability to dynamically update graph structures in real-time, which enables adaptive and robust tracking of objects in highly variable urban traffic scenarios. This framework constructs and regularly updates its graph representations, capturing objects as nodes and their interactions as edges, thus effectively responding to rapidly changing conditions. Additionally, DGNN-YOLO incorporates Grad-CAM, Grad-CAM++, and Eigen-CAM visualization techniques to enhance interpretability and foster trust, offering insights into the model's decision-making process. Extensive experiments validate the framework's performance, achieving a precision of 0.8382, recall of 0.6875, and mAP@0.5:0.95 of 0.6476, significantly outperforming existing methods. This study offers a scalable and interpretable solution for real-time traffic surveillance and significantly advances intelligent transportation systems' capabilities by addressing the critical challenge of detecting and tracking small, occluded objects.

Related papers

GazeSCRNN: Event-based Near-eye Gaze Tracking using a Spiking Neural Network [0.0]
This work introduces GazeSCRNN, a novel convolutional recurrent neural network designed for event-based near-eye gaze tracking. Model processes event streams from DVS cameras using Adaptive Leaky-Integrate-and-Fire (ALIF) neurons and a hybrid architecture for-temporal data. The most accurate model achieved a Mean Angle Error (MAE) of 6.034degdeg and a Mean Pupil Error (MPE) of 2.094 mm.
arXiv Detail & Related papers (2025-03-20T10:32:15Z)
Virtual Nodes Improve Long-term Traffic Prediction [9.125554921271338]
This study introduces a novel framework that incorporates virtual nodes, which are additional nodes added to the graph and connected to existing nodes. Our proposed model incorporates virtual nodes by constructing a semi-adaptive adjacency matrix. Experimental results demonstrate that the inclusion of virtual nodes significantly enhances long-term prediction accuracy.
arXiv Detail & Related papers (2025-01-17T09:09:01Z)
CREST: An Efficient Conjointly-trained Spike-driven Framework for Event-based Object Detection Exploiting Spatiotemporal Dynamics [7.696109414724968]
Spiking neural networks (SNNs) are promising for event-based object recognition and detection. Existing SNN frameworks often fail to handle multi-scaletemporal features, leading to increased data redundancy and reduced accuracy. We propose CREST, a novel conjointly-trained spike-driven framework to exploit event-based object detection.
arXiv Detail & Related papers (2024-12-17T04:33:31Z)
Oriented Tiny Object Detection: A Dataset, Benchmark, and Dynamic Unbiased Learning [51.170479006249195]
We introduce a new dataset, benchmark, and a dynamic coarse-to-fine learning scheme in this study. Our proposed dataset, AI-TOD-R, features the smallest object sizes among all oriented object detection datasets. We present a benchmark spanning a broad range of detection paradigms, including both fully-supervised and label-efficient approaches.
arXiv Detail & Related papers (2024-12-16T09:14:32Z)
Deep Learning and Hybrid Approaches for Dynamic Scene Analysis, Object Detection and Motion Tracking [0.0]
This project aims to develop a robust video surveillance system, which can segment videos into smaller clips based on the detection of activities. It uses CCTV footage, for example, to record only major events-like the appearance of a person or a thief-so that storage is optimized and digital searches are easier.
arXiv Detail & Related papers (2024-12-05T07:44:40Z)
3D Multi-Object Tracking with Semi-Supervised GRU-Kalman Filter [6.13623925528906]
3D Multi-Object Tracking (MOT) is essential for intelligent systems like autonomous driving and robotic sensing. We propose a GRU-based MOT method, which introduces a learnable Kalman filter into the motion module. This approach is able to learn object motion characteristics through data-driven learning, thereby avoiding the need for manual model design and model error.
arXiv Detail & Related papers (2024-11-13T08:34:07Z)
Improving Traffic Flow Predictions with SGCN-LSTM: A Hybrid Model for Spatial and Temporal Dependencies [55.2480439325792]
This paper introduces the Signal-Enhanced Graph Convolutional Network Long Short Term Memory (SGCN-LSTM) model for predicting traffic speeds across road networks. Experiments on the PEMS-BAY road network traffic dataset demonstrate the SGCN-LSTM model's effectiveness.
arXiv Detail & Related papers (2024-11-01T00:37:00Z)
Biologically Inspired Swarm Dynamic Target Tracking and Obstacle Avoidance [0.0]
This study proposes a novel artificial intelligence (AI) driven flight computer to track dynamic targets using a distributed drone swarm for military applications. The controller integrates a fuzzy interface, a neural network enabling rapid adaption, predictive capability and multi-agent solving.
arXiv Detail & Related papers (2024-10-15T03:47:09Z)
DS MYOLO: A Reliable Object Detector Based on SSMs for Driving Scenarios [2.615648035076649]
We propose a novel real-time object detector, DS MYOLO, inspired by Mamba's outstanding performance. This detector captures global feature information through a simplified selective scanning fusion block (SimVSS Block) and effectively integrates the network's deep features. Experiments on the CCTSDB 2021 and VLD-45 driving scenarios demonstrate that DS MYOLO exhibits significant potential and competitive advantage.
arXiv Detail & Related papers (2024-09-02T09:22:33Z)
OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising [49.86409475232849]
Trajectory prediction is fundamental in computer vision and autonomous driving. Existing approaches in this field often assume precise and complete observational data. We present a novel method for out-of-sight trajectory prediction that leverages a vision-positioning technique.
arXiv Detail & Related papers (2024-04-02T18:30:29Z)
PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search [64.28335667655129]
Multiple object tracking is a critical task in autonomous driving. As tracking accuracy improves, neural networks become increasingly complex, posing challenges for their practical application in real driving scenarios due to the high level of latency. In this paper, we explore the use of the neural architecture search (NAS) methods to search for efficient architectures for tracking, aiming for low real-time latency while maintaining relatively high accuracy.
arXiv Detail & Related papers (2024-03-23T04:18:49Z)
Elastic Interaction Energy-Informed Real-Time Traffic Scene Perception [8.429178814528617]
A topology-aware energy loss function-based network training strategy named EIEGSeg is proposed. EIEGSeg is designed for multi-class segmentation on real-time traffic scene perception. Our results demonstrate that EIEGSeg consistently improves the performance, especially on real-time, lightweight networks.
arXiv Detail & Related papers (2023-10-02T01:30:42Z)
MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor. Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as Dancetrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
PDFormer: Propagation Delay-Aware Dynamic Long-Range Transformer for Traffic Flow Prediction [78.05103666987655]
spatial-temporal Graph Neural Network (GNN) models have emerged as one of the most promising methods to solve this problem. We propose a novel propagation delay-aware dynamic long-range transFormer, namely PDFormer, for accurate traffic flow prediction. Our method can not only achieve state-of-the-art performance but also exhibit competitive computational efficiency.
arXiv Detail & Related papers (2023-01-19T08:42:40Z)
Neural Motion Fields: Encoding Grasp Trajectories as Implicit Value Functions [65.84090965167535]
We present Neural Motion Fields, a novel object representation which encodes both object point clouds and the relative task trajectories as an implicit value function parameterized by a neural network. This object-centric representation models a continuous distribution over the SE(3) space and allows us to perform grasping reactively by leveraging sampling-based MPC to optimize this value function.
arXiv Detail & Related papers (2022-06-29T18:47:05Z)
Efficient Federated Learning with Spike Neural Networks for Traffic Sign Recognition [70.306089187104]
We introduce powerful Spike Neural Networks (SNNs) into traffic sign recognition for energy-efficient and fast model training. Numerical results indicate that the proposed federated SNN outperforms traditional federated convolutional neural networks in terms of accuracy, noise immunity, and energy efficiency as well.
arXiv Detail & Related papers (2022-05-28T03:11:48Z)
Spatio-Temporal Look-Ahead Trajectory Prediction using Memory Neural Network [6.065344547161387]
This paper attempts to solve the problem of Spatio-temporal look-ahead trajectory prediction using a novel recurrent neural network called the Memory Neuron Network. The proposed model is computationally less intensive and has a simple architecture as compared to other deep learning models that utilize LSTMs and GRUs.
arXiv Detail & Related papers (2021-02-24T05:02:19Z)
Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties. Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates. The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.