DS MYOLO: A Reliable Object Detector Based on SSMs for Driving Scenarios
- URL: http://arxiv.org/abs/2409.01093v1
- Date: Mon, 2 Sep 2024 09:22:33 GMT
- Title: DS MYOLO: A Reliable Object Detector Based on SSMs for Driving Scenarios
- Authors: Yang Li, Jianli Xiao,
- Abstract summary: We propose a novel real-time object detector, DS MYOLO, inspired by Mamba's outstanding performance.
This detector captures global feature information through a simplified selective scanning fusion block (SimVSS Block) and effectively integrates the network's deep features.
Experiments on the CCTSDB 2021 and VLD-45 driving scenarios demonstrate that DS MYOLO exhibits significant potential and competitive advantage.
- Score: 2.615648035076649
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate real-time object detection enhances the safety of advanced driver-assistance systems, making it an essential component in driving scenarios. With the rapid development of deep learning technology, CNN-based YOLO real-time object detectors have gained significant attention. However, the local focus of CNNs results in performance bottlenecks. To further enhance detector performance, researchers have introduced Transformer-based self-attention mechanisms to leverage global receptive fields, but their quadratic complexity incurs substantial computational costs. Recently, Mamba, with its linear complexity, has made significant progress through global selective scanning. Inspired by Mamba's outstanding performance, we propose a novel object detector: DS MYOLO. This detector captures global feature information through a simplified selective scanning fusion block (SimVSS Block) and effectively integrates the network's deep features. Additionally, we introduce an efficient channel attention convolution (ECAConv) that enhances cross-channel feature interaction while maintaining low computational complexity. Extensive experiments on the CCTSDB 2021 and VLD-45 driving scenarios datasets demonstrate that DS MYOLO exhibits significant potential and competitive advantage among similarly scaled YOLO series real-time object detectors.
Related papers
- How Important are Data Augmentations to Close the Domain Gap for Object Detection in Orbit? [15.550663626482903]
We investigate the efficacy of data augmentations to close the domain gap in spaceborne computer vision.
We propose two novel data augmentations specifically developed to emulate the visual effects observed in orbital imagery.
arXiv Detail & Related papers (2024-10-21T08:24:46Z) - EMDFNet: Efficient Multi-scale and Diverse Feature Network for Traffic Sign Detection [11.525603303355268]
The detection of small objects, particularly traffic signs, is a critical subtask within object detection and autonomous driving.
Motivated by these challenges, we propose a novel object detection network named Efficient Multi-scale and Diverse Feature Network (EMDFNet)
EMDFNet integrates an Augmented Shortcut Module and an Efficient Hybrid to address the aforementioned issues simultaneously.
arXiv Detail & Related papers (2024-08-26T11:26:27Z) - Deep Learning-Based Robust Multi-Object Tracking via Fusion of mmWave Radar and Camera Sensors [6.166992288822812]
Multi-Object Tracking plays a critical role in ensuring safer and more efficient navigation through complex traffic scenarios.
This paper presents a novel deep learning-based method that integrates radar and camera data to enhance the accuracy and robustness of Multi-Object Tracking in autonomous driving systems.
arXiv Detail & Related papers (2024-07-10T21:09:09Z) - FedPylot: Navigating Federated Learning for Real-Time Object Detection in Internet of Vehicles [5.803236995616553]
Federated learning is a promising solution to train sophisticated machine learning models in vehicular networks.
We introduce FedPylot, a lightweight MPI-based prototype to simulate federated object detection experiments.
Our study factors in accuracy, communication cost, and inference speed, thereby presenting a balanced approach to the challenges faced by autonomous vehicles.
arXiv Detail & Related papers (2024-06-05T20:06:59Z) - Cross-Cluster Shifting for Efficient and Effective 3D Object Detection
in Autonomous Driving [69.20604395205248]
We present a new 3D point-based detector model, named Shift-SSD, for precise 3D object detection in autonomous driving.
We introduce an intriguing Cross-Cluster Shifting operation to unleash the representation capacity of the point-based detector.
We conduct extensive experiments on the KITTI, runtime, and nuScenes datasets, and the results demonstrate the state-of-the-art performance of Shift-SSD.
arXiv Detail & Related papers (2024-03-10T10:36:32Z) - ADASR: An Adversarial Auto-Augmentation Framework for Hyperspectral and
Multispectral Data Fusion [54.668445421149364]
Deep learning-based hyperspectral image (HSI) super-resolution aims to generate high spatial resolution HSI (HR-HSI) by fusing hyperspectral image (HSI) and multispectral image (MSI) with deep neural networks (DNNs)
In this letter, we propose a novel adversarial automatic data augmentation framework ADASR that automatically optimize and augments HSI-MSI sample pairs to enrich data diversity for HSI-MSI fusion.
arXiv Detail & Related papers (2023-10-11T07:30:37Z) - Salient Object Detection in Optical Remote Sensing Images Driven by
Transformer [69.22039680783124]
We propose a novel Global Extraction Local Exploration Network (GeleNet) for Optical Remote Sensing Images (ORSI-SOD)
Specifically, GeleNet first adopts a transformer backbone to generate four-level feature embeddings with global long-range dependencies.
Extensive experiments on three public datasets demonstrate that the proposed GeleNet outperforms relevant state-of-the-art methods.
arXiv Detail & Related papers (2023-09-15T07:14:43Z) - Performance Study of YOLOv5 and Faster R-CNN for Autonomous Navigation
around Non-Cooperative Targets [0.0]
This paper discusses how the combination of cameras and machine learning algorithms can achieve the relative navigation task.
The performance of two deep learning-based object detection algorithms, Faster Region-based Convolutional Neural Networks (R-CNN) and You Only Look Once (YOLOv5) is tested.
The paper discusses the path to implementing the feature recognition algorithms and towards integrating them into the spacecraft Guidance Navigation and Control system.
arXiv Detail & Related papers (2023-01-22T04:53:38Z) - Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device [53.323878851563414]
We propose a compiler-aware unified framework incorporating network enhancement and pruning search with the reinforcement learning techniques.
Specifically, a generator Recurrent Neural Network (RNN) is employed to provide the unified scheme for both network enhancement and pruning search automatically.
The proposed framework achieves real-time 3D object detection on mobile devices with competitive detection performance.
arXiv Detail & Related papers (2020-12-26T19:41:15Z) - DS-Net: Dynamic Spatiotemporal Network for Video Salient Object
Detection [78.04869214450963]
We propose a novel dynamic temporal-temporal network (DSNet) for more effective fusion of temporal and spatial information.
We show that the proposed method achieves superior performance than state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z) - Object Tracking through Residual and Dense LSTMs [67.98948222599849]
Deep learning-based trackers based on LSTMs (Long Short-Term Memory) recurrent neural networks have emerged as a powerful alternative.
DenseLSTMs outperform Residual and regular LSTM, and offer a higher resilience to nuisances.
Our case study supports the adoption of residual-based RNNs for enhancing the robustness of other trackers.
arXiv Detail & Related papers (2020-06-22T08:20:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.