Don't let the information slip away
- URL: http://arxiv.org/abs/2602.22595v2
- Date: Fri, 27 Feb 2026 16:50:03 GMT
- Title: Don't let the information slip away
- Authors: Taozhe Li, Guansu Wang, Bo Yu, Yiming Liu, Wei Sun
- Abstract summary: The YOLO series of detectors is among the most well-known CNN-based object detection models. Transformer-based object detection models have demonstrated impressive performance. We propose an object detection model called Association DETR, which achieves state-of-the-art results.
- Score: 8.192654039359041
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-time object detection has advanced rapidly in recent years. The YOLO series of detectors is among the most well-known CNN-based object detection models and cannot be overlooked. The latest version, YOLOv26, was recently released, while YOLOv12 achieved state-of-the-art (SOTA) performance with 55.2 mAP on the COCO val2017 dataset. Meanwhile, transformer-based object detection models, also known as DEtection TRansformer (DETR), have demonstrated impressive performance. RT-DETR is an outstanding model that outperformed the YOLO series in both speed and accuracy when it was released. Its successor, RT-DETRv2, achieved 53.4 mAP on the COCO val2017 dataset. However, despite their remarkable performance, all these models let information slip away. They primarily focus on the features of foreground objects while neglecting the contextual information provided by the background. We believe that background information can significantly aid object detection tasks. For example, cars are more likely to appear on roads than in offices, while wild animals are more likely to be found in forests or remote areas than on busy streets. To address this gap, we propose an object detection model called Association DETR, which achieves state-of-the-art results compared to other object detection models on the COCO val2017 dataset.
Related papers
- Replication Study and Benchmarking of Real-Time Object Detection Models [0.4499833362998488]
We compare a variety of object detection models' accuracy and inference speed on multiple graphics cards. We propose a unified training and evaluation pipeline, based on MMDetection's features, to better compare models. Results exhibit a strong trade-off between accuracy and speed, prevailed by anchor-free models.
arXiv Detail & Related papers (2024-05-11T04:47:50Z)
- YOLO-TLA: An Efficient and Lightweight Small Object Detection Model based on YOLOv5 [19.388112026410045]
YOLO-TLA is an advanced object detection model building on YOLOv5.
We first introduce an additional detection layer for small objects in the neck network pyramid architecture.
This module uses sliding window feature extraction, which effectively minimizes both computational demand and the number of parameters.
arXiv Detail & Related papers (2024-02-22T05:55:17Z)
- From Blurry to Brilliant Detection: YOLO-Based Aerial Object Detection with Super Resolution [3.5044007821404635]
Aerial object detection presents challenges from small object sizes, high density clustering, and image quality degradation from distance and motion blur. B2BDet addresses this with a two-stage framework that applies domain-specific super-resolution during inference, followed by detection using an enhanced YOLOv5 architecture. The approach combines aerial-optimized SRGAN fine-tuning with architectural innovations including an Efficient Attention Module (EAM) and Cross-Layer Feature Pyramid Network (CLFPN).
arXiv Detail & Related papers (2024-01-26T05:50:58Z)
- Investigating YOLO Models Towards Outdoor Obstacle Detection For Visually Impaired People [3.4628430044380973]
Seven different YOLO object detection models were implemented.
YOLOv8 was found to be the best model, which reached a precision of 80% and a recall of 68.2% on a well-known Obstacle dataset.
YOLO-NAS was found to be suboptimal for the obstacle detection task.
arXiv Detail & Related papers (2023-12-10T13:16:22Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [63.36722419180875]
We provide an efficient and performant object detector, termed YOLO-MS. We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets. Our work can also serve as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
- Exploring the Effectiveness of Dataset Synthesis: An application of Apple Detection in Orchards [68.95806641664713]
We explore the usability of Stable Diffusion 2.1-base for generating synthetic datasets of apple trees for object detection.
We train a YOLOv5m object detection model to predict apples in a real-world apple detection dataset.
Results demonstrate that the model trained on generated data slightly underperforms a baseline model trained on real-world images.
arXiv Detail & Related papers (2023-06-20T09:46:01Z)
- DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem.
DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden.
It is flexible and practical, and can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
- A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S reduces parameter size by 87% and requires almost half the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
- Evaluation of YOLO Models with Sliced Inference for Small Object Detection [0.0]
This work aims to benchmark the YOLOv5 and YOLOX models for small object detection.
The effects of sliced fine-tuning and sliced inference combined produced substantial improvement for all models.
arXiv Detail & Related papers (2022-03-09T15:24:30Z)
- YOLO-Z: Improving small object detection in YOLOv5 for autonomous vehicles [5.765622319599904]
This study explores how the popular YOLOv5 object detector can be modified to improve its performance in detecting smaller objects.
We propose a series of models at different scales, which we name 'YOLO-Z', and which display an improvement of up to 6.9% in mAP when detecting smaller objects at 50% IoU.
Our objective is to inform future research on the potential of adjusting a popular detector such as YOLOv5 to address specific tasks.
arXiv Detail & Related papers (2021-12-22T11:03:43Z)
- Detecting Invisible People [58.49425715635312]
We re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects.
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
Second, we build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks.
arXiv Detail & Related papers (2020-12-15T16:54:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.