YOLO-IOD: Towards Real Time Incremental Object Detection
- URL: http://arxiv.org/abs/2512.22973v2
- Date: Thu, 01 Jan 2026 03:38:53 GMT
- Title: YOLO-IOD: Towards Real Time Incremental Object Detection
- Authors: Shizhou Zhang, Xueqiang Lv, Yinghui Xing, Qirui Wu, Di Xu, Chen Zhao, Yanning Zhang
- Abstract summary: We introduce YOLO-IOD, a real-time Incremental Object Detection (IOD) framework that is constructed upon the pretrained YOLO-World model. YOLO-IOD encompasses three principal components: 1) Conflict-Aware Pseudo-Label Refinement (CPR), which mitigates the foreground-background confusion. We also introduce Cross-Stage Asymmetric Knowledge Distillation (CAKD), which addresses the misaligned knowledge distillation conflict.
- Score: 57.862742461237055
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current methods for incremental object detection (IOD) primarily rely on Faster R-CNN or DETR series detectors; however, these approaches do not accommodate the real-time YOLO detection frameworks. In this paper, we first identify three primary types of knowledge conflicts that contribute to catastrophic forgetting in YOLO-based incremental detectors: foreground-background confusion, parameter interference, and misaligned knowledge distillation. Subsequently, we introduce YOLO-IOD, a real-time Incremental Object Detection (IOD) framework that is constructed upon the pretrained YOLO-World model, facilitating incremental learning via a stage-wise parameter-efficient fine-tuning process. Specifically, YOLO-IOD encompasses three principal components: 1) Conflict-Aware Pseudo-Label Refinement (CPR), which mitigates the foreground-background confusion by leveraging the confidence levels of pseudo labels and identifying potential objects relevant to future tasks. 2) Importance-based Kernel Selection (IKS), which identifies and updates the pivotal convolution kernels pertinent to the current task during the current learning stage. 3) Cross-Stage Asymmetric Knowledge Distillation (CAKD), which addresses the misaligned knowledge distillation conflict by transmitting the features of the student target detector through the detection heads of both the previous and current teacher detectors, thereby facilitating asymmetric distillation between existing and newly introduced categories. We further introduce LoCo COCO, a more realistic benchmark that eliminates data leakage across stages. Experiments on both conventional and LoCo COCO benchmarks show that YOLO-IOD achieves superior performance with minimal forgetting.
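As a rough illustration of the kernel-selection idea that the abstract attributes to IKS, the sketch below ranks convolution kernels by an importance score and updates only the top-ranked ones, leaving the rest frozen. The abstract does not say how importance is measured, so the gradient-magnitude score, the `ratio` parameter, the plain SGD step, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
def kernel_importance(grads):
    """Hypothetical importance score: sum of absolute gradients per kernel.

    `grads` is a list of kernels, each given as a flat list of gradient values.
    """
    return [sum(abs(g) for g in kernel) for kernel in grads]


def select_top_kernels(importance, ratio):
    """Return the sorted indices of the top `ratio` fraction of kernels."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    k = max(1, round(len(importance) * ratio))
    ranked = sorted(range(len(importance)),
                    key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:k])


def masked_update(weights, grads, selected, lr):
    """Apply an SGD step to the selected kernels only; copy the others as-is."""
    chosen = set(selected)
    return [
        [w - lr * g for w, g in zip(kw, kg)] if i in chosen else list(kw)
        for i, (kw, kg) in enumerate(zip(weights, grads))
    ]


# Toy example: four kernels with two weights each.
weights = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
grads = [[0.1, 0.1], [2.0, -2.0], [0.0, 0.5], [1.0, 1.0]]

selected = select_top_kernels(kernel_importance(grads), ratio=0.5)
updated = masked_update(weights, grads, selected, lr=0.5)
```

In this toy run, kernels 1 and 3 carry the largest gradient magnitude, so only they are updated. Freezing the remaining kernels is one simple way to limit the parameter interference the abstract names as a source of forgetting.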
Related papers
- YOLOA: Real-Time Affordance Detection via LLM Adapter [96.61111291833544]
Affordance detection aims to jointly address the fundamental "what-where-how" challenge in embodied AI. We introduce YOLO Affordance (YOLOA), a real-time affordance detection model that jointly handles object detection and affordance learning. Experiments on our relabeled ADG-Det and IIT-Heat benchmarks demonstrate that YOLOA achieves state-of-the-art accuracy while maintaining real-time performance.
arXiv Detail & Related papers (2025-12-03T03:53:31Z)
- DuET: Dual Incremental Object Detection via Exemplar-Free Task Arithmetic [12.91797400491484]
Real-world object detection systems, such as those in autonomous driving and surveillance, must continuously learn new object categories. Existing approaches, Class Incremental Object Detection (CIOD) and Domain Incremental Object Detection (DIOD), address only one aspect of this challenge. We propose Dual Incremental Object Detection (DuIOD), a more practical setting that simultaneously handles class and domain shifts in an exemplar-free manner.
arXiv Detail & Related papers (2025-06-26T13:41:47Z)
- Revisiting Out-of-Distribution Detection in Real-time Object Detection: From Benchmark Pitfalls to a New Mitigation Paradigm [8.206992765692535]
Out-of-distribution (OoD) inputs pose a persistent challenge to deep learning models. This work addresses two overlooked dimensions of OoD detection in object detection. We introduce a novel training-time mitigation paradigm that operates independently of external OoD detectors.
arXiv Detail & Related papers (2025-03-10T13:42:41Z)
- Teach YOLO to Remember: A Self-Distillation Approach for Continual Object Detection [5.6148728159802035]
Real-time object detectors like YOLO achieve exceptional performance when trained on large datasets for multiple epochs. In real-world scenarios where data arrives incrementally, neural networks suffer from catastrophic forgetting. We introduce YOLO LwF, a self-distillation approach tailored for YOLO-based continual object detection.
arXiv Detail & Related papers (2025-03-06T18:31:41Z)
- 3A-YOLO: New Real-Time Object Detectors with Triple Discriminative Awareness and Coordinated Representations [0.0]
This work aims to leverage multiple attention mechanisms to hierarchically enhance the triple discriminative awareness of the YOLO detection head. We first propose a new head, denoted the TDA-YOLO Module, which enhances the representation learning of scale-awareness, spatial-awareness, and task-awareness in a unified manner. Second, we steer the intermediate features to learn the inter-channel relationships and precise positional information in a coordinated manner.
arXiv Detail & Related papers (2024-12-10T04:01:32Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Towards End-to-end Semi-supervised Learning for One-stage Object Detection [88.56917845580594]
This paper focuses on the semi-supervised learning for the advanced and popular one-stage detection network YOLOv5.
We propose a novel teacher-student learning recipe called OneTeacher with two innovative designs, namely Multi-view Pseudo-label Refinement (MPR) and Decoupled Semi-supervised Optimization (DSO).
In particular, MPR improves the quality of pseudo-labels via augmented-view refinement and global-view filtering, and DSO handles the joint optimization conflicts via structure tweaks and task-specific pseudo-labeling.
arXiv Detail & Related papers (2023-02-22T11:35:40Z)
- Efficient Teacher: Semi-Supervised Object Detection for YOLOv5 [2.2290171169275492]
One-stage anchor-based detectors lack the structure to generate high-quality or flexible pseudo labels.
Dense Detector is a baseline model that extends RetinaNet with dense sampling techniques inspired by YOLOv5.
Pseudo Label Assigner makes more refined use of pseudo labels from Dense Detector.
Epoch Adaptor is a method that enables a stable and efficient end-to-end semi-supervised training schedule.
arXiv Detail & Related papers (2023-02-15T10:40:19Z)
- Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework reports significant performance compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
- Triggering Failures: Out-Of-Distribution detection by learning from local adversarial attacks in Semantic Segmentation [76.2621758731288]
We tackle the detection of out-of-distribution (OOD) objects in semantic segmentation.
Our main contribution is a new OOD detection architecture called ObsNet, associated with a dedicated training scheme based on Local Adversarial Attacks (LAA).
We show that it achieves top performance in both speed and accuracy when compared with ten recent methods from the literature on three different datasets.
arXiv Detail & Related papers (2021-08-03T17:09:56Z)