Related papers: You Only Train Once (YOTO): A Retraining-Free Object Detection Framework

URL: http://arxiv.org/abs/2512.04888v2
Date: Fri, 05 Dec 2025 08:01:06 GMT
Title: You Only Train Once (YOTO): A Retraining-Free Object Detection Framework
Authors: Priyanto Hidayatullah, Nurjannah Syakrani, Yudi Widhiyasana, Muhammad Rizqi Sholahuddin, Refdinal Tubagus, Zahri Al Adzani Hidayat, Hanri Fajar Ramadhan, Dafa Alfarizki Pratama, Farhan Muhammad Yasin,
Abstract summary: This study introduces You Only Train Once (YOTO), a methodology designed to address the issue of catastrophic forgetting. For classification, we utilize cosine similarity between the embedding features of the target product and those in the Qdrant vector database. We achieve almost 3 times the training time efficiency compared to classical object detection approaches.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Object detection constitutes the primary task within the domain of computer vision. It is utilized in numerous domains. Nonetheless, object detection continues to encounter the issue of catastrophic forgetting. The model must be retrained whenever new products are introduced, utilizing not only the new products dataset but also the entirety of the previous dataset. The outcome is obvious: increasing model training expenses and significant time consumption. In numerous sectors, particularly retail checkout, the frequent introduction of new products presents a great challenge. This study introduces You Only Train Once (YOTO), a methodology designed to address the issue of catastrophic forgetting by integrating YOLO11n for object localization with DeIT and Proxy Anchor Loss for feature extraction and metric learning. For classification, we utilize cosine similarity between the embedding features of the target product and those in the Qdrant vector database. In a case study conducted in a retail store with 140 products, the experimental results demonstrate that our proposed framework achieves encouraging accuracy, whether for detecting new or existing products. Furthermore, without retraining, the training duration difference is significant. We achieve almost 3 times the training time efficiency compared to classical object detection approaches. This efficiency escalates as additional new products are added to the product database. The average inference time is 580 ms per image containing multiple products, on an edge device, validating the proposed framework's feasibility for practical use.
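
The classification step described in the abstract (cosine similarity between a detected product's embedding and stored embeddings) can be illustrated with a minimal sketch. This is not the authors' implementation: a plain in-memory NumPy array stands in for the Qdrant vector database, the embeddings are toy 3-dimensional vectors rather than DeIT features, and the names `classify_by_cosine`, `add_product`, and the `threshold` parameter are hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length (eps avoids division by zero)."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def classify_by_cosine(query, gallery, labels, threshold=0.5):
    """Return (label, score) of the stored embedding most similar to `query`.

    `gallery` is an (N, D) array of stored product embeddings and `labels`
    holds the N matching product names. If the best cosine similarity is
    below `threshold` (an assumed cutoff), the label is None.
    """
    q = l2_normalize(np.asarray(query, dtype=np.float64))
    g = l2_normalize(np.asarray(gallery, dtype=np.float64))
    sims = g @ q  # dot products of unit vectors are cosine similarities
    best = int(np.argmax(sims))
    score = float(sims[best])
    if score < threshold:
        return None, score
    return labels[best], score


def add_product(gallery, labels, embedding, label):
    """Register a new product by appending its embedding; no model retraining."""
    new_gallery = np.vstack([gallery, np.asarray(embedding, dtype=np.float64)])
    return new_gallery, labels + [label]


# Toy gallery with two known products.
gallery = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
labels = ["cola", "chips"]

# A new product is introduced by storing one more embedding.
gallery, labels = add_product(gallery, labels, [0.0, 0.0, 1.0], "soap")

label, score = classify_by_cosine([0.0, 0.1, 0.9], gallery, labels)
print(label, round(score, 3))
```

Under these assumptions, introducing a product only appends a row to the stored embeddings, which is the property that lets a framework of this kind avoid retraining the detector when the catalogue changes.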

Related papers

Zero-shot Degree of Ill-posedness Estimation for Active Small Object Change Detection [8.977792536037956]
In everyday indoor navigation, robots often need to detect non-distinctive small-change objects. Existing techniques rely on high-quality class-specific object priors to regularize a change detector model. In this study, we explore the concept of degree-of-ill-posedness (DoI) to improve both passive and active vision.
arXiv Detail & Related papers (2024-05-10T01:56:39Z)
Proposal-Contrastive Pretraining for Object Detection from Fewer Data [11.416621957617334]
We present Proposal Selection Contrast (ProSeCo), a novel unsupervised overall pretraining approach. ProSeCo uses the large number of object proposals generated by the detector for contrastive learning. We show that our method outperforms state of the art in unsupervised pretraining for object detection on standard and novel benchmarks.
arXiv Detail & Related papers (2023-10-25T17:59:26Z)
Improved Region Proposal Network for Enhanced Few-Shot Object Detection [23.871860648919593]
Few-shot object detection (FSOD) methods have emerged as a solution to the limitations of classic object detection approaches. We develop a semi-supervised algorithm to detect and then utilize unlabeled novel objects as positive samples during the FSOD training stage. Our improved hierarchical sampling strategy for the region proposal network (RPN) also boosts the perception of the object detection model for large objects.
arXiv Detail & Related papers (2023-08-15T02:35:59Z)
Effective and Efficient Training for Sequential Recommendation using Recency Sampling [91.02268704681124]
We propose a novel Recency-based Sampling of Sequences training objective. We show that models enhanced with our method can achieve performance exceeding or very close to the state-of-the-art BERT4Rec.
arXiv Detail & Related papers (2022-07-06T13:06:31Z)
Incremental-DETR: Incremental Few-Shot Object Detection via Self-Supervised Learning [60.64535309016623]
We propose Incremental-DETR, which performs incremental few-shot object detection via fine-tuning and self-supervised learning on the DETR object detector. To alleviate severe over-fitting with few novel class data, we first fine-tune the class-specific components of DETR with self-supervision. We further introduce an incremental few-shot fine-tuning strategy with knowledge distillation on the class-specific components of DETR to encourage the network to detect novel classes without catastrophic forgetting.
arXiv Detail & Related papers (2022-05-09T05:08:08Z)
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training [44.32782190757813]
We construct a new large-scale benchmark termed BigDetection. Our dataset has 600 object categories and contains over 3.4M training images with 36M bounding boxes.
arXiv Detail & Related papers (2022-03-24T17:57:29Z)
Learning Open-World Object Proposals without Learning to Classify [110.30191531975804]
We propose a classification-free Object Localization Network (OLN) which estimates the objectness of each region purely by how well the location and shape of a region overlaps with any ground-truth object. This simple strategy learns generalizable objectness and outperforms existing proposals on cross-category generalization.
arXiv Detail & Related papers (2021-08-15T14:36:02Z)
Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn the effective salient object detection model based on the manual annotation on a few training images only. We name this task as the few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z)
Few-shot Weakly-Supervised Object Detection via Directional Statistics [55.97230224399744]
We propose a probabilistic multiple instance learning approach for few-shot Common Object Localization (COL) and few-shot Weakly Supervised Object Detection (WSOD). Our model simultaneously learns the distribution of the novel objects and localizes them via expectation-maximization steps. Our experiments show that the proposed method, despite being simple, outperforms strong baselines in few-shot COL and WSOD, as well as large-scale WSOD tasks.
arXiv Detail & Related papers (2021-03-25T22:34:16Z)
Closing the Generalization Gap in One-Shot Object Detection [92.82028853413516]
We show that the key to strong few-shot detection models may not lie in sophisticated metric learning approaches, but instead in scaling the number of categories. Future data annotation efforts should therefore focus on wider datasets and annotate a larger number of categories.
arXiv Detail & Related papers (2020-11-09T09:31:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
