Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection
- URL: http://arxiv.org/abs/2507.13085v1
- Date: Thu, 17 Jul 2025 12:56:04 GMT
- Title: Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection
- Authors: Riku Inoue, Masamitsu Tsuchiya, Yuji Yasui
- Abstract summary: Open World Object Detection is a challenging computer vision task. Many methods have addressed this by using pseudo-labels for unknown objects. The recently proposed Probabilistic Objectness transformer-based open-world detector (PROB) is a state-of-the-art model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open World Object Detection (OWOD) is a challenging computer vision task that extends standard object detection by (1) detecting and classifying unknown objects without supervision, and (2) incrementally learning new object classes without forgetting previously learned ones. The absence of ground truths for unknown objects makes OWOD tasks particularly challenging. Many methods have addressed this by using pseudo-labels for unknown objects. The recently proposed Probabilistic Objectness transformer-based open-world detector (PROB) is a state-of-the-art model that does not require pseudo-labels for unknown objects, as it predicts probabilistic objectness. However, this method faces issues with learning conflicts between objectness and class predictions. To address this issue and further enhance performance, we propose a novel model, Decoupled PROB. Decoupled PROB introduces Early Termination of Objectness Prediction (ETOP) to stop objectness predictions at appropriate layers in the decoder, resolving the learning conflicts between class and objectness predictions in PROB. Additionally, we introduce Task-Decoupled Query Initialization (TDQI), which efficiently extracts features of known and unknown objects, thereby improving performance. TDQI is a query initialization method that combines query selection and learnable queries, and it is a module that can be easily integrated into existing DETR-based OWOD models. Extensive experiments on OWOD benchmarks demonstrate that Decoupled PROB surpasses all existing methods across several metrics, significantly improving performance.
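The abstract describes TDQI only as a query initialization method that combines query selection and learnable queries. The snippet below is a minimal sketch of that general idea, not the authors' implementation: the function names (`select_top_k`, `init_queries`), the use of a per-token score for selection, and the choice to simply concatenate the two query sets are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def select_top_k(tokens: Sequence[Vector], scores: Sequence[float], k: int) -> List[Vector]:
    """Query selection: keep the k encoder tokens with the highest scores.

    `scores` is a hypothetical per-token confidence; the abstract does not
    say how tokens are scored.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    # Indices of tokens ordered from highest to lowest score.
    order = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    return [list(tokens[i]) for i in order[:k]]


def init_queries(
    tokens: Sequence[Vector],
    scores: Sequence[float],
    learnable: Sequence[Vector],
    k: int,
) -> List[Vector]:
    """Build the decoder's initial query set from both sources.

    Assumption: selected queries and learnable queries are concatenated;
    the actual combination rule is not given in the abstract.
    """
    selected = select_top_k(tokens, scores, k)
    return selected + [list(q) for q in learnable]


# Toy inputs: three encoder tokens, one learnable query, keep the top two tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
scores = [0.2, 0.9, 0.5]
learnable = [[0.0, 0.0]]
queries = init_queries(tokens, scores, learnable, k=2)
```

In this sketch the selected queries depend on the input image while the learnable queries are shared across images, which is one plausible way to serve both kinds of query at once.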
Related papers
- Dynamic Object Queries for Transformer-based Incremental Object Detection [45.41291377837515]
Incremental object detection aims to sequentially learn new classes, while maintaining the capability to locate and identify old ones.
Prior methodologies mainly tackle the forgetting issue through knowledge distillation and exemplar replay.
We propose DyQ-DETR, which incrementally expands the model representation ability to achieve stability-plasticity tradeoffs.
arXiv Detail & Related papers (2024-07-31T15:29:34Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- USD: Unknown Sensitive Detector Empowered by Decoupled Objectness and Segment Anything Model [14.080744645704751]
Open World Object Detection (OWOD) is a novel and challenging computer vision task.
We propose a simple yet effective learning strategy, namely Decoupled Objectness Learning (DOL), which divides the learning of these two boundaries into decoder layers.
We also introduce an Auxiliary Supervision Framework (ASF) that uses pseudo-labeling and soft-weighting strategies to alleviate the negative impact of noise.
arXiv Detail & Related papers (2023-06-04T06:42:09Z)
- Cycle Consistency Driven Object Discovery [75.60399804639403]
We introduce a method that explicitly optimizes the constraint that each object in a scene should be associated with a distinct slot.
By integrating these consistency objectives into various existing slot-based object-centric methods, we showcase substantial improvements in object-discovery performance.
Our results suggest that the proposed approach not only improves object discovery, but also provides richer features for downstream tasks.
arXiv Detail & Related papers (2023-06-03T21:49:06Z)
- CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection [17.766859354014663]
Open-world object detection requires a model trained from data on known objects to detect both known and unknown objects.
We propose a novel solution called CAT: LoCalization and IdentificAtion Cascade Detection Transformer.
We show that our model outperforms the state-of-the-art in terms of all metrics in the task of OWOD, incremental object detection (IOD) and open-set detection.
arXiv Detail & Related papers (2023-01-05T09:11:16Z)
- Open World DETR: Transformer based Open World Object Detection [60.64535309016623]
We propose a two-stage training approach named Open World DETR for open world object detection based on Deformable DETR.
We fine-tune the class-specific components of the model with a multi-view self-labeling strategy and a consistency constraint.
Our proposed method outperforms other state-of-the-art open world object detection methods by a large margin.
arXiv Detail & Related papers (2022-12-06T13:39:30Z)
- PROB: Probabilistic Objectness for Open World Object Detection [15.574535196804042]
Open World Object Detection (OWOD) is a new computer vision task that bridges the gap between classic object detection (OD) benchmarks and object detection in the real world.
We introduce a novel probabilistic framework for objectness estimation, where we alternate between probability distribution estimation and objectness likelihood of known objects.
The resulting Probabilistic Objectness transformer-based open-world detector, PROB, integrates our framework into traditional object detection models.
arXiv Detail & Related papers (2022-12-02T20:04:24Z)
- Suspected Object Matters: Rethinking Model's Prediction for One-stage Visual Grounding [93.82542533426766]
We propose a Suspected Object Transformation mechanism (SOT) to encourage the target object selection among the suspected ones.
SOT can be seamlessly integrated into existing CNN and Transformer-based one-stage visual grounders.
Extensive experiments demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2022-03-10T06:41:07Z)
- Plug-and-Play Few-shot Object Detection with Meta Strategy and Explicit Localization Inference [78.41932738265345]
This paper proposes a plug-and-play detector that can accurately detect objects of novel categories without a fine-tuning process.
We introduce two explicit inferences into the localization process to reduce its dependence on annotated data.
It shows a significant lead in efficiency, precision, and recall under varied evaluation protocols.
arXiv Detail & Related papers (2021-10-26T03:09:57Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.