Related papers: DeLR: Active Learning for Detection with Decoupled Localization and Recognition Query

URL: http://arxiv.org/abs/2312.16931v1
Date: Thu, 28 Dec 2023 09:58:32 GMT
Title: DeLR: Active Learning for Detection with Decoupled Localization and Recognition Query
Authors: Yuhang Zhang, Yuang Deng, Xiaopeng Zhang, Jie Li, Robert C. Qiu, Qi Tian
Abstract summary: In this paper, we rethink two key components, i.e., localization and recognition, for object detection. Motivated by this, we propose an efficient query strategy, called DeLR, that decouples localization and recognition for active query.
Score: 53.54802901197267
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Active learning has proven effective at reducing labeling cost, but most progress has targeted image recognition, and instance-level active learning for object detection is still lacking. In this paper, we rethink two key components of object detection, localization and recognition, and find that their correctness is highly related; it is therefore unnecessary to annotate both boxes and classes when the trained model already provides pseudo annotations. Motivated by this, we propose an efficient query strategy, termed DeLR, that decouples localization and recognition for active query. In this way, class annotations can probably be skipped when the localization is correct, and the labeling budget can be reassigned to more informative samples. DeLR differs from previous methods in two main ways: 1) Previous methods mostly focus on image-level annotation, where the queried samples are selected and exhaustively annotated; in DeLR, the query is region-level, and only the queried object region is annotated. 2) Instead of directly providing both localization and recognition annotations, we query the two components separately, and thus reduce the recognition budget by using the pseudo class labels provided by the model. Experiments on several benchmarks demonstrate its superiority. We hope the proposed query strategy will shed light on research in active learning for object detection.
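The decoupled query described in the abstract can be illustrated with a minimal sketch. This is one reading of the abstract, not the authors' implementation: the `Region` fields, the `decoupled_query` function, the oracle callbacks, the unit costs, and the IoU threshold of 0.5 are all hypothetical choices made for illustration. Each region-level query first pays for a localization annotation; the model's pseudo class label is kept when the pseudo box agrees with the annotated box, and a recognition annotation is paid for only otherwise.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A candidate object region predicted by the detector (hypothetical fields)."""
    image_id: str
    pseudo_box: tuple      # (x1, y1, x2, y2) predicted by the model
    pseudo_class: str      # class label predicted by the model
    uncertainty: float     # higher means more informative to query


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def decoupled_query(regions, oracle_box, oracle_class, budget,
                    loc_cost=1.0, rec_cost=1.0, iou_thresh=0.5):
    """Query regions in order of uncertainty, paying for class labels only
    when the pseudo box disagrees with the annotated box.

    Costs and threshold are assumptions, not values from the paper.
    Returns (labeled, spent), where labeled holds (image_id, box, class).
    """
    labeled, spent = [], 0.0
    for r in sorted(regions, key=lambda r: r.uncertainty, reverse=True):
        if spent + loc_cost > budget:
            break
        spent += loc_cost                      # localization query
        box = oracle_box(r)
        if iou(r.pseudo_box, box) >= iou_thresh:
            # Localization is correct: trust the pseudo class label.
            labeled.append((r.image_id, box, r.pseudo_class))
            continue
        if spent + rec_cost > budget:
            break                              # cannot afford the class label
        spent += rec_cost                      # recognition query
        labeled.append((r.image_id, box, oracle_class(r)))
    return labeled, spent
```

Under these assumptions, a region whose pseudo box is correct costs one localization query, while a mislocalized region costs a localization query plus a recognition query, which is where the recognition budget is saved.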

Related papers

Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection [75.02249869573994]
In open-set scenarios, the unlabeled dataset contains both in-distribution (ID) classes and out-of-distribution (OOD) classes. Applying semi-supervised detectors in such settings can lead to misclassifying OOD classes as ID classes. We propose a simple yet effective method, termed Collaborative Feature-Logits Detector (CFL-Detector).
arXiv Detail & Related papers (2024-11-20T02:57:35Z)
Few-Shot Object Detection with Sparse Context Transformers [37.106378859592965]
Few-shot detection is a major task in pattern recognition which seeks to localize objects using models trained with little labeled data. We propose a novel sparse context transformer (SCT) that effectively leverages object knowledge in the source domain, and automatically learns a sparse context from only a few training images in the target domain. We evaluate the proposed method on two challenging few-shot object detection benchmarks, and empirical results show that it obtains competitive performance compared to related state-of-the-art methods.
arXiv Detail & Related papers (2024-02-14T17:10:01Z)
Deep Active Learning with Noisy Oracle in Object Detection [5.5165579223151795]
We propose a composite active learning framework including a label review module for deep object detection. We show that utilizing part of the annotation budget to correct the noisy annotations partially in the active dataset leads to early improvements in model performance. In our experiments we achieve improvements of up to 4.5 mAP points of object detection performance by incorporating label reviews at equal annotation budget.
arXiv Detail & Related papers (2023-09-30T13:28:35Z)
Reflection Invariance Learning for Few-shot Semantic Segmentation [53.20466630330429]
Few-shot semantic segmentation (FSS) aims to segment objects of unseen classes in query images with only a few annotated support images. This paper proposes a fresh few-shot segmentation framework to mine the reflection invariance in a multi-view matching manner. Experiments on both PASCAL-$5^i$ and COCO-$20^i$ datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-01T15:14:58Z)
The Overlooked Classifier in Human-Object Interaction Recognition [82.20671129356037]
We encode the semantic correlation among classes into the classification head by initializing the weights with language embeddings of HOIs. We propose a new loss named LSE-Sign to enhance multi-label learning on a long-tailed dataset. Our simple yet effective method enables detection-free HOI classification, outperforming state-of-the-art methods that require object detection and human pose by a clear margin.
arXiv Detail & Related papers (2022-03-10T23:35:00Z)
Learning to Detect Instance-level Salient Objects Using Complementary Image Labels [55.049347205603304]
We present the first weakly-supervised approach to the salient instance detection problem. We propose a novel weakly-supervised network with three branches: a Saliency Detection Branch leveraging class consistency information to locate candidate objects; a Boundary Detection Branch exploiting class discrepancy information to delineate object boundaries; and a Centroid Detection Branch using subitizing information to detect salient instance centroids.
arXiv Detail & Related papers (2021-11-19T10:15:22Z)
Towards Accurate Localization by Instance Search [2.0539994999823334]
A self-paced learning framework is proposed to achieve accurate object localization on the rank list returned by instance search. The proposed framework mines the target instance gradually from the queries and their corresponding top-ranked search results. In addition to performing localization on instance search, the issue of few-shot object detection is also addressed under the same framework.
arXiv Detail & Related papers (2021-07-11T10:03:31Z)
Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes. We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works. We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
Cross-Supervised Object Detection [42.783400918552765]
We show how to build better object detectors from weakly labeled images of new categories by leveraging knowledge learned from fully labeled base categories. We propose a unified framework that combines a detection head trained from instance-level annotations and a recognition head learned from image-level annotations.
arXiv Detail & Related papers (2020-06-26T15:33:48Z)
Pairwise Similarity Knowledge Transfer for Weakly Supervised Object Localization [53.99850033746663]
We study the problem of learning a localization model on target classes with weakly supervised image labels. In this work, we argue that learning only an objectness function is a weak form of knowledge transfer. Experiments on the COCO and ILSVRC 2013 detection datasets show that the performance of the localization model improves significantly with the inclusion of a pairwise similarity function.
arXiv Detail & Related papers (2020-03-18T17:53:33Z)
Weakly-supervised Object Localization for Few-shot Learning and Fine-grained Few-shot Learning [0.5156484100374058]
Few-shot learning aims to learn novel visual categories from very few samples. We propose a Self-Attention Based Complementary Module (SAC Module) to perform weakly-supervised object localization. We also produce activated masks for selecting discriminative deep descriptors for few-shot classification.
arXiv Detail & Related papers (2020-03-02T14:07:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.