Open-World Weakly-Supervised Object Localization
- URL: http://arxiv.org/abs/2304.08271v2
- Date: Wed, 19 Apr 2023 05:25:43 GMT
- Title: Open-World Weakly-Supervised Object Localization
- Authors: Jinheng Xie and Zhaochuan Luo and Yuexiang Li and Haozhe Liu and
Linlin Shen and Mike Zheng Shou
- Abstract summary: We introduce a new weakly-supervised object localization task called OWSOL (Open-World Weakly-Supervised Object Localization).
We propose a novel paradigm of contrastive representation co-learning using both labeled and unlabeled data to generate a complete G-CAM for object localization.
We re-organize two widely used datasets, i.e., ImageNet-1K and iNatLoc500, and propose OpenImages150 to serve as evaluation benchmarks for OWSOL.
- Score: 26.531408294517416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While remarkable success has been achieved in weakly-supervised object
localization (WSOL), current frameworks are not capable of locating objects of
novel categories in open-world settings. To address this issue, we are the
first to introduce a new weakly-supervised object localization task called
OWSOL (Open-World Weakly-Supervised Object Localization). During training, all
labeled data comes from known categories, while both known and novel categories
exist in the unlabeled data. To handle such data, we propose a novel paradigm
of contrastive representation co-learning using both labeled and unlabeled data
to generate a complete G-CAM (Generalized Class Activation Map) for object
localization, without the requirement of bounding box annotation. As no class
label is available for the unlabeled data, we conduct clustering over the full
training set and design a novel multiple semantic centroids-driven contrastive
loss for representation learning. We re-organize two widely used datasets,
i.e., ImageNet-1K and iNatLoc500, and propose OpenImages150 to serve as
evaluation benchmarks for OWSOL. Extensive experiments demonstrate that the
proposed method can surpass all baselines by a large margin. We believe that
this work can shift closed-set localization towards the open-world setting
and serve as a foundation for subsequent works. Code will be released at
https://github.com/ryylcc/OWSOL.
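The multiple semantic centroids-driven contrastive loss is only described at a high level in the abstract. The sketch below is a hypothetical PyTorch rendering of the general idea, assuming k-means assignments over the full (labeled plus unlabeled) training set and an InfoNCE-style objective that treats each sample's assigned centroid as the positive; the function name, signature, and temperature value are illustrative and not the authors' implementation.

```python
# Hypothetical sketch of a semantic-centroid-driven contrastive loss,
# NOT the paper's exact formulation. Assumes cluster assignments come
# from k-means over the full (labeled + unlabeled) training set.
import torch
import torch.nn.functional as F


def centroid_contrastive_loss(features, centroids, assignments, temperature=0.07):
    """InfoNCE-style loss pulling each feature toward its assigned centroid.

    features:    (B, D) image embeddings from the co-trained encoder
    centroids:   (K, D) semantic centroids obtained by clustering
    assignments: (B,)   index of the centroid each sample belongs to
    """
    features = F.normalize(features, dim=1)
    centroids = F.normalize(centroids, dim=1)
    # Similarity of every sample to every centroid, scaled by temperature.
    logits = features @ centroids.t() / temperature          # (B, K)
    # The assigned centroid is the positive; all other centroids are negatives.
    return F.cross_entropy(logits, assignments)


# Toy usage: 8 samples, 128-d embeddings, 20 semantic centroids.
feats = torch.randn(8, 128)
cents = torch.randn(20, 128)
assign = torch.randint(0, 20, (8,))
loss = centroid_contrastive_loss(feats, cents, assign)
```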
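Once the representation is trained, a Generalized Class Activation Map can plausibly be read as a CAM computed against a semantic centroid rather than a classifier weight vector, which is what would let it cover novel categories without class labels or box annotations. The sketch below follows that assumption; the paper's exact construction may differ.

```python
# Hypothetical sketch of generating a Generalized Class Activation Map
# (G-CAM): correlate spatial features with a semantic centroid instead
# of a classifier weight vector. Illustrative only.
import torch
import torch.nn.functional as F


def generalized_cam(feature_map, centroid, out_size=(224, 224)):
    """feature_map: (D, H, W) conv features for one image
    centroid:    (D,) semantic centroid of the predicted (known or novel) class
    Returns an out_size activation map normalized to [0, 1]."""
    d, h, w = feature_map.shape
    feats = F.normalize(feature_map.reshape(d, h * w), dim=0)   # unit-norm per location
    centroid = F.normalize(centroid, dim=0)
    cam = (centroid @ feats).reshape(1, 1, h, w)                # cosine similarity map
    cam = F.interpolate(cam, size=out_size, mode="bilinear", align_corners=False)
    cam = cam.squeeze()
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # min-max normalize


# Toy usage: 512-channel 14x14 feature map and one centroid.
fmap = torch.randn(512, 14, 14)
cent = torch.randn(512)
cam = generalized_cam(fmap, cent)   # (224, 224), values in [0, 1]
```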
Related papers
- Automated Label Placement on Maps via Large Language Models [3.7553323195283697]
We introduce a new paradigm for automatic label placement (ALP) that formulates the task as a data editing problem.
To support this direction, we curate MAPLE, the first known benchmarking dataset for evaluating ALP on real-world maps.
We evaluate four open-source LLMs on MAPLE, analyzing both overall performance and generalization across different types of landmarks.
arXiv Detail & Related papers (2025-07-29T18:00:22Z) - ADAM: Autonomous Discovery and Annotation Model using LLMs for Context-Aware Annotations [7.0524023948087375]
We introduce ADAM: Autonomous Discovery and Annotation Model, a training-free, self-refining framework for open-world object labeling.
ADAM generates candidate labels for unknown objects based on contextual information from known entities within a scene.
ADAM retrieves visually similar instances from an Embedding-Label Repository and applies frequency-based voting and cross-modal re-ranking to assign a robust label.
arXiv Detail & Related papers (2025-06-10T16:41:33Z) - Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z) - Learning to Discover and Detect Objects [43.52208526783969]
We tackle the problem of novel class discovery, detection, and localization (NCDL)
In this setting, we assume a source dataset with labels for objects of commonly observed classes.
By training our detection network with this objective in an end-to-end manner, it learns to classify all region proposals for a large variety of classes.
arXiv Detail & Related papers (2022-10-19T17:59:55Z) - Exploiting Unlabeled Data with Vision and Language Models for Object
Detection [64.94365501586118]
Building robust and generic object detection frameworks requires scaling to larger label spaces and bigger training datasets.
We propose a novel method that leverages the rich semantics available in recent vision and language models to localize and classify objects in unlabeled images.
We demonstrate the value of the generated pseudo labels in two specific tasks, open-vocabulary detection and semi-supervised object detection.
arXiv Detail & Related papers (2022-07-18T21:47:15Z) - Novel Class Discovery in Semantic Segmentation [104.30729847367104]
We introduce a new setting of Novel Class Discovery in Semantic Segmentation (NCDSS).
It aims at segmenting unlabeled images containing new classes given prior knowledge from a labeled set of disjoint classes.
In NCDSS, we need to distinguish the objects and background, and to handle the existence of multiple classes within an image.
We propose the Entropy-based Uncertainty Modeling and Self-training (EUMS) framework to overcome noisy pseudo-labels.
arXiv Detail & Related papers (2021-12-03T13:31:59Z) - Learning Open-World Object Proposals without Learning to Classify [110.30191531975804]
We propose a classification-free Object Localization Network (OLN) which estimates the objectness of each region purely by how well the location and shape of a region overlaps with any ground-truth object.
This simple strategy learns generalizable objectness and outperforms existing proposals on cross-category generalization.
arXiv Detail & Related papers (2021-08-15T14:36:02Z) - Novel Visual Category Discovery with Dual Ranking Statistics and Mutual
Knowledge Distillation [16.357091285395285]
We tackle the problem of grouping unlabelled images from new classes into different semantic partitions.
This is a more realistic and challenging setting than conventional semi-supervised learning.
We propose a two-branch learning framework for this problem, with one branch focusing on local part-level information and the other branch focusing on overall characteristics.
arXiv Detail & Related papers (2021-07-07T17:14:40Z) - UniT: Unified Knowledge Transfer for Any-shot Object Detection and
Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision.
arXiv Detail & Related papers (2020-06-12T22:45:47Z) - Pairwise Similarity Knowledge Transfer for Weakly Supervised Object
Localization [53.99850033746663]
We study the problem of learning localization model on target classes with weakly supervised image labels.
In this work, we argue that learning only an objectness function is a weak form of knowledge transfer.
Experiments on the COCO and ILSVRC 2013 detection datasets show that the performance of the localization model improves significantly with the inclusion of pairwise similarity function.
arXiv Detail & Related papers (2020-03-18T17:53:33Z) - Rethinking the Route Towards Weakly Supervised Object Localization [28.90792512056726]
We show that weakly supervised object localization should be divided into two parts: class-agnostic object localization and object classification.
For class-agnostic object localization, we should use class-agnostic methods to generate noisy pseudo annotations and then perform bounding box regression on them without class labels.
Our PSOL models have good transferability across different datasets without fine-tuning.
arXiv Detail & Related papers (2020-02-26T08:54:20Z) - Semi-Supervised Class Discovery [7.123519086758813]
We introduce Dataset Reconstruction Accuracy, a new and important measure of a model's ability to create labels.
We apply a new measure, class learnability, for deciding whether a class is worth adding to the training dataset.
We show that our class discovery system can be successfully applied to vision and language.
arXiv Detail & Related papers (2020-02-10T00:29:44Z)