Towards Accurate Localization by Instance Search
- URL: http://arxiv.org/abs/2107.05005v1
- Date: Sun, 11 Jul 2021 10:03:31 GMT
- Title: Towards Accurate Localization by Instance Search
- Authors: Yi-Geng Hong, Hui-Chu Xiao, Wan-Lei Zhao
- Abstract summary: A self-paced learning framework is proposed to achieve accurate object localization on the rank list returned by instance search.
The proposed framework mines the target instance gradually from the queries and their corresponding top-ranked search results.
In addition to performing localization on instance search, the issue of few-shot object detection is also addressed under the same framework.
- Score: 2.0539994999823334
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual object localization is the key step in a series of object detection
tasks. In the literature, high localization accuracy is achieved with the
mainstream strongly supervised frameworks. However, such methods require
object-level annotations and are unable to detect objects of unknown
categories. Weakly supervised methods face similar difficulties. In this paper,
a self-paced learning framework is proposed to achieve accurate object
localization on the rank list returned by instance search. The proposed
framework mines the target instance gradually from the queries and their
corresponding top-ranked search results. Since a common instance is shared
between the query and the images in the rank list, the target visual instance
can be accurately localized even without knowing what the object category is.
In addition to performing localization on instance search, the issue of
few-shot object detection is also addressed under the same framework. Superior
performance over state-of-the-art methods is observed on both tasks.
Related papers
- Improving Object Detection via Local-global Contrastive Learning [27.660633883387753]
We present a novel image-to-image translation method that specifically targets cross-domain object detection.
We learn to represent objects by contrasting local-global information.
This affords investigation of an under-explored challenge: obtaining performant detection under domain shifts.
arXiv Detail & Related papers (2024-10-07T14:18:32Z)
- Generative Region-Language Pretraining for Open-Ended Object Detection [55.42484781608621]
We propose a framework named GenerateU, which can detect dense objects and generate their names in a free-form way.
Our framework achieves comparable results to the open-vocabulary object detection method GLIP.
arXiv Detail & Related papers (2024-03-15T10:52:39Z)
- DeLR: Active Learning for Detection with Decoupled Localization and Recognition Query [53.54802901197267]
In this paper, we rethink two key components, i.e., localization and recognition, for object detection.
Motivated by this, we propose an efficient query strategy, called DeLR, that decouples localization and recognition for active queries.
arXiv Detail & Related papers (2023-12-28T09:58:32Z)
- Exploiting Unlabeled Data with Vision and Language Models for Object Detection [64.94365501586118]
Building robust and generic object detection frameworks requires scaling to larger label spaces and bigger training datasets.
We propose a novel method that leverages the rich semantics available in recent vision and language models to localize and classify objects in unlabeled images.
We demonstrate the value of the generated pseudo labels in two specific tasks, open-vocabulary detection and semi-supervised object detection.
arXiv Detail & Related papers (2022-07-18T21:47:15Z)
- FindIt: Generalized Localization with Natural Language Queries [43.07139534653485]
FindIt is a simple and versatile framework that unifies a variety of visual grounding and localization tasks.
Key to our architecture is an efficient multi-scale fusion module that unifies the disparate localization requirements.
Our end-to-end trainable framework responds flexibly and accurately to a wide range of referring expression, localization or detection queries.
arXiv Detail & Related papers (2022-03-31T17:59:30Z)
- Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
arXiv Detail & Related papers (2021-12-21T17:10:21Z)
- Salient Object Ranking with Position-Preserved Attention [44.94722064885407]
We study the Salient Object Ranking (SOR) task, which aims to assign a ranking order to each detected object according to its visual saliency.
We propose the first end-to-end framework of the SOR task and solve it in a multi-task learning fashion.
We also introduce a Position-Preserved Attention (PPA) module tailored for the SOR branch.
arXiv Detail & Related papers (2021-06-09T13:00:05Z)
- Instance Localization for Self-supervised Detection Pretraining [68.24102560821623]
We propose a new self-supervised pretext task, called instance localization.
We show that integration of bounding boxes into pretraining promotes better task alignment and architecture alignment for transfer learning.
Experimental results demonstrate that our approach yields state-of-the-art transfer learning results for object detection.
arXiv Detail & Related papers (2021-02-16T17:58:57Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
- Cross-Supervised Object Detection [42.783400918552765]
We show how to build better object detectors from weakly labeled images of new categories by leveraging knowledge learned from fully labeled base categories.
We propose a unified framework that combines a detection head trained from instance-level annotations and a recognition head learned from image-level annotations.
arXiv Detail & Related papers (2020-06-26T15:33:48Z)