Visual Recognition by Request
- URL: http://arxiv.org/abs/2207.14227v1
- Date: Thu, 28 Jul 2022 16:55:11 GMT
- Title: Visual Recognition by Request
- Authors: Chufeng Tang, Lingxi Xie, Xiaopeng Zhang, Xiaolin Hu, Qi Tian
- Abstract summary: We present a novel protocol of annotation and evaluation for visual recognition.
It does not require the labeler/algorithm to annotate/recognize all targets (objects, parts, etc.) at once, but instead raises a number of recognition instructions and the algorithm recognizes targets by request.
We evaluate the recognition system on two mixed-annotated datasets, CPP and ADE20K, and demonstrate its promising ability to learn from partially labeled data.
- Score: 111.94887516317735
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a novel protocol of annotation and evaluation for
visual recognition. Different from traditional settings, the protocol does not
require the labeler/algorithm to annotate/recognize all targets (objects,
parts, etc.) at once, but instead raises a number of recognition instructions
and the algorithm recognizes targets by request. This mechanism brings two
beneficial properties to reduce the burden of annotation, namely, (i) variable
granularity: different scenarios can have different levels of annotation, in
particular, object parts can be labeled only in large and clear instances; (ii)
being open-domain: new concepts can be added to the database at minimal cost.
To deal with the proposed setting, we maintain a knowledge base and design a
query-based visual recognition framework that constructs queries on-the-fly
based on the requests. We evaluate the recognition system on two
mixed-annotated datasets, CPP and ADE20K, and demonstrate its promising ability
to learn from partially labeled data as well as to adapt to new concepts with
only text labels.
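To make the request-driven pipeline described above more concrete, here is a minimal Python sketch of how a knowledge base and on-the-fly, per-request queries could fit together. All names (Request, KnowledgeBase, RequestDrivenRecognizer) and the two request types are illustrative assumptions, not the authors' implementation; a real system would turn each request into a query embedding and run a segmentation head on the image.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical sketch of the "recognition by request" protocol; the class and
# method names are illustrative assumptions, not the paper's actual code.

@dataclass
class Request:
    """A single recognition instruction issued to the system."""
    kind: str                          # e.g. "whole-scene" or "instance-part"
    concept: str                       # concept to recognize, e.g. "car"
    instance_id: Optional[int] = None  # set only for part-level requests


class KnowledgeBase:
    """Concept hierarchy; new concepts can be registered with only a text label."""

    def __init__(self) -> None:
        self.parts: Dict[str, List[str]] = {}  # concept -> known part names

    def add_concept(self, concept: str, parts: Optional[List[str]] = None) -> None:
        self.parts[concept] = parts or []


class RequestDrivenRecognizer:
    """Builds one query per request instead of predicting every target at once."""

    def __init__(self, kb: KnowledgeBase) -> None:
        self.kb = kb

    def recognize(self, image, request: Request) -> str:
        # A real model would embed the text label as a query vector and run a
        # segmentation head on the image; here only the control flow is shown.
        if request.kind == "whole-scene":
            return f"segment all '{request.concept}' instances"
        if request.kind == "instance-part":
            parts = self.kb.parts.get(request.concept, [])
            return (f"segment parts {parts} of '{request.concept}' "
                    f"instance #{request.instance_id}")
        raise ValueError(f"unknown request kind: {request.kind}")


# Usage: part-level recognition is requested only for a chosen (large, clear)
# instance, mirroring the variable-granularity property described above.
kb = KnowledgeBase()
kb.add_concept("car", parts=["wheel", "door", "window"])
recognizer = RequestDrivenRecognizer(kb)
print(recognizer.recognize(image=None, request=Request("whole-scene", "car")))
print(recognizer.recognize(image=None, request=Request("instance-part", "car", instance_id=0)))
```

Under this sketch, supporting a new concept amounts to a single add_concept call with a text label, which is where the open-domain property claimed above would come from.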
Related papers
- A Generative Approach for Wikipedia-Scale Visual Entity Recognition [56.55633052479446]
We address the task of mapping a given query image to one of the 6 million existing entities in Wikipedia.
We introduce a novel Generative Entity Recognition framework, which learns to auto-regressively decode a semantic and discriminative "code" identifying the target entity.
arXiv Detail & Related papers (2024-03-04T13:47:30Z)
- DeLR: Active Learning for Detection with Decoupled Localization and Recognition Query [53.54802901197267]
In this paper, we rethink two key components, i.e., localization and recognition, for object detection.
Motivated by this, we propose DeLR, an efficient query strategy that decouples the localization and recognition queries in active learning.
arXiv Detail & Related papers (2023-12-28T09:58:32Z)
- Weakly Supervised Open-Vocabulary Object Detection [31.605276665964787]
We propose a novel weakly supervised open-vocabulary object detection framework, namely WSOVOD, to extend traditional WSOD.
To achieve this, we explore three vital strategies, including dataset-level feature adaptation, image-level salient object localization, and region-level vision-language alignment.
arXiv Detail & Related papers (2023-12-19T18:59:53Z)
- Drawing the Same Bounding Box Twice? Coping Noisy Annotations in Object Detection with Repeated Labels [6.872072177648135]
We propose a novel localization algorithm that adapts well-established ground truth estimation methods.
Our algorithm also shows superior performance during training on the TexBiG dataset.
arXiv Detail & Related papers (2023-09-18T13:08:44Z)
- Not All Instances Contribute Equally: Instance-adaptive Class Representation Learning for Few-Shot Visual Recognition [94.04041301504567]
Few-shot visual recognition refers to recognizing novel visual concepts from a few labeled instances.
We propose a novel metric-based meta-learning framework termed instance-adaptive class representation learning network (ICRL-Net) for few-shot visual recognition.
arXiv Detail & Related papers (2022-09-07T10:00:18Z)
- The Overlooked Classifier in Human-Object Interaction Recognition [82.20671129356037]
We encode the semantic correlation among classes into the classification head by initializing the weights with language embeddings of HOIs.
We propose a new loss named LSE-Sign to enhance multi-label learning on a long-tailed dataset.
Our simple yet effective method enables detection-free HOI classification, outperforming state-of-the-art methods that require object detection and human pose estimation by a clear margin.
arXiv Detail & Related papers (2022-03-10T23:35:00Z)
- Uncertainty-Aware Annotation Protocol to Evaluate Deformable Registration Algorithms [3.2845753359072125]
We introduce a principled strategy for the construction of a gold standard in deformable registration.
Our framework: (i) iteratively suggests the most informative location to annotate next, taking into account its redundancy with previous annotations; (ii) extends traditional pointwise annotations by accounting for the spatial uncertainty of each annotation; and (iii) naturally provides a new strategy for the evaluation of deformable registration algorithms (a toy version of the selection step in (i) is sketched after this list).
arXiv Detail & Related papers (2021-04-02T19:31:19Z)
- Adaptive Attentional Network for Few-Shot Knowledge Graph Completion [16.722373937828117]
Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs.
Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties.
This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations.
arXiv Detail & Related papers (2020-10-19T16:27:48Z)
- Few-shot Learning for Multi-label Intent Detection [59.66787898744991]
State-of-the-art work estimates label-instance relevance scores and uses a threshold to select multiple associated intent labels.
Experiments on two datasets show that the proposed model significantly outperforms strong baselines in both one-shot and five-shot settings.
arXiv Detail & Related papers (2020-10-11T14:42:18Z)
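A side note on the uncertainty-aware annotation protocol listed above: its "most informative location" suggestion step can be pictured as a greedy loop that trades off uncertainty against redundancy with already-annotated points. The scoring rule and all names below are illustrative assumptions for this sketch, not the cited paper's actual criterion.

```python
import numpy as np

# Toy sketch of an iterative "most informative location" selection loop.
# The acquisition rule (uncertainty minus a distance-based redundancy penalty)
# is an illustrative assumption, not the cited paper's method.

def suggest_locations(uncertainty: np.ndarray, n_points: int,
                      penalty_scale: float = 10.0):
    """Greedily pick locations with high uncertainty that are far from prior picks.

    uncertainty: 2-D map of per-pixel uncertainty (higher = less certain).
    """
    h, w = uncertainty.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    chosen = []
    for _ in range(n_points):
        score = uncertainty.copy()
        for cy, cx in chosen:
            # Penalize locations that are redundant with existing annotations.
            dist = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)
            score -= penalty_scale / (1.0 + dist)
        cy, cx = np.unravel_index(np.argmax(score), score.shape)
        chosen.append((int(cy), int(cx)))
    return chosen

# Usage with a random uncertainty map:
rng = np.random.default_rng(0)
print(suggest_locations(rng.random((64, 64)), n_points=3))
```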
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.