Prototypical Region Proposal Networks for Few-Shot Localization and
Classification
- URL: http://arxiv.org/abs/2104.03496v1
- Date: Thu, 8 Apr 2021 04:03:30 GMT
- Authors: Elliott Skomski, Aaron Tuor, Andrew Avila, Lauren Phillips, Zachary
New, Henry Kvinge, Courtney D. Corley, and Nathan Hodas
- Abstract summary: We develop a framework to unify segmentation and classification into an end-to-end classification model -- PRoPnet.
We empirically demonstrate that our methods improve accuracy on image datasets with natural scenes containing multiple object classes.
- Score: 1.5100087942838936
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently proposed few-shot image classification methods have generally
focused on use cases where the objects to be classified are the central subject
of images. Despite success on benchmark vision datasets aligned with this use
case, these methods typically fail on use cases involving densely-annotated,
busy images: images common in the wild where objects of relevance are not the
central subject, instead appearing potentially occluded, small, or among other
incidental objects belonging to other classes of potential interest. To
localize relevant objects, we employ a prototype-based few-shot segmentation
model which compares the encoded features of unlabeled query images with
support class centroids to produce region proposals indicating the presence and
location of support set classes in a query image. These region proposals are
then used as additional conditioning input to few-shot image classifiers. We
develop a framework to unify the two stages (segmentation and classification)
into an end-to-end classification model -- PRoPnet -- and empirically
demonstrate that our methods improve accuracy on image datasets with natural
scenes containing multiple object classes.
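The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the encoder, the similarity measure, or how proposals condition the classifier, so this sketch assumes masked average pooling for the class centroids, cosine similarity with a softmax over classes for the region proposals, and proposal-weighted pooling as the conditioning step. The function names (class_prototypes, region_proposals, classify_with_proposals) and the temperature parameter are hypothetical, and the feature maps are synthetic stand-ins for encoder output.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def class_prototypes(support_feats, support_masks):
    """Masked average pooling (assumed): one centroid per support class.

    support_feats: (C, K, H, W, D) encoded support images
    support_masks: (C, K, H, W) binary object masks
    returns: (C, D) class centroids
    """
    w = support_masks[..., None]
    summed = (support_feats * w).sum(axis=(1, 2, 3))
    count = w.sum(axis=(1, 2, 3))
    return summed / np.maximum(count, 1e-8)


def region_proposals(query_feats, prototypes, temperature=0.1):
    """Compare every query location with every centroid.

    query_feats: (H, W, D); prototypes: (C, D)
    returns: (C, H, W) soft maps, a softmax over classes at each location
    """
    sims = l2_normalize(query_feats) @ l2_normalize(prototypes).T  # (H, W, C)
    logits = sims / temperature
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=-1, keepdims=True)
    return np.moveaxis(probs, -1, 0)


def classify_with_proposals(query_feats, prototypes, proposals):
    """Condition classification on the proposals (assumed scheme):
    pool query features under each class map, then score the pooled
    vector against that class's centroid with cosine similarity.
    """
    weights = np.maximum(proposals.sum(axis=(1, 2)), 1e-8)[:, None]
    pooled = np.einsum("chw,hwd->cd", proposals, query_feats) / weights
    return (l2_normalize(pooled) * l2_normalize(prototypes)).sum(axis=-1)


# Synthetic example: 2 classes, 1 shot, 6x6 feature maps, 4-dim features.
C, K, H, W, D = 2, 1, 6, 6, 4
basis = np.eye(D)
background = basis[2]

support_feats = np.tile(background, (C, K, H, W, 1))
support_masks = np.zeros((C, K, H, W))
for c in range(C):
    support_feats[c, :, 2:4, 2:4] = basis[c]  # object occupies the center
    support_masks[c, :, 2:4, 2:4] = 1.0

# Query: mostly background, with a small class-1 object in one corner.
query_feats = np.tile(background, (H, W, 1))
query_feats[0:2, 0:2] = basis[1]

prototypes = class_prototypes(support_feats, support_masks)
proposals = region_proposals(query_feats, prototypes)
scores = classify_with_proposals(query_feats, prototypes, proposals)
print("predicted class:", int(np.argmax(scores)))
```

In this toy query the object covers only 4 of 36 locations, which is the off-center, small-object case the abstract targets; the proposal map concentrates weight on those locations before the class score is computed.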
Related papers
- Exploiting Unlabeled Data with Vision and Language Models for Object
Detection [64.94365501586118]
Building robust and generic object detection frameworks requires scaling to larger label spaces and bigger training datasets.
We propose a novel method that leverages the rich semantics available in recent vision and language models to localize and classify objects in unlabeled images.
We demonstrate the value of the generated pseudo labels in two specific tasks, open-vocabulary detection and semi-supervised object detection.
arXiv Detail & Related papers (2022-07-18T21:47:15Z)
- Rectifying the Shortcut Learning of Background: Shared Object
Concentration for Few-Shot Image Recognition [101.59989523028264]
Few-Shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks.
We propose COSOC, a novel Few-Shot Learning framework, to automatically identify foreground objects at both the pretraining and evaluation stages.
arXiv Detail & Related papers (2021-07-16T07:46:41Z)
- Improving Few-shot Learning with Weakly-supervised Object Localization [24.3569501375842]
We propose a novel framework that generates class representations by extracting features from class-relevant regions of the images.
Our method outperforms the baseline few-shot model on the miniImageNet and tieredImageNet benchmarks.
arXiv Detail & Related papers (2021-05-25T07:39:32Z)
- Sketch-Guided Object Localization in Natural Images [16.982683600384277]
We introduce the novel problem of localizing all instances of an object (seen or unseen during training) in a natural image via sketch query.
We propose a novel cross-modal attention scheme that guides the region proposal network (RPN) to generate object proposals relevant to the sketch query.
Our method is effective with as little as a single sketch query.
arXiv Detail & Related papers (2020-08-14T19:35:56Z)
- Weakly-Supervised Semantic Segmentation via Sub-category Exploration [73.03956876752868]
We propose a simple yet effective approach that forces the network to pay attention to other parts of an object.
Specifically, we perform clustering on image features to generate pseudo sub-category labels within each annotated parent class.
We conduct extensive analysis to validate the proposed method and show that our approach performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2020-08-03T20:48:31Z)
- Part-aware Prototype Network for Few-shot Semantic Segmentation [50.581647306020095]
We propose a novel few-shot semantic segmentation framework based on the prototype representation.
Our key idea is to decompose the holistic class representation into a set of part-aware prototypes.
We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes.
arXiv Detail & Related papers (2020-07-13T11:03:09Z)
- Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation [128.03739769844736]
Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences.
In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference.
Our algorithm sets a new state of the art in all these settings, demonstrating its efficacy and generalizability.
arXiv Detail & Related papers (2020-07-03T21:53:46Z)
- A Few-Shot Sequential Approach for Object Counting [63.82757025821265]
We introduce a class attention mechanism that sequentially attends to objects in the image and extracts their relevant features.
The proposed technique is trained on point-level annotations and uses a novel loss function that disentangles class-dependent and class-agnostic aspects of the model.
We present our results on a variety of object-counting/detection datasets, including FSOD and MS COCO.
arXiv Detail & Related papers (2020-07-03T18:23:39Z)
- Weakly-supervised Object Localization for Few-shot Learning and
Fine-grained Few-shot Learning [0.5156484100374058]
Few-shot learning aims to learn novel visual categories from very few samples.
We propose a Self-Attention Based Complementary Module (SAC Module) to perform weakly-supervised object localization.
We also produce the activated masks for selecting discriminative deep descriptors for few-shot classification.
arXiv Detail & Related papers (2020-03-02T14:07:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.