OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features
- URL: http://arxiv.org/abs/2003.06800v2
- Date: Wed, 19 Aug 2020 13:59:00 GMT
- Title: OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features
- Authors: Anton Osokin, Denis Sumin, Vasily Lomakin
- Abstract summary: One-shot object detection consists in detecting objects defined by a single demonstration.
We build a one-stage system that performs localization and recognition jointly.
Experimental evaluation on several challenging domains shows that our method can detect unseen classes.
- Score: 14.115782214599015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider the task of one-shot object detection, which
consists in detecting objects defined by a single demonstration. Unlike
standard object detection, the classes of objects used for training and
testing do not overlap. We build a one-stage system that performs
localization and recognition jointly. We use dense correlation matching of
learned local features to find correspondences, a feed-forward geometric
transformation model to align features and bilinear resampling of the
correlation tensor to compute the detection score of the aligned features. All
the components are differentiable, which allows end-to-end training.
Experimental evaluation on several challenging domains (retail products, 3D
objects, buildings and logos) shows that our method can detect unseen classes
(e.g., toothpaste when trained on groceries) and outperforms several baselines
by a significant margin. Our code is available online:
https://github.com/aosokin/os2d .
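Below is a minimal, illustrative PyTorch sketch of the matching pipeline the abstract describes: a dense correlation tensor between learned local features of the input and class images, bilinear resampling of that tensor along a transformation grid, and pooling of the aligned correlations into a detection score. The function names, tensor shapes, the pooling choice, and the identity grid standing in for the predicted geometric transformation are assumptions made for this example, not the actual OS2D implementation; see the repository linked above for the real code.

```python
# Illustrative sketch only (not the OS2D implementation): dense correlation of
# local features, bilinear resampling of the correlation tensor with a sampling
# grid, and pooling of the aligned correlations into a detection score.
import torch
import torch.nn.functional as F


def correlation_tensor(input_feats: torch.Tensor, class_feats: torch.Tensor) -> torch.Tensor:
    """Correlate every local feature of the class image with every location
    of the input image.

    input_feats: (B, C, H, W) feature map of the input image
    class_feats: (C, h, w)    feature map of the class (query) image
    returns:     (B, h*w, H, W) correlation tensor
    """
    input_feats = F.normalize(input_feats, dim=1)   # L2-normalize local features
    class_feats = F.normalize(class_feats, dim=0)
    kernel = class_feats.flatten(1).t()             # (h*w, C)
    return torch.einsum("bchw,kc->bkhw", input_feats, kernel)


def aligned_score(corr: torch.Tensor, grid: torch.Tensor) -> torch.Tensor:
    """Bilinearly resample the correlation tensor along a sampling grid
    (in the paper, produced by a feed-forward transformation model; assumed
    given here) and average the aligned correlations into per-location scores.

    corr: (B, h*w, H, W) correlation tensor
    grid: (B, H_out, W_out, 2) normalized sampling coordinates in [-1, 1]
    """
    aligned = F.grid_sample(corr, grid, mode="bilinear", align_corners=False)
    return aligned.mean(dim=1)                      # (B, H_out, W_out)


if __name__ == "__main__":
    # Toy shapes only; in the paper both feature maps come from a shared backbone.
    img = torch.randn(1, 64, 60, 80)     # input-image features
    cls = torch.randn(64, 15, 15)        # class-image features
    corr = correlation_tensor(img, cls)  # (1, 225, 60, 80)
    # Identity grid as a stand-in for the predicted geometric transformation.
    theta = torch.eye(2, 3).unsqueeze(0)
    grid = F.affine_grid(theta, size=(1, 1, 60, 80), align_corners=False)
    score = aligned_score(corr, grid)    # (1, 60, 80)
    print(score.shape)
```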
Related papers
- PatchContrast: Self-Supervised Pre-training for 3D Object Detection [14.603858163158625]
We introduce PatchContrast, a novel self-supervised point cloud pre-training framework for 3D object detection.
We show that our method outperforms existing state-of-the-art models on three commonly-used 3D detection datasets.
arXiv Detail & Related papers (2023-08-14T07:45:54Z)
- On Hyperbolic Embeddings in 2D Object Detection [76.12912000278322]
We study whether a hyperbolic geometry better matches the underlying structure of the object classification space.
We incorporate a hyperbolic classifier in two-stage, keypoint-based, and transformer-based object detection architectures.
We observe categorical class hierarchies emerging in the structure of the classification space, resulting in lower classification errors and boosting the overall object detection performance.
arXiv Detail & Related papers (2022-03-15T16:43:40Z)
- Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
arXiv Detail & Related papers (2021-12-21T17:10:21Z)
- Multi-patch Feature Pyramid Network for Weakly Supervised Object Detection in Optical Remote Sensing Images [39.25541709228373]
We propose a new architecture for object detection with a multiple patch feature pyramid network (MPFP-Net)
MPFP-Net differs from current models, which during training pursue only the most discriminative patches.
We introduce an effective method to regularize the residual values and make the fusion transition layers strictly norm-preserving.
arXiv Detail & Related papers (2021-08-18T09:25:39Z)
- Pretrained equivariant features improve unsupervised landmark discovery [69.02115180674885]
We formulate a two-step unsupervised approach that overcomes this challenge by first learning powerful pixel-based features.
Our method produces state-of-the-art results in several challenging landmark detection datasets.
arXiv Detail & Related papers (2021-04-07T05:42:11Z)
- Modulating Localization and Classification for Harmonized Object Detection [40.82723262074911]
We propose a mutual learning framework to modulate the two tasks.
In particular, the two tasks are forced to learn from each other with a novel mutual labeling strategy.
We achieve a significant performance gain over the baseline detectors on the COCO dataset.
arXiv Detail & Related papers (2021-03-16T10:36:02Z)
- Unsupervised Part Discovery via Feature Alignment [15.67978793872039]
We exploit the property that neural network features are largely invariant to nuisance variables.
We find a set of similar images that show instances of the same object category in the same pose, through an affine alignment of their corresponding feature maps.
During inference, part detection is simple and fast, without any extra modules or overheads other than a feed-forward neural network.
arXiv Detail & Related papers (2020-12-01T07:25:00Z)
- Synthesizing the Unseen for Zero-shot Object Detection [72.38031440014463]
We propose to synthesize visual features for unseen classes, so that the model learns both seen and unseen objects in the visual domain.
We use a novel generative model that uses class-semantics to not only generate the features but also to discriminatively separate them.
arXiv Detail & Related papers (2020-10-19T12:36:11Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- Attention-based Joint Detection of Object and Semantic Part [4.389917490809522]
Our model is built on top of two Faster-RCNN models that share their features to obtain enhanced representations of both objects and parts.
Experiments on the PASCAL-Part 2010 dataset show that joint detection can simultaneously improve both object detection and part detection.
arXiv Detail & Related papers (2020-07-05T18:54:10Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)