Closing the Generalization Gap in One-Shot Object Detection
- URL: http://arxiv.org/abs/2011.04267v1
- Date: Mon, 9 Nov 2020 09:31:17 GMT
- Title: Closing the Generalization Gap in One-Shot Object Detection
- Authors: Claudio Michaelis, Matthias Bethge, Alexander S. Ecker
- Abstract summary: We show that the key to strong few-shot detection models may not lie in sophisticated metric learning approaches, but instead in scaling the number of categories.
Future data annotation efforts should therefore focus on wider datasets and annotate a larger number of categories.
- Score: 92.82028853413516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite substantial progress in object detection and few-shot learning,
detecting objects based on a single example - one-shot object detection -
remains a challenge: trained models exhibit a substantial generalization gap,
where object categories used during training are detected much more reliably
than novel ones. Here we show that this generalization gap can be nearly closed
by increasing the number of object categories used during training. Our results
show that the models switch from memorizing individual categories to learning
object similarity over the category distribution, enabling strong
generalization at test time. Importantly, in this regime standard methods to
improve object detection models like stronger backbones or longer training
schedules also benefit novel categories, which was not the case for smaller
datasets like COCO. Our results suggest that the key to strong few-shot
detection models may not lie in sophisticated metric learning approaches, but
instead in scaling the number of categories. Future data annotation efforts
should therefore focus on wider datasets and annotate a larger number of
categories rather than gathering more images or instances per category.
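As a rough illustration of the generalization gap described in the abstract, the sketch below compares detection quality on training categories with quality on novel categories. The function name, the choice of average precision (AP) as the metric, and all numbers are assumptions made for illustration. They are not the paper's evaluation protocol or its reported results.

```python
def generalization_gap(ap_known: float, ap_novel: float) -> dict:
    """Compare AP on training (known) categories with AP on novel ones.

    Both metric definitions are illustrative assumptions, not the
    paper's exact protocol.
    """
    if ap_known <= 0:
        raise ValueError("ap_known must be positive")
    return {
        # Absolute difference in AP points.
        "absolute_gap": ap_known - ap_novel,
        # Fraction of known-category performance reached on novel categories.
        "relative_performance": ap_novel / ap_known,
    }


# Hypothetical values only: a model trained on few categories versus
# one trained on many categories.
few_categories = generalization_gap(ap_known=40.0, ap_novel=20.0)
many_categories = generalization_gap(ap_known=40.0, ap_novel=36.0)

print(few_categories)
print(many_categories)
```

Under these made-up numbers, the gap is "nearly closed" in the second case because the relative performance approaches 1 while the absolute gap shrinks toward 0.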
Related papers
- Object Occlusion of Adding New Categories in Objection Detection [4.014524824655107]
Building instance detection models that are data efficient and can handle rare object categories is an important challenge in computer vision.
Here, we perform a systematic study of the Object Occlusion data collection and augmentation methods.
We illustrate that adding only 15 images of a new category to a half-million-image training dataset with hundreds of categories can give this new category 95% accuracy on an unseen test dataset.
arXiv Detail & Related papers (2022-06-12T13:02:23Z)
- Few-Shot Object Detection: A Survey [4.266990593059534]
Few-shot object detection aims to learn from few object instances of new categories in the target domain.
We categorize approaches according to their training scheme and architectural layout.
We introduce commonly used datasets and their evaluation protocols and analyze reported benchmark results.
arXiv Detail & Related papers (2021-12-22T07:08:53Z)
- Bridging Non Co-occurrence with Unlabeled In-the-wild Data for Incremental Object Detection [56.22467011292147]
Several incremental learning methods are proposed to mitigate catastrophic forgetting for object detection.
Despite the effectiveness, these methods require co-occurrence of the unlabeled base classes in the training data of the novel classes.
We propose the use of unlabeled in-the-wild data to bridge the non co-occurrence caused by the missing base classes during the training of additional novel classes.
arXiv Detail & Related papers (2021-10-28T10:57:25Z)
- Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers [11.408339220607251]
Empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning.
Our main goal is to investigate how the amount of pre-training data affects the few-shot generalization performance of standard image classifiers.
arXiv Detail & Related papers (2021-10-13T19:07:01Z)
- On Model Calibration for Long-Tailed Object Detection and Instance Segmentation [56.82077636126353]
We propose NorCal, Normalized Calibration for long-tailed object detection and instance segmentation.
We show that separately handling the background class and normalizing the scores over classes for each proposal are keys to achieving superior performance.
arXiv Detail & Related papers (2021-07-05T17:57:20Z)
- Towards A Category-extended Object Detector without Relabeling or Conflicts [40.714221493482974]
In this paper, we aim at learning a strong unified detector that can handle all categories based on the limited datasets without extra manual labor.
We propose a practical framework which focuses on three aspects: better base model, better unlabeled ground-truth mining strategy and better retraining method with pseudo annotations.
arXiv Detail & Related papers (2020-12-28T06:44:53Z)
- Few-Shot Object Detection via Knowledge Transfer [21.3564383157159]
Conventional methods for object detection usually require substantial amounts of training data and annotated bounding boxes.
In this paper, we introduce few-shot object detection via knowledge transfer, which aims to detect objects from a few training examples.
arXiv Detail & Related papers (2020-08-28T06:35:27Z)
- UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
- StarNet: towards Weakly Supervised Few-Shot Object Detection [87.80771067891418]
We introduce StarNet - a few-shot model featuring an end-to-end differentiable non-parametric star-model detection and classification head.
Through this head, the backbone is meta-trained using only image-level labels to produce good features for jointly localizing and classifying previously unseen categories of few-shot test tasks.
Being a few-shot detector, StarNet does not require any bounding box annotations, neither during pre-training nor for novel classes adaptation.
arXiv Detail & Related papers (2020-03-15T11:35:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.