UnseenNet: Fast Training Detector for Any Unseen Concept
- URL: http://arxiv.org/abs/2203.08759v1
- Date: Wed, 16 Mar 2022 17:17:10 GMT
- Title: UnseenNet: Fast Training Detector for Any Unseen Concept
- Authors: Asra Aslam and Edward Curry
- Abstract summary: An "Unseen Class Detector" can be trained within a very short time for any possible unseen class, without bounding boxes, with competitive accuracy.
Our model (UnseenNet) is trained on the ImageNet classification dataset for unseen classes and tested on an object detection dataset (OpenImages).
- Score: 6.802401545890963
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training of object detection models using less data is currently the focus of
existing N-shot learning models in computer vision. Such methods use
object-level labels and take hours to train on unseen classes. In many cases, a
large amount of image-level labels is available for training but cannot be
utilized by few-shot object detection models. There is a need for a machine
learning framework that can be trained on any unseen class and become useful in
real-time situations. In this paper, we propose an "Unseen Class Detector" that
can be trained within a very short time for any possible unseen class, without
bounding boxes, with competitive
accuracy. We build our approach on "Strong" and "Weak" baseline detectors,
which we trained on existing object detection and image classification
datasets, respectively. Unseen concepts are fine-tuned on the strong baseline
detector using only image-level labels and further adapted by transferring the
classifier-detector knowledge between baselines. We use semantic as well as
visual similarities to identify the source class (e.g., Sheep) for the
fine-tuning and adaptation of an unseen class (e.g., Goat). Our model (UnseenNet)
is trained on the ImageNet classification dataset for unseen classes and tested
on an object detection dataset (OpenImages). UnseenNet improves the mean
average precision (mAP) by 10% to 30% over existing semi-supervised and
few-shot object detection baselines on different unseen-class splits. Moreover,
the training time of our model is under 10 minutes for each unseen class.
Qualitative results demonstrate that UnseenNet is suitable not only for the few
classes of Pascal VOC but for unseen classes of any dataset or the web. Code is
available at
https://github.com/Asra-Aslam/UnseenNet.
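The abstract describes two mechanical steps: choosing a semantically similar source class for each unseen class, and warm-starting the unseen class's detection head from it before fine-tuning with image-level labels. The sketch below illustrates those two steps; it is not the authors' implementation (see the linked repository for that), and the embedding lookup, function names, and weight layout are all assumptions for illustration.

```python
# Hypothetical sketch of UnseenNet-style source-class selection: pick the
# seen class whose name embedding is closest to the unseen class, then
# copy its detection-head weights as the starting point for fine-tuning.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def pick_source_class(unseen: str, seen_classes: list, embed: dict) -> str:
    """Return the seen class semantically closest to `unseen`
    (the paper also folds in visual similarity at this step)."""
    return max(seen_classes, key=lambda c: cosine(embed[unseen], embed[c]))

def init_unseen_head(head_weights: dict, source: str, unseen: str) -> None:
    """Warm-start the unseen class's classifier/box head from the source
    class so image-level fine-tuning converges in minutes, not hours."""
    head_weights[unseen] = head_weights[source].copy()

# Usage: with GloVe-style word vectors, "Goat" would typically select
# "Sheep" as its source class, matching the abstract's example.
# src = pick_source_class("Goat", seen_classes, embed)
# init_unseen_head(head_weights, src, "Goat")
```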
Related papers
- A High-Resolution Dataset for Instance Detection with Multi-View Instance Capture [15.298790238028356]
Instance detection (InsDet) is a long-lasting problem in robotics and computer vision.
Current InsDet datasets are too small in scale by today's standards.
We introduce a new InsDet dataset and protocol.
arXiv Detail & Related papers (2023-10-30T03:58:41Z)
- Image-free Classifier Injection for Zero-Shot Classification [72.66409483088995]
Zero-shot learning models achieve remarkable results on image classification for samples from classes that were not seen during training.
We aim to equip pre-trained models with zero-shot classification capabilities without the use of image data.
We achieve this with our proposed Image-free Classifier Injection with Semantics (ICIS).
arXiv Detail & Related papers (2023-08-21T09:56:48Z)
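A minimal sketch of the idea summarized in the entry above, with all dimensions, names, and the training loop being illustrative assumptions rather than the paper's actual code: learn a decoder from class-name embeddings to classifier weights on the seen classes, then predict ("inject") weights for unseen classes without touching any images.

```python
# Hypothetical sketch: fit a decoder that maps class-name embeddings to
# classifier weights using seen classes as supervision, then inject
# predicted weights for unseen classes -- no image data required.
import torch
import torch.nn as nn

EMB_DIM, FEAT_DIM = 300, 512                # assumed dimensions
decoder = nn.Linear(EMB_DIM, FEAT_DIM)      # embedding -> weight vector
opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)

def fit_decoder(seen_embs: torch.Tensor, seen_weights: torch.Tensor) -> None:
    """Train the decoder to reproduce the pre-trained classifier weights
    of seen classes from their name embeddings."""
    for _ in range(1000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(decoder(seen_embs), seen_weights)
        loss.backward()
        opt.step()

def inject_classifier(unseen_emb: torch.Tensor) -> torch.Tensor:
    """Predict a classifier weight vector for an unseen class."""
    with torch.no_grad():
        return decoder(unseen_emb)
```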
- Single Image Object Counting and Localizing using Active-Learning [4.56877715768796]
We present a new method for counting and localizing repeating objects in single-image scenarios.
Our method trains a CNN over a small set of labels carefully collected from the input image in a few active-learning iterations.
Compared with existing user-assisted counting methods, our active-learning iterations achieve state-of-the-art performance in terms of counting and localization accuracy, number of user mouse clicks, and running time.
arXiv Detail & Related papers (2021-11-16T11:29:21Z)
- Experience feedback using Representation Learning for Few-Shot Object Detection on Aerial Images [2.8560476609689185]
The performance of our method is assessed on DOTA, a large-scale remote sensing images dataset.
It highlights in particular some intrinsic weaknesses of the few-shot object detection task.
arXiv Detail & Related papers (2021-09-27T13:04:53Z)
- Rectifying the Shortcut Learning of Background: Shared Object Concentration for Few-Shot Image Recognition [101.59989523028264]
Few-shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks.
We propose COSOC, a novel few-shot learning framework, to automatically figure out foreground objects at both the pretraining and evaluation stages.
arXiv Detail & Related papers (2021-07-16T07:46:41Z)
- Learning Transferable Visual Models From Natural Language Supervision [13.866297967166089]
Learning directly from raw text about images is a promising alternative.
We demonstrate that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn.
SOTA image representations are learned from scratch on a dataset of 400 million (image, text) pairs collected from the internet.
arXiv Detail & Related papers (2021-02-26T19:04:58Z)
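The caption-image pairing task mentioned in the entry above can be written down compactly. The following is a generic sketch of a CLIP-style symmetric contrastive objective, not the paper's code; the temperature value and tensor shapes are assumptions.

```python
# Sketch of a CLIP-style objective: row i of img_emb and txt_emb form a
# matched (image, caption) pair; every other pairing is a negative.
import torch
import torch.nn.functional as F

def clip_loss(img_emb: torch.Tensor, txt_emb: torch.Tensor,
              temperature: float = 0.07) -> torch.Tensor:
    """img_emb, txt_emb: (batch, dim) embeddings from the two encoders."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature  # pairwise similarities
    targets = torch.arange(logits.size(0))        # diagonal = true pairs
    # Symmetric cross-entropy: image->text over rows, text->image over columns.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2
```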
- Instance Localization for Self-supervised Detection Pretraining [68.24102560821623]
We propose a new self-supervised pretext task, called instance localization.
We show that integration of bounding boxes into pretraining promotes better task alignment and architecture alignment for transfer learning.
Experimental results demonstrate that our approach yields state-of-the-art transfer learning results for object detection.
arXiv Detail & Related papers (2021-02-16T17:58:57Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
- Self-Supervised Viewpoint Learning From Image Collections [116.56304441362994]
We propose a novel learning framework which incorporates an analysis-by-synthesis paradigm to reconstruct images in a viewpoint aware manner.
We show that our approach performs competitively with fully-supervised approaches for several object categories like human faces, cars, buses, and trains.
arXiv Detail & Related papers (2020-04-03T22:01:41Z)
- StarNet: towards Weakly Supervised Few-Shot Object Detection [87.80771067891418]
We introduce StarNet - a few-shot model featuring an end-to-end differentiable non-parametric star-model detection and classification head.
Through this head, the backbone is meta-trained using only image-level labels to produce good features for jointly localizing and classifying previously unseen categories of few-shot test tasks.
Being a few-shot detector, StarNet does not require any bounding box annotations, either during pre-training or for novel-class adaptation.
arXiv Detail & Related papers (2020-03-15T11:35:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.