Towards Few-Annotation Learning in Computer Vision: Application to Image
Classification and Object Detection tasks
- URL: http://arxiv.org/abs/2311.04888v1
- Date: Wed, 8 Nov 2023 18:50:04 GMT
- Title: Towards Few-Annotation Learning in Computer Vision: Application to Image
Classification and Object Detection tasks
- Authors: Quentin Bouniot
- Abstract summary: In this thesis, we develop theoretical, algorithmic and experimental contributions for Machine Learning with limited labels.
In a first contribution, we are interested in bridging the gap between theory and practice for popular Meta-Learning algorithms used in Few-Shot Classification.
To leverage unlabeled data when training object detectors based on the Transformer architecture, we propose both an unsupervised pretraining and a semi-supervised learning method.
- Score: 3.5353632767823506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this thesis, we develop theoretical, algorithmic and experimental
contributions for Machine Learning with limited labels, and more specifically
for the tasks of Image Classification and Object Detection in Computer Vision.
In a first contribution, we are interested in bridging the gap between theory
and practice for popular Meta-Learning algorithms used in Few-Shot
Classification. We make connections to Multi-Task Representation Learning,
which benefits from solid theoretical foundations, to verify the best
conditions for a more efficient meta-learning. Then, to leverage unlabeled data
when training object detectors based on the Transformer architecture, we
propose both an unsupervised pretraining and a semi-supervised learning method
in two other separate contributions. For pretraining, we improve Contrastive
Learning for object detectors by introducing the localization information.
Finally, our semi-supervised method is the first tailored to transformer-based
detectors.
Related papers
- Pre-training Multi-task Contrastive Learning Models for Scientific
Literature Understanding [52.723297744257536]
Pre-trained language models (LMs) have shown effectiveness in scientific literature understanding tasks.
We propose a multi-task contrastive learning framework, SciMult, to facilitate common knowledge sharing across different literature understanding tasks.
arXiv Detail & Related papers (2023-05-23T16:47:22Z) - A Unified Framework with Meta-dropout for Few-shot Learning [25.55782263169028]
In this paper, we utilize the idea of meta-learning to explain two very different streams of few-shot learning.
We propose a simple yet effective strategy named meta-dropout, which is applied to the transferable knowledge generalized from base categories to novel categories.
arXiv Detail & Related papers (2022-10-12T17:05:06Z) - CoDo: Contrastive Learning with Downstream Background Invariance for
Detection [10.608660802917214]
We propose a novel object-level self-supervised learning method, called Contrastive learning with Downstream background invariance (CoDo)
The pretext task is converted to focus on instance location modeling for various backgrounds, especially for downstream datasets.
Experiments on MSCOCO demonstrate that the proposed CoDo with common backbones, ResNet50-FPN, yields strong transfer learning results for object detection.
arXiv Detail & Related papers (2022-05-10T01:26:15Z) - Activation to Saliency: Forming High-Quality Labels for Unsupervised
Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework reports significant performance compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z) - Joint Inductive and Transductive Learning for Video Object Segmentation [107.32760625159301]
Semi-supervised object segmentation is a task of segmenting the target object in a video sequence given only a mask in the first frame.
Most previous best-performing methods adopt matching-based transductive reasoning or online inductive learning.
We propose to integrate transductive and inductive learning into a unified framework to exploit complement between them for accurate and robust video object segmentation.
arXiv Detail & Related papers (2021-08-08T16:25:48Z) - Aligning Pretraining for Detection via Object-Level Contrastive Learning [57.845286545603415]
Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning.
We argue that this could be sub-optimal and thus advocate a design principle which encourages alignment between the self-supervised pretext task and the downstream task.
Our method, called Selective Object COntrastive learning (SoCo), achieves state-of-the-art results for transfer performance on COCO detection.
arXiv Detail & Related papers (2021-06-04T17:59:52Z) - Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z) - Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z) - Building Robust Industrial Applicable Object Detection Models Using
Transfer Learning and Single Pass Deep Learning Architectures [1.1816942730023883]
We explore how deep convolutional neural networks dedicated to the task of object detection can improve our industrial-oriented object detection pipelines.
By using a deep learning architecture that integrates region proposals, classification and probability estimation in a single run, we aim at obtaining real-time performance.
We apply these algorithms to two industrially relevant applications, one being the detection of promotion boards in eye tracking data and the other detecting and recognizing packages of warehouse products for augmented advertisements.
arXiv Detail & Related papers (2020-07-09T09:50:45Z) - Interpolation-based semi-supervised learning for object detection [44.37685664440632]
We propose an Interpolation-based Semi-supervised learning method for object detection.
The proposed losses dramatically improve the performance of semi-supervised learning as well as supervised learning.
arXiv Detail & Related papers (2020-06-03T10:53:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.