VIPriors 3: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
- URL: http://arxiv.org/abs/2305.19688v1
- Date: Wed, 31 May 2023 09:31:54 GMT
- Title: VIPriors 3: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
- Authors: Robert-Jan Bruintjes, Attila Lengyel, Marcos Baptista Rios, Osman Semih Kayhan, Davide Zambrano, Nergis Tomen and Jan van Gemert
- Abstract summary: Third edition of the "VIPriors: Visual Inductive Priors for Data-Efficient Deep Learning" workshop featured four data-impaired challenges.
Challenges focused on addressing the limitations of data availability in training deep learning models for computer vision tasks.
- Score: 13.085098213230568
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The third edition of the "VIPriors: Visual Inductive Priors for
Data-Efficient Deep Learning" workshop featured four data-impaired challenges,
focusing on addressing the limitations of data availability in training deep
learning models for computer vision tasks. The challenges comprised four
distinct data-impaired tasks, where participants were required to train models
from scratch using a reduced number of training samples. The primary objective
was to encourage novel approaches that incorporate relevant inductive biases to
enhance the data efficiency of deep learning models. To foster creativity and
exploration, participants were strictly prohibited from utilizing pre-trained
checkpoints and other transfer learning techniques. Significant advancements
were made compared to the provided baselines, where winning solutions surpassed
the baselines by a considerable margin in all four tasks. These achievements
were primarily attributed to the effective utilization of extensive data
augmentation policies, model ensembling techniques, and the implementation of
data-efficient training methods, including self-supervised representation
learning. This report highlights the key aspects of the challenges and their
outcomes.
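The abstract credits the winning solutions largely to data augmentation and model ensembling. As a minimal illustration of the latter, the sketch below averages per-class probabilities across several models' outputs and picks the argmax; the logits and the three-class setup are hypothetical, not taken from any challenge entry.

```python
import math

def softmax(logits):
    """Convert raw logits to class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predict(per_model_logits):
    """Average class probabilities across models; return the argmax class index."""
    probs = [softmax(logits) for logits in per_model_logits]
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])

# Three hypothetical models scoring the same image over three classes:
logits = [
    [2.0, 1.0, 0.1],  # model A favors class 0
    [0.5, 2.5, 0.2],  # model B favors class 1
    [2.2, 0.3, 0.4],  # model C favors class 0
]
print(ensemble_predict(logits))  # the probability average favors class 0
```

Averaging probabilities (rather than taking a majority vote over hard labels) lets confident models outweigh uncertain ones, which is one common reason soft ensembling helps in low-data regimes.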
Related papers
- Multi-Stage Knowledge Integration of Vision-Language Models for Continual Learning [79.46570165281084]
We propose a Multi-Stage Knowledge Integration network (MulKI) to emulate the human learning process in distillation methods.
MulKI achieves this through four stages, including Eliciting Ideas, Adding New Ideas, Distinguishing Ideas, and Making Connections.
Our method demonstrates significant improvements in maintaining zero-shot capabilities while supporting continual learning across diverse downstream tasks.
arXiv Detail & Related papers (2024-11-11T07:36:19Z)
- VIPriors 4: Visual Inductive Priors for Data-Efficient Deep Learning Challenges [12.615348941903594]
Fourth edition of the "VIPriors: Visual Inductive Priors for Data-Efficient Deep Learning" workshop features two data-impaired challenges.
These challenges address the problem of training deep learning models for computer vision tasks with limited data.
We aim to stimulate the development of novel approaches that incorporate inductive biases to improve the data efficiency of deep learning models.
arXiv Detail & Related papers (2024-06-26T08:50:51Z)
- Less is More: High-value Data Selection for Visual Instruction Tuning [127.38740043393527]
We propose a high-value data selection approach TIVE, to eliminate redundancy within the visual instruction data and reduce the training cost.
Our approach using only about 15% data can achieve comparable average performance to the full-data fine-tuned model across eight benchmarks.
arXiv Detail & Related papers (2024-03-14T16:47:25Z)
- SPOT: Scalable 3D Pre-training via Occupancy Prediction for Learning Transferable 3D Representations [76.45009891152178]
The pretraining-finetuning approach can alleviate the labeling burden by fine-tuning a pre-trained backbone across various downstream datasets and tasks.
We show, for the first time, that general representation learning can be achieved through the task of occupancy prediction.
Our findings will facilitate the understanding of LiDAR points and pave the way for future advancements in LiDAR pre-training.
arXiv Detail & Related papers (2023-09-19T11:13:01Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- On Efficient Training of Large-Scale Deep Learning Models: A Literature Review [90.87691246153612]
The field of deep learning has witnessed significant progress, particularly in computer vision (CV), natural language processing (NLP), and speech.
The use of large-scale models trained on vast amounts of data holds immense promise for practical applications.
With the increasing demands on computational capacity, a comprehensive survey of techniques for accelerating the training of deep learning models is still much anticipated.
arXiv Detail & Related papers (2023-04-07T11:13:23Z)
- VIPriors 2: Visual Inductive Priors for Data-Efficient Deep Learning Challenges [13.085098213230568]
Second edition of "VIPriors: Visual Inductive Priors for Data-Efficient Deep Learning" challenges.
Models are trained from scratch on a reduced number of training samples for various key computer vision tasks.
Results: The provided baselines are outperformed by a large margin in all five challenges.
arXiv Detail & Related papers (2022-01-21T10:20:52Z)
- Self-Supervised Representation Learning: Introduction, Advances and Challenges [125.38214493654534]
Self-supervised representation learning methods aim to provide powerful deep feature learning without the requirement of large annotated datasets.
This article introduces this vibrant area including key concepts, the four main families of approach and associated state of the art, and how self-supervised methods are applied to diverse modalities of data.
arXiv Detail & Related papers (2021-10-18T13:51:22Z)
- VIPriors 1: Visual Inductive Priors for Data-Efficient Deep Learning Challenges [8.50468505606714]
We offer four data-impaired challenges, where models are trained from scratch, and we reduce the number of training samples to a fraction of the full set.
To encourage data efficient solutions, we prohibited the use of pre-trained models and other transfer learning techniques.
arXiv Detail & Related papers (2021-03-05T15:58:17Z)
- Data-Efficient Deep Learning Method for Image Classification Using Data Augmentation, Focal Cosine Loss, and Ensemble [9.55617552923003]
It is important to leverage small datasets effectively to achieve better performance.
With these methods, we obtain high accuracy on ImageNet data consisting of only 50 images per class.
Our model ranked 4th in the Visual Inductive Priors for Data-Efficient Computer Vision Challenge.
arXiv Detail & Related papers (2020-07-15T16:30:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.