Automated Progressive Learning for Efficient Training of Vision
Transformers
- URL: http://arxiv.org/abs/2203.14509v1
- Date: Mon, 28 Mar 2022 05:37:08 GMT
- Title: Automated Progressive Learning for Efficient Training of Vision
Transformers
- Authors: Changlin Li, Bohan Zhuang, Guangrun Wang, Xiaodan Liang, Xiaojun
Chang, Yi Yang
- Abstract summary: Vision Transformers (ViTs) have come with a voracious appetite for computing power, highlighting the urgent need to develop efficient training methods for ViTs.
Progressive learning, a training scheme where the model capacity grows progressively during training, has begun to show its promise for efficient training.
In this paper, we take a practical step towards efficient training of ViTs by customizing and automating progressive learning.
- Score: 125.22744987949227
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advances in vision Transformers (ViTs) have come with a voracious
appetite for computing power, highlighting the urgent need to develop
efficient training methods for ViTs. Progressive learning, a training scheme
where the model capacity grows progressively during training, has begun to
show its promise for efficient training. In this paper, we take a practical
step towards efficient training of ViTs by customizing and automating
progressive learning. First, we develop a strong manual baseline for
progressive learning of ViTs, by introducing momentum growth (MoGrow) to bridge
the gap brought by model growth. Then, we propose automated progressive
learning (AutoProg), an efficient training scheme that aims to achieve lossless
acceleration by automatically increasing the training workload on-the-fly; this
is achieved by adaptively deciding whether, where, and how much the model should
grow during progressive learning. Specifically, we first relax the optimization
of the growth schedule to a sub-network architecture optimization problem, then
propose one-shot estimation of the sub-network performance via an elastic
supernet. The search overhead is kept minimal by recycling the parameters of
the supernet. Extensive efficient-training experiments on
ImageNet with two representative ViT models, DeiT and VOLO, demonstrate that
AutoProg can accelerate ViT training by up to 85.1% with no performance drop.
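The abstract does not spell out how MoGrow bridges the gap brought by model growth, so the sketch below shows one plausible reading: an exponential-moving-average ("momentum") copy of the model is maintained during training, and newly added blocks are initialized from it rather than from random weights. The function names, the EMA coefficient, and the choice to copy the last block are illustrative assumptions, not the paper's exact procedure.

import copy
import torch
import torch.nn as nn

def update_momentum_copy(model: nn.Module, ema_model: nn.Module, m: float = 0.999) -> None:
    # Keep an exponential-moving-average ("momentum") copy of the training weights.
    with torch.no_grad():
        for p, p_ema in zip(model.parameters(), ema_model.parameters()):
            p_ema.mul_(m).add_(p, alpha=1.0 - m)

def grow_depth_with_momentum(blocks: nn.ModuleList,
                             ema_blocks: nn.ModuleList,
                             num_new: int) -> nn.ModuleList:
    # Hypothetical growth step: append `num_new` transformer blocks initialized
    # from the momentum copy of the last existing block, so the grown model
    # starts close to the behaviour of the smaller model it replaces.
    grown = nn.ModuleList(list(blocks))
    for _ in range(num_new):
        grown.append(copy.deepcopy(ema_blocks[-1]))
    return grown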
Code: https://github.com/changlin31/AutoProg
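As a rough illustration of the "whether, where and how much to grow" decision, the sketch below ranks candidate sub-networks of an elastic supernet with a one-shot estimate on a held-out batch, reusing the shared supernet weights so the search itself adds little overhead. The candidate grid, the elastic forward signature (depth and token-count arguments), and the accuracy-based score are assumptions made for illustration; they are not taken from the released code.

import itertools
import torch

# Hypothetical search space for one growth stage: each candidate sub-network is
# described by (depth, number of tokens). The space actually used by AutoProg may differ.
CANDIDATES = list(itertools.product([8, 10, 12], [96, 144, 196]))

@torch.no_grad()
def estimate_subnet(supernet, depth, num_tokens, val_batch):
    # One-shot estimate: run the shared supernet weights restricted to the
    # candidate depth and token count, with no extra training of the candidate.
    images, labels = val_batch
    logits = supernet(images, depth=depth, num_tokens=num_tokens)  # assumed elastic forward
    return (logits.argmax(dim=-1) == labels).float().mean().item()

def choose_growth(supernet, val_batch, current=(8, 96)):
    # Decide whether and how much to grow by ranking all candidates that are at
    # least as large as the current configuration; keeping `current` is allowed,
    # which corresponds to deciding not to grow at this stage.
    feasible = [c for c in CANDIDATES if c[0] >= current[0] and c[1] >= current[1]]
    scores = {c: estimate_subnet(supernet, c[0], c[1], val_batch) for c in feasible}
    return max(scores, key=scores.get)

In an actual training loop, a routine like choose_growth would be invoked at each growth stage and the selected configuration trained until the next stage.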
Related papers
- T2V-Turbo-v2: Enhancing Video Generation Model Post-Training through Data, Reward, and Conditional Guidance Design [79.7289790249621]
Our proposed method, T2V-Turbo-v2, introduces a significant advancement by integrating various supervision signals.
We highlight the crucial importance of tailoring datasets to specific learning objectives.
We demonstrate the potential of this approach by extracting motion guidance from the training datasets and incorporating it into the ODE solver.
arXiv Detail & Related papers (2024-10-08T04:30:06Z)
- Efficient Training of Large Vision Models via Advanced Automated Progressive Learning [96.71646528053651]
We present an advanced automated progressive learning (AutoProg) framework for efficient training of Large Vision Models (LVMs).
We introduce AutoProg-Zero, which enhances the AutoProg framework with a novel zero-shot unfreezing schedule search.
Experiments show that AutoProg accelerates ViT pre-training by up to 1.85x on ImageNet and accelerates fine-tuning of diffusion models by up to 2.86x, with comparable or even higher performance.
arXiv Detail & Related papers (2024-09-06T16:24:24Z)
- A General and Efficient Training for Transformer via Token Expansion [44.002355107931805]
Vision Transformers (ViTs) typically incur an extremely large training cost.
Existing methods have attempted to accelerate the training of ViTs, yet they typically either lack universality or come with a drop in accuracy.
We propose a novel token growth scheme, Token Expansion (termed ToE), to achieve consistent training acceleration for ViTs.
arXiv Detail & Related papers (2024-03-31T12:44:24Z)
- Efficient Stagewise Pretraining via Progressive Subnetworks [53.00045381931778]
The prevailing view suggests that stagewise dropping strategies, such as layer dropping, are ineffective when compared to stacking-based approaches.
This paper challenges this notion by demonstrating that, with proper design, dropping strategies can be competitive with, if not better than, stacking methods.
We propose an instantiation of this framework - Random Part Training (RAPTR) - that selects and trains only a random subnetwork at each step, progressively increasing the size in stages.
arXiv Detail & Related papers (2024-02-08T18:49:09Z)
- Local Masking Meets Progressive Freezing: Crafting Efficient Vision Transformers for Self-Supervised Learning [0.0]
We present an innovative approach to self-supervised learning for Vision Transformers (ViTs).
This method focuses on enhancing the efficiency and speed of initial layer training in ViTs.
Our approach employs a novel multi-scale reconstruction process that fosters efficient learning in initial layers.
arXiv Detail & Related papers (2023-12-02T11:10:09Z)
- Rethinking Closed-loop Training for Autonomous Driving [82.61418945804544]
We present the first empirical study which analyzes the effects of different training benchmark designs on the success of learning agents.
We propose trajectory value learning (TRAVL), an RL-based driving agent that performs planning with multistep look-ahead.
Our experiments show that TRAVL can learn much faster and produce safer maneuvers compared to all the baselines.
arXiv Detail & Related papers (2023-06-27T17:58:39Z)
- Auto-scaling Vision Transformers without Training [84.34662535276898]
We propose As-ViT, an auto-scaling framework for Vision Transformers (ViTs) without training.
As-ViT automatically discovers and scales up ViTs in an efficient and principled manner.
As a unified framework, As-ViT achieves strong performance on classification and detection.
arXiv Detail & Related papers (2022-02-24T06:30:55Z)