Accelerating the Training of Video Super-Resolution
- URL: http://arxiv.org/abs/2205.05069v1
- Date: Tue, 10 May 2022 17:55:24 GMT
- Title: Accelerating the Training of Video Super-Resolution
- Authors: Lijian Lin, Xintao Wang, Zhongang Qi, Ying Shan
- Abstract summary: We show that it is possible to gradually train video models from small to large spatial/temporal sizes in an easy-to-hard manner.
Our method substantially speeds up training (up to a $6.2\times$ speedup in wall-clock training time) without a performance drop for various VSR models.
- Score: 26.449738545078986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although convolutional neural networks (CNNs) have recently demonstrated high-quality reconstruction for video super-resolution (VSR), efficiently training competitive VSR models remains a challenging problem. Training them usually takes an order of magnitude more time than training their image counterparts, leading to long research cycles. Existing VSR methods typically train models with fixed spatial and temporal sizes from beginning to end. The fixed sizes are usually set to large values for good performance, resulting in slow training. However, is such a rigid training strategy necessary for VSR? In this work, we show that it is possible to gradually train video models from small to large spatial/temporal sizes, i.e., in an easy-to-hard manner. In particular, the whole training is divided into several stages, and earlier stages use smaller spatial shapes. Inside each stage, the temporal size varies from short to long while the spatial size remains unchanged. Training is accelerated by such a multigrid training strategy, as most of the computation is performed on smaller spatial and shorter temporal shapes. For further acceleration with GPU parallelization, we also investigate large-minibatch training without loss in accuracy. Extensive experiments demonstrate that our method can substantially speed up training (up to a $6.2\times$ speedup in wall-clock training time) without a performance drop for various VSR models. The code is available at https://github.com/TencentARC/Efficient-VSR-Training.
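The stage schedule is straightforward to express in code. Below is a minimal PyTorch-style sketch of such an easy-to-hard multigrid batch sampler; the stage counts, patch sizes, and clip lengths are illustrative placeholders, not the paper's actual settings (those live in the repository above).

```python
from itertools import cycle

import torch

# Hypothetical easy-to-hard schedule: (spatial patch, clip lengths, iterations).
# The real stage boundaries and sizes are hyperparameters chosen by the authors.
STAGES = [
    (32, [5, 7, 11], 100_000),  # small patches, short-to-long clips
    (48, [5, 7, 11], 100_000),
    (64, [5, 7, 11], 100_000),  # final stage matches conventional training
]

def crop_batch(frames: torch.Tensor, patch: int, t: int) -> torch.Tensor:
    """Randomly crop a (B, T, C, H, W) clip batch to t frames of patch x patch."""
    _, T, _, H, W = frames.shape
    t0 = torch.randint(0, T - t + 1, (1,)).item()
    y = torch.randint(0, H - patch + 1, (1,)).item()
    x = torch.randint(0, W - patch + 1, (1,)).item()
    return frames[:, t0:t0 + t, :, y:y + patch, x:x + patch]

def multigrid_batches(loader, stages=STAGES):
    """Yield batches whose spatial size grows across stages and whose
    temporal size goes from short to long inside each stage."""
    stream = cycle(loader)  # sketch only; real training would reshuffle per epoch
    for patch, t_sizes, iters in stages:
        for i in range(iters):
            frames = next(stream)  # full-size low-resolution clips
            t = t_sizes[min(i * len(t_sizes) // iters, len(t_sizes) - 1)]
            yield crop_batch(frames, patch, t)
```

Because most iterations in this schedule touch small crops and short clips, the per-iteration cost drops sharply early in training; the GPU memory freed by the smaller shapes is also what makes the large-minibatch training mentioned above practical.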
Related papers
- Test-Time Training Done Right [61.8429380523577]
Test-Time Training (TTT) models context by adapting part of the model's weights (referred to as fast weights) during inference. Existing TTT methods have struggled to show effectiveness in handling long-context data. We develop Large Chunk Test-Time Training (LaCT), which improves hardware utilization by orders of magnitude.
arXiv Detail & Related papers (2025-05-29T17:50:34Z)
- FastFace: Fast-converging Scheduler for Large-scale Face Recognition Training with One GPU [10.656812733659514]
We present FastFace, a fast-converging scheduler with negligible time complexity.
In practice, FastFace is able to accelerate Face Recognition model training to a quarter of its original time without sacrificing more than 1% accuracy.
arXiv Detail & Related papers (2024-04-17T07:06:22Z)
- Time-series Initialization and Conditioning for Video-agnostic Stabilization of Video Super-Resolution using Recurrent Networks [13.894981567082997]
A Recurrent Neural Network (RNN) for Video Super Resolution (VSR) is generally trained with randomly clipped and cropped short videos.
Since this RNN is optimized to super-resolve short videos, VSR of long videos is degraded due to the domain gap.
This paper proposes an RNN training strategy for VSR that works efficiently and stably regardless of the video's length and dynamics.
arXiv Detail & Related papers (2024-03-23T13:16:07Z)
- Time-, Memory- and Parameter-Efficient Visual Adaptation [75.28557015773217]
We propose an adaptation method that does not backpropagate gradients through the backbone.
We achieve this by designing a lightweight network in parallel that operates on features from the frozen, pretrained backbone.
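The general pattern behind that entry is easy to sketch: keep the backbone frozen and train only a small side network on its features, so no gradients ever flow through the backbone. The module below is a hypothetical illustration of this pattern, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class ParallelHead(nn.Module):
    """Hypothetical sketch: a lightweight network on frozen backbone features,
    so backpropagation never touches the backbone's weights."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone.eval()
        for p in self.backbone.parameters():
            p.requires_grad_(False)  # frozen: saves time, memory, and parameters
        self.head = nn.Sequential(   # only these weights receive gradients
            nn.Linear(feat_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():        # no autograd graph is built for the backbone
            feats = self.backbone(x)
        return self.head(feats)
```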
arXiv Detail & Related papers (2024-02-05T10:55:47Z)
- Always-Sparse Training by Growing Connections with Guided Stochastic Exploration [46.4179239171213]
We propose an efficient always-sparse training algorithm with excellent scaling to larger and sparser models.
We evaluate our method on CIFAR-10/100 and ImageNet using VGG and ViT models, and compare it against a range of sparsification methods.
arXiv Detail & Related papers (2024-01-12T21:32:04Z)
- Towards Memory- and Time-Efficient Backpropagation for Training Spiking Neural Networks [70.75043144299168]
Spiking Neural Networks (SNNs) are promising energy-efficient models for neuromorphic computing.
We propose the Spatial Learning Through Time (SLTT) method that can achieve high performance while greatly improving training efficiency.
Our method achieves state-of-the-art accuracy on ImageNet, while the memory cost and training time are reduced by more than 70% and 50%, respectively, compared with BPTT.
arXiv Detail & Related papers (2023-02-28T05:01:01Z)
- Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size [58.762959061522736]
We show that scaling mini-batch sizes with appropriate learning rate adjustments can speed up the training process by orders of magnitude. Scaling the mini-batch size while naively adjusting the learning rate allows for (1) a reduced size of the Q-ensemble, (2) stronger penalization of out-of-distribution actions, and (3) improved convergence time.
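The adjustment behind this kind of batch scaling is usually the linear scaling rule: grow the learning rate by the same factor as the batch size, typically with a warmup phase. A sketch with illustrative numbers follows (the paper's exact adjustment may differ).

```python
# Linear scaling rule (a common heuristic, not necessarily this paper's recipe):
# if the batch grows by a factor k, scale the learning rate by k as well.
base_batch, base_lr = 256, 3e-4               # illustrative reference configuration
big_batch = 4096
scaled_lr = base_lr * big_batch / base_batch  # 3e-4 * 16 = 4.8e-3

def lr_at(step: int, warmup_steps: int = 1000, lr: float = scaled_lr) -> float:
    """Linear warmup tames instability at the large initial learning rate."""
    return lr * min(1.0, step / warmup_steps)
```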
arXiv Detail & Related papers (2022-11-20T21:48:25Z)
- Learning Trajectory-Aware Transformer for Video Super-Resolution [50.49396123016185]
Video super-resolution aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts.
Existing approaches usually align and aggregate information from only a limited number of adjacent frames.
We propose a novel Trajectory-aware Transformer for Video Super-Resolution (TTVSR).
arXiv Detail & Related papers (2022-04-08T03:37:39Z)
- Investigating Tradeoffs in Real-World Video Super-Resolution [90.81396836308085]
Real-world video super-resolution (VSR) models are often trained with diverse degradations to improve generalizability.
To alleviate one of the resulting tradeoffs, we propose a degradation scheme that reduces training time by up to 40% without sacrificing performance.
To facilitate fair comparisons, we propose the new VideoLQ dataset, which contains a large variety of real-world low-quality video sequences.
arXiv Detail & Related papers (2021-11-24T18:58:21Z)
- Automated Learning Rate Scheduler for Large-batch Training [24.20872850681828]
Large-batch training has been essential in leveraging large-scale datasets and models in deep learning.
It often requires a specially designed learning rate (LR) schedule to achieve performance comparable to smaller-batch training.
We propose an automated LR scheduling algorithm which is effective for neural network training with a large batch size under the given epoch budget.
arXiv Detail & Related papers (2021-07-13T05:23:13Z)
- Layered gradient accumulation and modular pipeline parallelism: fast and efficient training of large language models [0.0]
We analyse the shortest possible training time for different configurations of distributed training.
We introduce two new methods, layered gradient accumulation and modular pipeline parallelism, which together cut the shortest training time in half.
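For context, plain gradient accumulation looks like the sketch below: several small forward/backward passes per optimizer step emulate a larger batch within fixed memory. The paper's layered variant restructures when each layer's gradients are accumulated and communicated, which this sketch does not capture.

```python
import torch

def accumulated_step(model, optimizer, loss_fn, micro_batches):
    """Plain gradient accumulation: emulate a large batch within fixed memory
    by summing gradients over several micro-batches before one optimizer step."""
    optimizer.zero_grad()
    n = len(micro_batches)
    for x, y in micro_batches:
        loss = loss_fn(model(x), y) / n  # average the loss across micro-batches
        loss.backward()                  # gradients accumulate in each p.grad
    optimizer.step()
```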
arXiv Detail & Related papers (2021-06-04T19:21:49Z)
- Spatiotemporal Contrastive Video Representation Learning [87.56145031149869]
We present a self-supervised Contrastive Video Representation Learning (CVRL) method to learn visual representations from unlabeled videos.
Our representations are learned using a contrastive loss, where two augmented clips from the same short video are pulled together in the embedding space.
We study what makes for good data augmentations for video self-supervised learning and find that both spatial and temporal information are crucial.
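The objective described here is an InfoNCE-style contrastive loss; a minimal sketch follows, assuming z1[i] and z2[i] are embeddings of two augmented clips from video i (CVRL's exact loss and augmentations differ in detail).

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor,
                          temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE over a batch: matching rows are positives (two augmented clips
    of the same video); every other video in the batch acts as a negative."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature       # (B, B) cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)  # diagonal = positive pairs
```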
arXiv Detail & Related papers (2020-08-09T19:58:45Z)