MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the
Edge
- URL: http://arxiv.org/abs/2110.14032v1
- Date: Tue, 26 Oct 2021 21:15:17 GMT
- Title: MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the
Edge
- Authors: Geng Yuan, Xiaolong Ma, Wei Niu, Zhengang Li, Zhenglun Kong, Ning Liu,
Yifan Gong, Zheng Zhan, Chaoyang He, Qing Jin, Siyue Wang, Minghai Qin, Bin
Ren, Yanzhi Wang, Sijia Liu, Xue Lin
- Abstract summary: This paper proposes a novel Memory-Economic Sparse Training (MEST) framework targeting accurate and fast execution on edge devices.
The proposed MEST framework consists of enhancements by Elastic Mutation (EM) and Soft Memory Bound (&S).
Our results suggest that unforgettable examples can be identified in-situ even during the dynamic exploration of sparsity masks.
- Score: 72.16021611888165
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, a new trend of exploring sparsity for accelerating neural network
training has emerged, embracing the paradigm of training on the edge. This
paper proposes a novel Memory-Economic Sparse Training (MEST) framework
targeting accurate and fast execution on edge devices. The proposed MEST
framework consists of enhancements by Elastic Mutation (EM) and Soft Memory
Bound (&S) that ensure superior accuracy at high sparsity ratios. Different
from existing works on sparse training, this work reveals the importance of
sparsity schemes for both the accuracy and the training speed of sparse
training on real edge devices. On top of that, the
paper proposes to employ data efficiency for further acceleration of sparse
training. Our results suggest that unforgettable examples can be identified
in-situ even during the dynamic exploration of sparsity masks in the sparse
training process, and therefore can be removed for further training speedup on
edge devices. Compared with state-of-the-art (SOTA) works on accuracy, our
MEST significantly increases Top-1 accuracy on ImageNet when using the same
unstructured sparsity scheme. A systematic evaluation of accuracy, training
speed, and memory footprint is conducted, in which the proposed MEST framework
consistently outperforms representative SOTA works. A reviewer strongly argued
against our work based on false assumptions and misunderstandings. On top of
the previous submission, we employ data efficiency for further acceleration of
sparse training, and we explore the impact of model sparsity, sparsity schemes,
and sparse training algorithms on the number of removable training examples.
Our codes are publicly available at: https://github.com/boone891214/MEST.
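As a rough illustration of the prune-and-grow style of dynamic mask update that Elastic Mutation builds on, the sketch below mutates a sparsity mask by dropping low-magnitude active weights and growing inactive positions with large gradients. The function name `mutate_mask`, the `mutation_rate`, and the selection criteria are illustrative assumptions, not MEST's exact Elastic Mutation or Soft Memory Bound rules.

```python
# Hedged sketch of a prune-and-grow mask mutation step (PyTorch).
# The mutation_rate and the magnitude/gradient selection criteria are
# assumptions for illustration, not MEST's published update rules.
import torch

def mutate_mask(weight: torch.Tensor,
                mask: torch.Tensor,
                grad: torch.Tensor,
                mutation_rate: float = 0.1) -> torch.Tensor:
    """Swap a fraction of active weights for inactive ones; overall sparsity stays fixed."""
    active = mask.bool()
    n_swap = int(mutation_rate * active.sum().item())
    if n_swap == 0:
        return mask

    # Prune: drop the n_swap active weights with the smallest magnitude.
    w_mag = weight.abs().masked_fill(~active, float("inf"))
    drop_idx = torch.topk(w_mag.flatten(), n_swap, largest=False).indices

    # Grow: activate the n_swap inactive positions with the largest gradient magnitude.
    g_mag = grad.abs().masked_fill(active, float("-inf"))
    grow_idx = torch.topk(g_mag.flatten(), n_swap, largest=True).indices

    new_mask = mask.clone().flatten()
    new_mask[drop_idx] = 0.0
    new_mask[grow_idx] = 1.0
    return new_mask.view_as(mask)
```

A soft-memory-bound variant could let the grow step temporarily activate more weights than were pruned and trim the surplus at a later step; the paper's exact budget rules are not reproduced here.

The data-efficiency part of the abstract hinges on spotting unforgettable examples in-situ, i.e., examples that, once learned, are never misclassified again during training. Below is a minimal forgetting-event bookkeeping sketch under that assumption; the class name `ForgettingTracker`, the zero-forgetting criterion, and any removal schedule are illustrative, not the paper's exact procedure.

```python
# Hedged sketch of in-situ forgetting-event tracking (PyTorch, CPU tensors).
# The zero-forgetting criterion below is an assumption for illustration.
import torch

class ForgettingTracker:
    """Track per-example forgetting events, addressed by dataset index."""

    def __init__(self, num_examples: int):
        self.prev_correct = torch.zeros(num_examples, dtype=torch.bool)
        self.forget_count = torch.zeros(num_examples, dtype=torch.long)
        self.ever_correct = torch.zeros(num_examples, dtype=torch.bool)

    def update(self, indices: torch.Tensor, correct: torch.Tensor) -> None:
        # A forgetting event: correct at the previous check, misclassified now.
        forgotten = self.prev_correct[indices] & ~correct
        self.forget_count[indices] += forgotten.long()
        self.prev_correct[indices] = correct
        self.ever_correct[indices] |= correct

    def unforgettable(self) -> torch.Tensor:
        # Boolean mask of examples that were learned and never forgotten so far.
        return self.ever_correct & (self.forget_count == 0)
```

During training, `update` would be called per mini-batch with the dataset indices and a boolean tensor marking correct predictions; after a warm-up phase, examples flagged by `unforgettable()` could be excluded from later epochs to cut training time on the device.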
Related papers
- Always-Sparse Training by Growing Connections with Guided Stochastic
Exploration [46.4179239171213]
We propose an efficient always-sparse training algorithm with excellent scaling to larger and sparser models.
We evaluate our method on CIFAR-10/100 and ImageNet using VGG and ViT models, and compare it against a range of sparsification methods.
arXiv Detail & Related papers (2024-01-12T21:32:04Z)
- Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting increasing attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed, which lets edge devices asynchronously participate in the training process by actively applying for tasks.
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
- AUTOSPARSE: Towards Automated Sparse Training of Deep Neural Networks [2.6742343015805083]
We propose Gradient Annealing (GA) to explore the non-uniform distribution of sparsity inherent within neural networks.
GA provides an elegant trade-off between sparsity and accuracy without the need for additional sparsity-inducing regularization.
We integrate GA with the latest learnable pruning methods to create an automated sparse training algorithm called AutoSparse.
arXiv Detail & Related papers (2023-04-14T06:19:07Z)
- Efficient Augmentation for Imbalanced Deep Learning [8.38844520504124]
We study a convolutional neural network's internal representation of imbalanced image data.
We measure the generalization gap between a model's feature embeddings in the training and test sets, showing that the gap is wider for minority classes.
This insight enables us to design an efficient three-phase CNN training framework for imbalanced data.
arXiv Detail & Related papers (2022-07-13T09:43:17Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Online Convolutional Re-parameterization [51.97831675242173]
We present online convolutional re-parameterization (OREPA), a two-stage pipeline, aiming to reduce the huge training overhead by squeezing the complex training-time block into a single convolution.
Compared with the state-of-the-art re-param models, OREPA is able to save the training-time memory cost by about 70% and accelerate the training speed by around 2x.
We also conduct experiments on object detection and semantic segmentation and show consistent improvements on the downstream tasks.
arXiv Detail & Related papers (2022-04-02T09:50:19Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
- FROST: Faster and more Robust One-shot Semi-supervised Training [0.0]
We present a one-shot semi-supervised learning method that trains up to an order of magnitude faster and is more robust than state-of-the-art methods.
Our experiments demonstrate FROST's capability to perform well when the composition of the unlabeled data is unknown.
arXiv Detail & Related papers (2020-11-18T18:56:03Z)
- Cheaper Pre-training Lunch: An Efficient Paradigm for Object Detection [86.0580214485104]
We propose a general and efficient pre-training paradigm, Montage pre-training, for object detection.
Montage pre-training needs only the target detection dataset while taking only 1/4 of the computational resources compared to the widely adopted ImageNet pre-training.
The efficiency and effectiveness of Montage pre-training are validated by extensive experiments on the MS-COCO dataset.
arXiv Detail & Related papers (2020-04-25T16:09:46Z)