Network Augmentation for Tiny Deep Learning
- URL: http://arxiv.org/abs/2110.08890v1
- Date: Sun, 17 Oct 2021 18:48:41 GMT
- Title: Network Augmentation for Tiny Deep Learning
- Authors: Han Cai, Chuang Gan, Ji Lin, Song Han
- Abstract summary: We introduce Network Augmentation (NetAug), a new training method for improving the performance of tiny neural networks.
We demonstrate the effectiveness of NetAug on image classification and object detection.
- Score: 73.57192520534585
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce Network Augmentation (NetAug), a new training method for
improving the performance of tiny neural networks. Existing regularization
techniques (e.g., data augmentation, dropout) have shown much success on large
neural networks (e.g., ResNet50) by adding noise to overcome over-fitting.
However, we found these techniques hurt the performance of tiny neural
networks. We argue that training tiny models is different from training large models:
rather than augmenting the data, we should augment the model, since tiny models
tend to suffer from under-fitting rather than over-fitting due to limited
capacity. To alleviate this issue, NetAug augments the network (reverse
dropout) instead of inserting noise into the dataset or the network. It puts
the tiny model into larger models and encourages it to work as a sub-model of
larger models to get extra supervision, in addition to functioning as an
independent model. At test time, only the tiny model is used for inference,
incurring zero inference overhead. We demonstrate the effectiveness of NetAug
on image classification and object detection. NetAug consistently improves the
performance of tiny models, achieving up to 2.1% accuracy improvement on
ImageNet, and 4.3% on Cars. On Pascal VOC, NetAug provides 2.96% mAP
improvement with the same computational cost.
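The training recipe described above can be illustrated with a minimal sketch: the tiny model's weights are a slice of a wider, weight-shared network, an auxiliary loss from the widened forward pass supplies the extra supervision, and inference uses only the tiny slice. The layer widths, module names, and the auxiliary weight `alpha` below are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of the NetAug idea (reverse dropout) on a toy one-layer model.
# Widths, names, and `alpha` are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AugmentedLinear(nn.Module):
    """Linear layer whose first `base_out` units form the tiny model;
    the full width acts as the augmented (larger) model."""
    def __init__(self, in_features, base_out, aug_out):
        super().__init__()
        assert aug_out >= base_out
        self.base_out = base_out
        self.weight = nn.Parameter(torch.randn(aug_out, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(aug_out))

    def forward(self, x, augmented=False):
        if augmented:                      # tiny model embedded in the larger one
            return F.linear(x, self.weight, self.bias)
        w = self.weight[: self.base_out]   # tiny model: a slice of the shared weights
        b = self.bias[: self.base_out]
        return F.linear(x, w, b)

# Toy setup: 32-dim inputs, 10 classes, an augmented layer twice as wide.
feat_dim, num_classes = 32, 10
layer = AugmentedLinear(feat_dim, base_out=64, aug_out=128)
head_tiny = nn.Linear(64, num_classes)
head_aug = nn.Linear(128, num_classes)   # auxiliary head, discarded after training
alpha = 0.5                              # weight of the auxiliary loss (assumed value)

opt = torch.optim.SGD(
    list(layer.parameters()) + list(head_tiny.parameters()) + list(head_aug.parameters()),
    lr=0.1,
)

x = torch.randn(8, feat_dim)
y = torch.randint(0, num_classes, (8,))

# Training: the tiny model is supervised directly and, in addition,
# as a sub-model of the augmented network.
loss_tiny = F.cross_entropy(head_tiny(F.relu(layer(x))), y)
loss_aug = F.cross_entropy(head_aug(F.relu(layer(x, augmented=True))), y)
(loss_tiny + alpha * loss_aug).backward()
opt.step()

# Inference: only the tiny slice is used, so there is no extra cost.
with torch.no_grad():
    preds = head_tiny(F.relu(layer(x))).argmax(dim=1)
```

After training, the augmented weights and the auxiliary head are simply dropped, which is why the tiny model incurs zero inference overhead.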
Related papers
- A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization.
Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
arXiv Detail & Related papers (2024-02-02T01:41:38Z)
- Small-footprint slimmable networks for keyword spotting [3.0825815617887415]
We show that slimmable neural networks allow us to create super-nets from Convolutional Neural Networks and Transformers.
We demonstrate the usefulness of these models on in-house Alexa data and Google Speech Commands, and focus our efforts on models for the on-device use case.
arXiv Detail & Related papers (2023-04-21T12:59:37Z)
- Establishing a stronger baseline for lightweight contrastive models [10.63129923292905]
Recent research has reported a performance degradation in self-supervised contrastive learning for specially designed efficient networks.
A common practice is to introduce a pretrained contrastive teacher model and train the lightweight networks with distillation signals generated by the teacher.
In this work, we aim to establish a stronger baseline for lightweight contrastive models without using a pretrained teacher model.
arXiv Detail & Related papers (2022-12-14T11:20:24Z)
- Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts and classify them.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
arXiv Detail & Related papers (2022-09-15T15:41:47Z)
- T-RECX: Tiny-Resource Efficient Convolutional neural networks with early-eXit [0.0]
We show how a tiny-CNN can be enhanced by the addition of an early exit intermediate classifier.
Our technique is optimized specifically for tiny-CNN sized models.
Our results show that T-RecX 1) improves the accuracy of the baseline network, and 2) achieves a 31.58% average reduction in FLOPS at the cost of one percent accuracy across all evaluated models.
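As a rough illustration of the early-exit mechanism this entry relies on (not the T-RECX architecture itself), the sketch below attaches an intermediate classifier to a small CNN and lets confident samples skip the remaining layers at inference time; the backbone layout, confidence threshold, and head design are assumptions.

```python
# Illustrative early-exit sketch: an intermediate classifier on a small CNN.
# Layer layout, threshold, and loss handling are assumptions, not T-RECX itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitCNN(nn.Module):
    def __init__(self, num_classes=10, threshold=0.9):
        super().__init__()
        self.threshold = threshold
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                    nn.MaxPool2d(2))
        self.early_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                        nn.Linear(16, num_classes))
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.final_head = nn.Linear(32, num_classes)

    def forward(self, x):
        feat = self.stage1(x)
        early_logits = self.early_head(feat)
        if not self.training:
            conf, _ = F.softmax(early_logits, dim=1).max(dim=1)
            if bool((conf >= self.threshold).all()):
                return early_logits          # exit early, skipping stage2 FLOPs
        final_logits = self.final_head(self.stage2(feat))
        # During training both classifiers are supervised; the caller can
        # combine the two losses (e.g. an equally weighted sum).
        return early_logits, final_logits

model = EarlyExitCNN()
x = torch.randn(4, 3, 32, 32)
early, final = model(x)                      # training mode returns both heads
```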
arXiv Detail & Related papers (2022-07-14T02:05:43Z)
- LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification [36.651329027209634]
LilNetX is an end-to-end trainable technique for neural networks.
It enables learning models with a specified accuracy-rate-computation trade-off.
arXiv Detail & Related papers (2022-04-06T17:59:10Z)
- Effective Model Sparsification by Scheduled Grow-and-Prune Methods [73.03533268740605]
We propose a novel scheduled grow-and-prune (GaP) methodology without pre-training the dense models.
Experiments have shown that such models can match or beat the quality of highly optimized dense models at 80% sparsity on a variety of tasks.
arXiv Detail & Related papers (2021-06-18T01:03:13Z)
- Learnable Expansion-and-Compression Network for Few-shot Class-Incremental Learning [87.94561000910707]
We propose a learnable expansion-and-compression network (LEC-Net) to solve catastrophic forgetting and model over-fitting problems.
LEC-Net enlarges the representation capacity of features, alleviating feature drift of the old network from the perspective of model regularization.
Experiments on the CUB/CIFAR-100 datasets show that LEC-Net improves the baseline by 57% while outperforming the state-of-the-art by 56%.
arXiv Detail & Related papers (2021-04-06T04:34:21Z)
- Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets [65.28292822614418]
The giant formula for simultaneously enlarging the resolution, depth, and width provides us with a Rubik's cube for neural networks.
This paper aims to explore the twisting rules for obtaining deep neural networks with minimum model sizes and computational costs.
arXiv Detail & Related papers (2020-10-28T08:49:45Z)