Performance of Transfer Learning Model vs. Traditional Neural Network in
Low System Resource Environment
- URL: http://arxiv.org/abs/2011.07962v1
- Date: Tue, 20 Oct 2020 08:12:56 GMT
- Title: Performance of Transfer Learning Model vs. Traditional Neural Network in
Low System Resource Environment
- Authors: William Hui
- Abstract summary: We compare the performance and cost of a lighter transfer learning model against a purpose-built neural network for the NLP tasks of text classification and NER.
The rise of state-of-the-art models such as BERT, XLNet and GPT boosts accuracy and makes them attractive base models for transfer learning.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the use of pre-trained models to build neural networks through
transfer learning has become increasingly popular. These pre-trained models
offer the benefit of requiring fewer computing resources and a smaller amount
of training data. The rise of state-of-the-art models such as BERT, XLNet and
GPT boosts accuracy and makes them attractive base models for transfer
learning. However, these models are still too complex and consume too many
computing resources to train via transfer learning on systems with low GPU
memory. We compare the performance and cost of a lighter transfer learning
model against a purpose-built neural network for the NLP tasks of text
classification and NER.
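As a rough illustration of the comparison described in the abstract, the sketch below contrasts a lighter transfer learning setup (a pre-trained DistilBERT encoder with only its classification head left trainable) against a small purpose-built text classifier trained from scratch. The paper does not publish code; the model choice (DistilBERT via the Hugging Face transformers library), the toy architecture and all hyperparameters here are illustrative assumptions, not the authors' exact setup.

```python
# Illustrative sketch only: a lighter transfer-learning model (frozen
# pre-trained encoder + small trainable head) vs. a small purpose-built
# network trained from scratch. Model choice (DistilBERT) and all
# hyperparameters are assumptions, not taken from the paper.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForSequenceClassification

NUM_LABELS = 2

# --- Option A: lighter transfer-learning model ------------------------------
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
tl_model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=NUM_LABELS
)
# Freeze the pre-trained encoder so only the classification head is trained,
# which keeps GPU memory and compute requirements low.
for param in tl_model.distilbert.parameters():
    param.requires_grad = False

# --- Option B: small purpose-built network trained from scratch -------------
class SmallTextClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, num_labels=NUM_LABELS):
        super().__init__()
        # Mean-pooled bag-of-embeddings followed by a linear classifier.
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim)
        self.fc = nn.Linear(embed_dim, num_labels)

    def forward(self, token_ids, offsets):
        return self.fc(self.embedding(token_ids, offsets))

scratch_model = SmallTextClassifier(vocab_size=tokenizer.vocab_size)

# Rough comparison of trainable-parameter counts for the two setups.
def trainable_params(m):
    return sum(p.numel() for p in m.parameters() if p.requires_grad)

print("transfer-learning head:", trainable_params(tl_model))
print("purpose-built model:   ", trainable_params(scratch_model))
```

Freezing the encoder keeps the number of trainable parameters, and hence the GPU memory needed for gradients and optimizer state, small; this is the main appeal of the lighter transfer learning route in a low system resource environment.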
Related papers
- BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [56.9358325168226]
We propose a Bagging deep learning training algorithm based on Efficient Neural network Diffusion (BEND)
Our approach is simple but effective: it first uses multiple trained model weights and biases as inputs to train an autoencoder and a latent diffusion model.
Our proposed BEND algorithm can consistently outperform the mean and median accuracies of both the original trained model and the diffused model.
arXiv Detail & Related papers (2024-03-23T08:40:38Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Initializing Models with Larger Ones [76.41561758293055]
We introduce weight selection, a method for initializing smaller models by selecting a subset of weights from a pretrained larger model.
Our experiments demonstrate that weight selection can significantly enhance the performance of small models and reduce their training time (a rough sketch of the idea appears after this list).
arXiv Detail & Related papers (2023-11-30T18:58:26Z)
- Low-Resource Music Genre Classification with Cross-Modal Neural Model Reprogramming [129.4950757742912]
We introduce a novel method for leveraging pre-trained models for low-resource (music) classification based on the concept of Neural Model Reprogramming (NMR)
NMR aims at re-purposing a pre-trained model from a source domain to a target domain by modifying the input of a frozen pre-trained model.
Experimental results suggest that a neural model pre-trained on large-scale datasets can successfully perform music genre classification by using this reprogramming method.
arXiv Detail & Related papers (2022-11-02T17:38:33Z)
- Revealing Secrets From Pre-trained Models [2.0249686991196123]
Transfer-learning has been widely adopted in many emerging deep learning algorithms.
We show that pre-trained models and fine-tuned models have significantly high similarities in weight values.
We propose a new model extraction attack that reveals the model architecture and the pre-trained model used by the black-box victim model.
arXiv Detail & Related papers (2022-07-19T20:19:03Z)
- Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics [85.31710759801705]
Current practice requires expensive computational costs in model training for performance prediction.
We propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training.
Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections.
arXiv Detail & Related papers (2022-01-11T20:53:15Z)
- Canoe : A System for Collaborative Learning for Neural Nets [4.547883122787855]
Canoe is a framework that facilitates knowledge transfer for neural networks.
Canoe provides new system support for dynamically extracting significant parameters from a helper node's neural network.
The evaluation of Canoe with different PyTorch and neural network models demonstrates that the knowledge transfer mechanism improves the model's adaptiveness by up to 3.5X compared to learning in isolation.
arXiv Detail & Related papers (2021-08-27T05:30:15Z)
- Training Deep Neural Networks with Constrained Learning Parameters [4.917317902787792]
A significant portion of deep learning tasks would run on edge computing systems.
We propose the Combinatorial Neural Network Training Algorithm (CoNNTrA)
CoNNTrA trains deep learning models with ternary learning parameters on the MNIST, Iris and ImageNet data sets.
Our results indicate that CoNNTrA models use 32x less memory and have error rates on par with the backpropagation-trained models.
arXiv Detail & Related papers (2020-09-01T16:20:11Z)
- From Boltzmann Machines to Neural Networks and Back Again [31.613544605376624]
We give new results for learning Restricted Boltzmann Machines, probably the most well-studied class of latent variable models.
Our results are based on new connections to learning two-layer neural networks under $\ell_\infty$-bounded input.
We then give an algorithm for learning a natural class of supervised RBMs with better runtime than what is possible for its related class of networks without distributional assumptions.
arXiv Detail & Related papers (2020-07-25T00:42:50Z)
- Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks [27.44348371795822]
We develop a statistical minimax framework to characterize the limits of transfer learning.
We derive a lower-bound for the target generalization error achievable by any algorithm as a function of the number of labeled source and target data.
arXiv Detail & Related papers (2020-06-16T22:49:26Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
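As a loose illustration of the weight selection idea from "Initializing Models with Larger Ones" above, the sketch below initializes a smaller linear layer by copying a sub-block of a larger pretrained layer's weight matrix. The slicing rule and layer shapes are simplifying assumptions; the paper's actual selection procedure is more involved and is not reproduced here.

```python
# Loose illustration of weight selection: initialize a smaller layer by
# copying a sub-block of a larger pretrained layer's weights. The leading
# sub-block chosen here is a simplifying assumption, not the paper's rule.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a layer taken from a larger pretrained model.
large_layer = nn.Linear(1024, 1024)

# Corresponding layer in the smaller model being initialized.
small_layer = nn.Linear(256, 256)

with torch.no_grad():
    # Select a subset of the larger layer's weights and biases.
    small_layer.weight.copy_(large_layer.weight[:256, :256])
    small_layer.bias.copy_(large_layer.bias[:256])

# The small model is then trained as usual, starting from these selected
# weights instead of a random initialization.
```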
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and accepts no responsibility for any consequences of its use.