P2W: From Power Traces to Weights Matrix -- An Unconventional Transfer Learning Approach
- URL: http://arxiv.org/abs/2502.14968v1
- Date: Thu, 20 Feb 2025 19:05:28 GMT
- Title: P2W: From Power Traces to Weights Matrix -- An Unconventional Transfer Learning Approach
- Authors: Roozbeh Siyadatzadeh, Fatemeh Mehrafrooz, Nele Mentens, Todor Stefanov,
- Abstract summary: The rapid growth of deploying machine learning (ML) models within embedded systems on a chip (SoCs) has led to transformative shifts in fields like healthcare and autonomous vehicles. One of the primary challenges for training such embedded ML models is the lack of publicly available high-quality training data. We introduce a novel unconventional transfer learning approach to train a new ML model by extracting and using weights from an existing ML model.
- Score: 1.1383507019490222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid growth of deploying machine learning (ML) models within embedded systems on a chip (SoCs) has led to transformative shifts in fields like healthcare and autonomous vehicles. One of the primary challenges for training such embedded ML models is the lack of publicly available high-quality training data. Transfer learning approaches address this challenge by utilizing the knowledge encapsulated in an existing ML model as a starting point for training a new ML model. However, existing transfer learning approaches require direct access to the existing model which is not always feasible, especially for ML models deployed on embedded SoCs. Therefore, in this paper, we introduce a novel unconventional transfer learning approach to train a new ML model by extracting and using weights from an existing ML model running on an embedded SoC without having access to the model within the SoC. Our approach captures power consumption measurements from the SoC while it is executing the ML model and translates them to an approximated weights matrix used to initialize the new ML model. This improves the learning efficiency and predictive performance of the new model, especially in scenarios with limited data available to train the model. Our novel approach can effectively increase the accuracy of the new ML model up to 3 times compared to classical training methods using the same amount of limited training data.
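The listing does not include code; the snippet below is only a minimal sketch of the P2W workflow under stated assumptions. The `traces_to_weights` function is a hypothetical stand-in for the paper's trace-to-weights translation, and the layer shapes and trace data are invented for illustration; the approximated matrix merely seeds the new model before ordinary fine-tuning on the limited local dataset.

```python
# Minimal sketch of the P2W idea (illustrative only): power traces captured
# while the target SoC runs a model are mapped to an approximated weight
# matrix, which then initializes a new model trained on limited data.
# `traces_to_weights` is a hypothetical calibration step, not the paper's code.
import numpy as np
import torch
import torch.nn as nn

def traces_to_weights(power_traces: np.ndarray, out_shape: tuple) -> np.ndarray:
    """Hypothetical mapping from averaged power traces to weight estimates.

    Here we simply average repeated captures and rescale them into a plausible
    weight range; the paper learns a far more accurate translation.
    """
    avg = power_traces.mean(axis=0)                  # average over captures
    avg = (avg - avg.mean()) / (avg.std() + 1e-8)    # normalise
    return avg[: np.prod(out_shape)].reshape(out_shape).astype(np.float32)

# Suppose the layer running on the SoC is a 32x16 fully connected layer.
captures = np.random.randn(100, 32 * 16)             # stand-in for real traces
w_approx = traces_to_weights(captures, (32, 16))

# Initialise the corresponding layer of the new model with the approximation.
new_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
with torch.no_grad():
    new_model[0].weight.copy_(torch.from_numpy(w_approx))

# The new model is then fine-tuned as usual on the small local dataset.
```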
Related papers
- LoQT: Low-Rank Adapters for Quantized Pretraining [5.767156832161818]
Low-Rank Adapters for Quantized Training (LoQT) is a method for efficiently training quantized models.
Our approach is suitable for both pretraining and fine-tuning models.
We demonstrate this for language modeling and downstream task adaptation, finding that LoQT enables efficient training of models up to 7B parameters on a 24GB GPU.
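LoQT's actual quantization and gradient-update scheme are described in the paper; the sketch below only shows the generic pattern it builds on, namely a frozen, fake-quantized base weight plus a trainable low-rank correction. All class names and hyperparameters here are invented for illustration.

```python
# Generic "low-rank adapter over a quantized weight" pattern (a sketch of the
# idea LoQT builds on, not the paper's implementation).
import torch
import torch.nn as nn

class LowRankQuantLinear(nn.Module):
    def __init__(self, in_features, out_features, rank=8, n_bits=4):
        super().__init__()
        w = torch.randn(out_features, in_features) * 0.02
        # Naive symmetric fake-quantization of the frozen base weight.
        scale = w.abs().max() / (2 ** (n_bits - 1) - 1)
        w_q = torch.round(w / scale).clamp(-(2 ** (n_bits - 1)),
                                           2 ** (n_bits - 1) - 1) * scale
        self.register_buffer("weight_q", w_q)          # frozen, quantized
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))

    def forward(self, x):
        # Output = quantized base weight plus trainable low-rank correction.
        return x @ (self.weight_q + self.lora_b @ self.lora_a).T

layer = LowRankQuantLinear(64, 64)
loss = layer(torch.randn(8, 64)).pow(2).mean()
loss.backward()    # gradients flow only into the small low-rank factors
```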
arXiv Detail & Related papers (2024-05-26T11:29:57Z) - Observational Scaling Laws and the Predictability of Language Model Performance [51.2336010244645]
We propose an observational approach that bypasses model training and instead builds scaling laws from 100 publicly available models.
We show that several emergent phenomena follow a smooth, sigmoidal behavior and are predictable from small models.
We show how to predict the impact of post-training interventions like Chain-of-Thought and Self-Consistency as language model capabilities continue to improve.
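The paper's actual methodology combines capability measures across many public models; as a toy illustration of the general recipe (fitting a smooth sigmoidal curve to observed scores and extrapolating), the snippet below fits a logistic curve to synthetic data. All numbers are placeholders.

```python
# Toy illustration of fitting a sigmoidal scaling curve to scores of existing
# models (synthetic data; the paper uses ~100 public models and richer
# capability measures).
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, lo, hi, midpoint, slope):
    return lo + (hi - lo) / (1.0 + np.exp(-slope * (x - midpoint)))

log_compute = np.linspace(20, 26, 30)                       # e.g. log10 FLOPs
scores = sigmoid(log_compute, 0.25, 0.90, 23.5, 1.4)
scores += np.random.normal(0, 0.02, size=scores.shape)      # benchmark noise

params, _ = curve_fit(sigmoid, log_compute, scores,
                      p0=[0.2, 1.0, 23.0, 1.0], maxfev=10_000)
print("fitted (lo, hi, midpoint, slope):", params)

# Extrapolate to a larger (hypothetical) compute budget.
print("predicted score at log-compute 27:", sigmoid(27.0, *params))
```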
arXiv Detail & Related papers (2024-05-17T17:49:44Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - Initializing Models with Larger Ones [76.41561758293055]
We introduce weight selection, a method for initializing smaller models by selecting a subset of weights from a pretrained larger model.
Our experiments demonstrate that weight selection can significantly enhance the performance of small models and reduce their training time.
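A minimal sketch of the weight-selection idea follows; the simple slicing rule used here is only illustrative, since the paper evaluates specific selection criteria, and the layer sizes are invented.

```python
# Minimal sketch of weight selection (illustrative slicing rule): a small
# model is initialized from a subset of a larger pretrained model's weights.
import torch
import torch.nn as nn

large = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
small = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

with torch.no_grad():
    # First layer: keep the first 64 output units of the large layer.
    small[0].weight.copy_(large[0].weight[:64, :])
    small[0].bias.copy_(large[0].bias[:64])
    # Last layer: keep the input columns that correspond to the retained units.
    small[2].weight.copy_(large[2].weight[:, :64])
    small[2].bias.copy_(large[2].bias)

# `small` is then trained as usual, typically converging faster.
```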
arXiv Detail & Related papers (2023-11-30T18:58:26Z) - PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [65.57123249246358]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT. On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt. On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z) - Challenges and Opportunities of Using Transformer-Based Multi-Task Learning in NLP Through ML Lifecycle: A Survey [0.6240603866868214]
Multi-Task Learning (MTL) has emerged as a promising approach to improve efficiency and performance through joint training.
We discuss the challenges and opportunities of using MTL approaches throughout typical machine learning lifecycle phases.
We believe it would be practical to have a model that can handle both MTL and continual learning.
arXiv Detail & Related papers (2023-08-16T09:11:00Z) - Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior [32.74388989649232]
We study how pre-training could be used for scientific machine learning (SciML) applications.
We find that fine-tuning these models yields more performance gains as model size increases.
arXiv Detail & Related papers (2023-06-01T00:32:59Z) - Deep Learning model integrity checking mechanism using watermarking technique [0.0]
We propose a model integrity-checking mechanism that uses model watermarking techniques to monitor the integrity of ML models.
Our proposed technique can monitor the integrity of ML models even when the model is further trained on newer data with a low computational cost.
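The paper's exact watermarking scheme is not reproduced here; the snippet below sketches a common trigger-set recipe under which integrity can be re-checked cheaply after further training. The trigger data, model, and threshold are all invented for illustration.

```python
# Generic trigger-set watermark check (a common watermarking recipe; the
# paper's exact mechanism may differ): the owner keeps a small secret set of
# trigger inputs with fixed labels and re-verifies them after any retraining.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

# Secret trigger set fixed by the model owner (illustrative random data).
torch.manual_seed(0)
trigger_x = torch.randn(20, 16)
trigger_y = torch.randint(0, 4, (20,))

# Embed the watermark: briefly fit the trigger set alongside normal training
# so the secret labels are memorized.
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(300):
    opt.zero_grad()
    nn.functional.cross_entropy(model(trigger_x), trigger_y).backward()
    opt.step()

def watermark_intact(model, threshold=0.9):
    """Integrity holds if the model still predicts the trigger labels."""
    with torch.no_grad():
        acc = (model(trigger_x).argmax(dim=1) == trigger_y).float().mean().item()
    return acc >= threshold

# A cheap periodic check flags unexpected weight changes during deployment.
print("watermark intact:", watermark_intact(model))
```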
arXiv Detail & Related papers (2023-01-29T03:05:53Z) - Mini-Model Adaptation: Efficiently Extending Pretrained Models to New Languages via Aligned Shallow Training [36.5936227129021]
It is possible to expand pretrained Masked Language Models to new languages by learning a new set of embeddings, while keeping the transformer body frozen.
We propose mini-model adaptation, a compute-efficient alternative that builds a shallow mini-model from a fraction of a large model's parameters.
New language-specific embeddings can then be efficiently trained over the mini-model and plugged into the aligned large model for rapid cross-lingual transfer.
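The mini-model construction itself is specific to the paper; the snippet below only sketches the baseline recipe it accelerates, namely training a fresh embedding matrix against a frozen transformer body. A small stand-in encoder replaces a real pretrained MLM, and the vocabulary sizes and loss are placeholders.

```python
# Sketch of the baseline recipe that mini-model adaptation speeds up:
# learn a new embedding matrix for the target language while the transformer
# body stays frozen (a stand-in encoder is used instead of a real MLM).
import torch
import torch.nn as nn

d_model, new_vocab = 256, 20_000
body = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    num_layers=4)
for p in body.parameters():
    p.requires_grad = False                    # keep the pretrained body frozen

new_embed = nn.Embedding(new_vocab, d_model)   # the only trainable parameters
optimizer = torch.optim.AdamW(new_embed.parameters(), lr=1e-3)

tokens = torch.randint(0, new_vocab, (8, 32))  # a batch in the new language
hidden = body(new_embed(tokens))               # frozen body, new embeddings
loss = hidden.pow(2).mean()                    # stand-in for the MLM loss
loss.backward()
optimizer.step()

# Mini-model adaptation instead trains `new_embed` against a shallow mini-model
# built from a fraction of the body's parameters, then plugs the embeddings
# back into the full, aligned model.
```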
arXiv Detail & Related papers (2022-12-20T18:17:28Z) - Towards Sustainable Self-supervised Learning [193.78876000005366]
We propose a Target-Enhanced Conditional (TEC) scheme, which introduces two components into existing mask-reconstruction-based SSL.
First, we propose patch-relation enhanced targets, which enhance the targets given by the base model and encourage the new model to learn semantic-relation knowledge from the base model.
Second, we introduce a conditional adapter that adaptively adjusts the new model's predictions to align with the targets of different base models.
arXiv Detail & Related papers (2022-10-20T04:49:56Z) - Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), which repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
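The snippet below is a rough, self-contained sketch of the ingredients BAR combines, namely a universal input perturbation tuned with two-point zeroth-order gradient estimates and a simple many-to-one label mapping, querying only a stand-in black-box model's output scores. It is not the authors' implementation, and the model, dimensions, and step sizes are invented.

```python
# Rough sketch of black-box reprogramming ingredients (illustrative only):
# a universal input perturbation is tuned with a two-point zeroth-order
# gradient estimate, using nothing but the black-box model's output scores.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((10, 64)) * 0.1            # stand-in black-box classifier

def black_box(x):                                  # returns output scores only
    return W @ x

def loss(delta, x, target):
    scores = black_box(np.clip(x + delta, -1, 1))  # reprogrammed input
    # Multi-label mapping: average a group of source classes onto the target.
    mapped = scores[target * 2 : target * 2 + 2].mean()
    return -mapped                                 # maximize the mapped score

x, target = rng.standard_normal(64), 3
delta, mu, lr = np.zeros(64), 1e-2, 5e-2
for _ in range(200):
    u = rng.standard_normal(64)
    # Two-point zeroth-order gradient estimate along a random direction u.
    g = (loss(delta + mu * u, x, target) -
         loss(delta - mu * u, x, target)) / (2 * mu) * u
    delta -= lr * g

print("final mapped-class loss:", loss(delta, x, target))
```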