Revealing Secrets From Pre-trained Models
- URL: http://arxiv.org/abs/2207.09539v1
- Date: Tue, 19 Jul 2022 20:19:03 GMT
- Title: Revealing Secrets From Pre-trained Models
- Authors: Mujahid Al Rafi, Yuan Feng, Hyeran Jeon
- Abstract summary: Transfer learning has been widely adopted in many emerging deep learning algorithms.
We show that pre-trained models and fine-tuned models have significantly high similarities in weight values.
We propose a new model extraction attack that reveals the model architecture and the pre-trained model used by the black-box victim model.
- Score: 2.0249686991196123
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: With the growing burden of training deep learning models on large
datasets, transfer learning has been widely adopted in many emerging deep
learning algorithms. Transformer models such as BERT are the main players in
natural language processing and use transfer learning as a de facto standard
training method. A few big-data companies release pre-trained models trained
on a few popular datasets, which end users and researchers then fine-tune with
their own datasets. Transfer learning significantly reduces the time and
effort of training models. However, it comes at the cost of security concerns.
In this paper, we present a new observation: pre-trained models and their
fine-tuned derivatives have significantly high similarities in weight values.
We also demonstrate that vendor-specific computing patterns exist even for the
same models. With these new findings, we propose a new model extraction attack
that first reveals the model architecture and the pre-trained model used by a
black-box victim model via its vendor-specific computing patterns, and then
estimates the entire set of model weights from the weight-value similarities
between the fine-tuned model and the pre-trained model. We also show that this
weight similarity can be leveraged to increase the feasibility of model
extraction through a novel weight-extraction pruning.
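As a rough illustration of how such weight similarity could be exploited, the sketch below compares a victim model's weights against a pool of candidate public checkpoints and reports the closest match. This is a minimal sketch under assumed PyTorch state-dict inputs, not the paper's implementation; the cosine metric and the candidate pool are illustrative.

```python
# Minimal sketch (not the paper's implementation): guess which public
# pre-trained checkpoint a fine-tuned victim model derives from by
# averaging per-layer cosine similarity of same-shape weight tensors.
import torch
import torch.nn.functional as F

def layer_similarity(w_a: torch.Tensor, w_b: torch.Tensor) -> float:
    """Cosine similarity between two same-shape weight tensors."""
    return F.cosine_similarity(w_a.flatten(), w_b.flatten(), dim=0).item()

def match_pretrained(victim: dict, candidates: dict) -> str:
    """Return the name of the candidate checkpoint closest to the victim."""
    scores = {}
    for name, cand in candidates.items():
        sims = [layer_similarity(victim[k], cand[k])
                for k in victim
                if k in cand and victim[k].shape == cand[k].shape]
        scores[name] = sum(sims) / max(len(sims), 1)
    return max(scores, key=scores.get)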
Related papers
- Comparison of self-supervised in-domain and supervised out-domain transfer learning for bird species recognition [0.19183348587701113]
Transferring the weights of a pre-trained model to assist another task has become a crucial part of modern deep learning.
Our experiments demonstrate the usefulness of in-domain models and datasets for bird species recognition.
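For context, a generic transfer-learning recipe looks like the sketch below; the ResNet-18 backbone and 200-class head are illustrative assumptions, not the paper's setup.

```python
# Generic fine-tuning sketch (illustrative backbone and class count):
# load ImageNet weights, swap the head, train only the new head.
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 200)  # new task-specific head

# Freeze the pre-trained backbone; leave only the new head trainable.
for p in model.parameters():
    p.requires_grad = False
for p in model.fc.parameters():
    p.requires_grad = True
```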
arXiv Detail & Related papers (2024-04-26T08:47:28Z)
- Wrapper Boxes: Faithful Attribution of Model Predictions to Training Data [40.7542543934205]
We propose a "wrapper box" pipeline: train a neural model as usual, then use its learned feature representation in classic, interpretable models to perform prediction.
Across seven language models of varying sizes, we first show that the predictive performance of wrapper classic models is largely comparable to the original neural models.
Our pipeline thus preserves the predictive performance of neural language models while faithfully attributing classic model decisions to training data.
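A minimal sketch of the wrapper idea, assuming features have already been extracted by some neural encoder: fit a kNN classifier on those features so every prediction points back to concrete training examples.

```python
# Hedged sketch of a "wrapper box": a kNN classifier over frozen neural
# features, so each prediction is attributable to nearby training points.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fit_wrapper(train_feats: np.ndarray, train_labels: np.ndarray, k: int = 5):
    clf = KNeighborsClassifier(n_neighbors=k)
    clf.fit(train_feats, train_labels)
    return clf

def predict_with_attribution(clf, feat: np.ndarray):
    """Predict a label and return indices of the supporting training examples."""
    pred = clf.predict(feat[None, :])[0]
    _, idx = clf.kneighbors(feat[None, :])
    return pred, idx[0]
```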
arXiv Detail & Related papers (2023-11-15T01:50:53Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
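A plain knowledge-distillation loss is shown below as an illustrative stand-in; the paper's transfer objective is more selective than vanilla distillation, and the temperature and mixing weight are assumed hyperparameters.

```python
# Vanilla distillation loss as a hedged baseline for transferring knowledge
# from one pretrained model (teacher) to another (student).
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```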
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Reusing Pretrained Models by Multi-linear Operators for Efficient Training [65.64075958382034]
Training large models from scratch usually costs a substantial amount of resources.
Recent studies such as bert2BERT and LiGO have reused small pretrained models to initialize a large model.
We propose a method that linearly correlates each weight of the target model to all the weights of the pretrained model.
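A toy version of that idea appears below, with illustrative shapes and initialization; the paper's multi-linear operator is more general.

```python
# Toy sketch: initialize a larger weight matrix as a learned linear map of a
# smaller pretrained one (both maps are trained during the growth step).
import torch
import torch.nn as nn

class LinearGrowth(nn.Module):
    def __init__(self, d_small: int, d_large: int):
        super().__init__()
        scale = d_small ** -0.5
        self.row_map = nn.Parameter(torch.randn(d_large, d_small) * scale)
        self.col_map = nn.Parameter(torch.randn(d_small, d_large) * scale)

    def forward(self, w_small: torch.Tensor) -> torch.Tensor:
        # (d_large, d_small) @ (d_small, d_small) @ (d_small, d_large)
        return self.row_map @ w_small @ self.col_map
```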
arXiv Detail & Related papers (2023-10-16T06:16:47Z)
- TRAK: Attributing Model Behavior at Scale [79.56020040993947]
We present TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models.
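Below is a highly simplified sketch of one ingredient, random projection of per-example gradients; TRAK's full estimator adds a kernel-based attribution step on top, and the projection matrix here is an assumption.

```python
# Hedged sketch: project one example's flattened gradient into a small
# feature space with a fixed random matrix `proj` of shape (num_params, k).
import torch

def grad_feature(model, loss_fn, x, y, proj: torch.Tensor) -> torch.Tensor:
    model.zero_grad()
    loss_fn(model(x), y).backward()
    g = torch.cat([p.grad.flatten() for p in model.parameters()
                   if p.grad is not None])
    return g @ proj  # (num_params,) @ (num_params, k) -> (k,)
```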
arXiv Detail & Related papers (2023-03-24T17:56:22Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
Because the underlying fine-tuning data often cannot be shared, this creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
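The simplest instance of parameter-space merging is a uniform average of same-architecture checkpoints, sketched below; the paper's merge is more sophisticated than this baseline.

```python
# Uniform parameter averaging as the simplest baseline for dataless merging
# of same-architecture checkpoints (given as PyTorch state dicts).
import torch

def average_state_dicts(state_dicts: list) -> dict:
    return {k: torch.stack([sd[k].float() for sd in state_dicts]).mean(dim=0)
            for k in state_dicts[0]}
```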
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Instead, one is given access to a set of expert models and their predictions, alongside some limited information about the datasets used to train them.
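One way to make that concrete, under strong illustrative assumptions (Gaussian summaries of each expert's training data), is to weight each expert by how likely the test point is under its training distribution.

```python
# Hedged sketch of instance-wise ensembling: weight each expert by the
# (Gaussian-summarized) likelihood of the test point under its training data.
import numpy as np

def instance_weights(x, means, covs):
    w = np.array([np.exp(-0.5 * (x - mu) @ np.linalg.solve(cov, x - mu))
                  for mu, cov in zip(means, covs)])
    return w / w.sum()

def combine(x, expert_preds, means, covs):
    return instance_weights(x, means, covs) @ np.asarray(expert_preds)
```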
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- Transfer training from smaller language model [6.982133308738434]
We present a method that saves training time and resource cost by transforming a small, well-trained model into a large model.
We test the target model on several datasets and find that it remains comparable with the source model.
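A Net2Net-style width expansion is one classic way to grow a trained model, sketched below purely as an illustration; the paper's exact mapping may differ.

```python
# Illustrative Net2Net-style widening: duplicate random output units of a
# (out, in) weight matrix. For true function preservation, the next layer's
# incoming weights must also be rescaled by the duplication counts.
import numpy as np

def widen_linear(w: np.ndarray, new_out: int) -> np.ndarray:
    extra = np.random.choice(w.shape[0], new_out - w.shape[0])
    return np.concatenate([w, w[extra]], axis=0)
```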
arXiv Detail & Related papers (2021-04-23T02:56:02Z)
- Learning to Reweight with Deep Interactions [104.68509759134878]
We propose an improved data reweighting algorithm, in which the student model provides its internal states to the teacher model.
Experiments on image classification with clean/noisy labels and on neural machine translation empirically demonstrate that our algorithm achieves significant improvements over previous methods.
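A minimal sketch of that interaction, with an assumed architecture: the teacher maps each example's student-side features and loss to a data weight used in the student's update.

```python
# Hedged sketch: a teacher network that turns student features and
# per-example losses into per-example data weights in (0, 1).
import torch
import torch.nn as nn

class TeacherReweighter(nn.Module):
    def __init__(self, feat_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim + 1, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, student_feats, per_example_loss):
        inp = torch.cat([student_feats, per_example_loss.unsqueeze(-1)], dim=-1)
        return torch.sigmoid(self.net(inp)).squeeze(-1)

# The student's weighted loss is then (weights * per_example_loss).mean().
```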
arXiv Detail & Related papers (2020-07-09T09:06:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.