Data-driven Approaches to Surrogate Machine Learning Model Development
- URL: http://arxiv.org/abs/2210.02631v1
- Date: Thu, 6 Oct 2022 01:30:11 GMT
- Title: Data-driven Approaches to Surrogate Machine Learning Model Development
- Authors: H. Rhys Jones, Tingting Mu and Andrei C. Popescu
- Abstract summary: We demonstrate the adaptation of three established methods to the field of surrogate machine learning model development.
These methods are data augmentation, custom loss functions and transfer learning.
We see a significant improvement in model performance through the combination of all three techniques.
- Score: 3.2734466030053175
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We demonstrate the adaptation of three established methods to the field of
surrogate machine learning model development. These methods are data
augmentation, custom loss functions and transfer learning. Each of these
methods has seen widespread use in the field of machine learning; however,
here we apply them specifically to surrogate machine learning model
development. The machine learning model that forms the basis of this work
was intended to surrogate a traditional engineering model used in the UK
nuclear industry. Previous versions of this model were hampered by poor
performance due to limited training data. Here, we demonstrate that through a
combination of these techniques, model performance can be significantly
improved. We show that each of the aforementioned techniques has utility in
its own right and in combination with the others. However, we see them best
applied as part of a transfer learning operation. Five pre-trained surrogate
models produced prior to this research were further trained with an augmented
dataset and with our custom loss function. Through the combination of all three
techniques, we see a significant improvement in model performance.
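To make the combination concrete, the sketch below shows a pre-trained surrogate being fine-tuned (transfer learning) on a noise-augmented dataset with a blended custom loss. It is a hypothetical PyTorch reconstruction: the architecture, the Gaussian-noise augmentation and the relative-error loss term are illustrative assumptions, not the authors' implementation.
```python
# Hypothetical sketch of the three techniques combined: fine-tuning a
# pre-trained surrogate (transfer learning) on an augmented dataset with a
# custom loss. Names, shapes, and the noise-based augmentation are
# illustrative assumptions, not the paper's actual setup.
import torch
import torch.nn as nn

def augment(x, y, n_copies=4, noise_scale=0.01):
    """Data augmentation: jitter inputs with small Gaussian noise."""
    xs = [x] + [x + noise_scale * torch.randn_like(x) for _ in range(n_copies)]
    ys = [y] * (n_copies + 1)
    return torch.cat(xs), torch.cat(ys)

def custom_loss(pred, target, rel_weight=0.5):
    """Custom loss: MSE blended with a relative-error term, so small
    target values are not ignored (one plausible choice for a surrogate)."""
    mse = (pred - target).pow(2).mean()
    rel = ((pred - target) / (target.abs() + 1e-6)).pow(2).mean()
    return (1 - rel_weight) * mse + rel_weight * rel

# Transfer learning: start from a pre-trained surrogate and fine-tune.
surrogate = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
# surrogate.load_state_dict(torch.load("pretrained_surrogate.pt"))  # pre-trained weights
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-4)  # small LR for fine-tuning

x_train = torch.randn(256, 8)            # stand-in for the limited training data
y_train = torch.randn(256, 1)
for epoch in range(100):
    xb, yb = augment(x_train, y_train)   # enlarge the small dataset each epoch
    opt.zero_grad()
    loss = custom_loss(surrogate(xb), yb)
    loss.backward()
    opt.step()
```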
Related papers
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient empowerment technique in the machine learning community.
There is a significant gap in the literature regarding a systematic and thorough review of these techniques; a toy illustration of the simplest merging operation follows this entry.
arXiv Detail & Related papers (2024-08-14T16:58:48Z)
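As a concrete anchor for the survey's topic, here is the simplest model-merging operation: uniform parameter averaging of architecturally identical checkpoints (a "model soup"). This is a generic illustration, not code from the paper.
```python
# Minimal illustration of model merging by uniform parameter averaging of
# models that share an architecture (e.g. fine-tuned from one initialization).
import torch
import torch.nn as nn

def merge_by_averaging(models):
    """Average the state dicts of architecturally identical models."""
    merged = {k: v.clone().float() for k, v in models[0].state_dict().items()}
    for m in models[1:]:
        for k, v in m.state_dict().items():
            merged[k] += v.float()
    return {k: v / len(models) for k, v in merged.items()}

models = [nn.Linear(4, 2) for _ in range(3)]   # stand-ins for fine-tuned checkpoints
merged_model = nn.Linear(4, 2)
merged_model.load_state_dict(merge_by_averaging(models))
```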
- Machine Unlearning in Contrastive Learning [3.6218162133579694]
We introduce a novel gradient constraint-based approach for training the model to effectively achieve machine unlearning.
Our approach demonstrates proficient performance not only on contrastive learning models but also on supervised learning models; a speculative sketch of one gradient-constraint variant follows this entry.
arXiv Detail & Related papers (2024-05-12T16:09:01Z)
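The summary names only a "gradient constraint-based approach" without specifics. Purely as one speculative reading, the toy below takes a gradient ascent step on a forget batch while projecting out the component that conflicts with the retain batch's gradient; the actual method may differ substantially.
```python
# Speculative toy: ascend on the forget loss, constrained to stay orthogonal
# to the retain batch's descent direction. Illustrative assumptions throughout.
import torch
import torch.nn as nn

model = nn.Linear(8, 2)
loss_fn = nn.CrossEntropyLoss()

def flat_grad(loss):
    grads = torch.autograd.grad(loss, model.parameters())
    return torch.cat([g.reshape(-1) for g in grads])

x_forget, y_forget = torch.randn(16, 8), torch.randint(0, 2, (16,))
x_retain, y_retain = torch.randn(16, 8), torch.randint(0, 2, (16,))

g_forget = flat_grad(loss_fn(model(x_forget), y_forget))
g_retain = flat_grad(loss_fn(model(x_retain), y_retain))

# Constraint: remove the part of the ascent direction that would also
# raise the retain loss (keep the step orthogonal to g_retain).
ascent = g_forget - (g_forget @ g_retain) / (g_retain @ g_retain + 1e-12) * g_retain

lr, offset = 1e-2, 0
with torch.no_grad():
    for p in model.parameters():
        n = p.numel()
        p.add_(lr * ascent[offset:offset + n].view_as(p))  # ascend on forget data
        offset += n
```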
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Cheap Learning: Maximising Performance of Language Models for Social Data Science Using Minimal Data [1.8692054990918079]
We review three 'cheap' techniques that have been developed in recent years: weak supervision, transfer learning and prompt engineering.
For the latter, we review the particular case of zero-shot prompting of large language models.
We show good performance for all techniques; in particular, we demonstrate how prompting of large language models can achieve high accuracy at very low cost (a small zero-shot example follows this entry).
arXiv Detail & Related papers (2024-01-22T19:00:11Z)
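To make the zero-shot prompting point concrete, here is a minimal example using the standard Hugging Face zero-shot-classification pipeline. The model choice, labels and text are illustrative, not taken from the paper.
```python
# Zero-shot classification: label social-media text with no task-specific
# training data, which is the "cheap" aspect the review highlights.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier(
    "The new policy will be a disaster for local businesses.",
    candidate_labels=["supportive", "critical", "neutral"],
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```
The same pattern works with any NLI-style checkpoint; no labelled training set or fine-tuning run is required.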
- Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models.
We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models.
Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models and image models (a simplified proxy for the metric is sketched after this entry).
arXiv Detail & Related papers (2024-01-02T17:08:26Z)
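The paper's actual metric is posterior-based; as a simplified, assumption-laden stand-in for "consistency between representations and meta-features", one can compare the pairwise-similarity structure of a model's entity embeddings against that of the meta-features.
```python
# Simplified proxy: how well does the similarity structure of a model's
# entity embeddings agree with the similarity structure of meta-features?
import numpy as np

def similarity_matrix(x):
    x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-12)
    return x @ x.T  # cosine similarity between all entity pairs

def consistency_score(embeddings, meta_features):
    s_emb = similarity_matrix(embeddings)
    s_meta = similarity_matrix(meta_features)
    iu = np.triu_indices_from(s_emb, k=1)            # unique pairs only
    return np.corrcoef(s_emb[iu], s_meta[iu])[0, 1]  # agreement in [-1, 1]

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 32))  # entity representations from a model
meta = rng.normal(size=(100, 8))         # worldly-knowledge meta-features
print(consistency_score(embeddings, meta))
```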
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z)
- Machine Unlearning Methodology base on Stochastic Teacher Network [33.763901254862766]
"Right to be forgotten" grants data owners the right to actively withdraw data that has been used for model training.
Existing machine unlearning methods have been found to be ineffective in quickly removing knowledge from deep learning models.
This paper proposes using a network as a teacher to expedite the mitigation of the influence caused by forgotten data on the model (a hedged sketch of this idea follows the entry).
arXiv Detail & Related papers (2023-08-28T06:05:23Z)
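A hedged sketch of the stochastic-teacher idea as summarized above: distill the model toward a randomly initialized teacher on forgotten data (which "knows" nothing), while a frozen copy of the original model anchors behaviour on retained data. The loss weighting and architecture here are guesses for illustration.
```python
# Push behaviour on forgotten data toward a random "stochastic teacher",
# while matching the original model on retained data. Illustrative only.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(8, 2)                      # model to be unlearned
original = copy.deepcopy(model).eval()       # frozen pre-unlearning copy
stochastic_teacher = nn.Linear(8, 2).eval()  # random weights: no knowledge

x_forget, x_retain = torch.randn(32, 8), torch.randn(32, 8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    opt.zero_grad()
    with torch.no_grad():
        t_forget = F.softmax(stochastic_teacher(x_forget), dim=1)
        t_retain = F.softmax(original(x_retain), dim=1)
    forget_loss = F.kl_div(F.log_softmax(model(x_forget), dim=1), t_forget,
                           reduction="batchmean")
    retain_loss = F.kl_div(F.log_softmax(model(x_retain), dim=1), t_retain,
                           reduction="batchmean")
    (forget_loss + retain_loss).backward()
    opt.step()
```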
- Towards Efficient Task-Driven Model Reprogramming with Foundation Models [52.411508216448716]
Vision foundation models exhibit impressive power, benefiting from the extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to the limited computational resources or efficiency considerations.
This brings a critical challenge for the real-world application of foundation models: one has to transfer the knowledge of a foundation model to the downstream task (a toy reprogramming sketch follows this entry).
arXiv Detail & Related papers (2023-04-05T07:28:33Z)
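Model reprogramming, as the title refers to it, is commonly realized by learning a small input transformation and an output label mapping around a frozen model. The toy below follows that generic pattern and is not the paper's implementation; every module is a stand-in.
```python
# Keep the foundation model frozen; learn only a small input "program" plus
# an output mapping for the downstream task.
import torch
import torch.nn as nn

foundation = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))
for p in foundation.parameters():
    p.requires_grad = False                  # the big model stays fixed

input_program = nn.Linear(16, 32)            # maps downstream inputs into the
                                             # foundation model's input space
label_mapper = nn.Linear(10, 3)              # maps its outputs to task labels

params = list(input_program.parameters()) + list(label_mapper.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(64, 16), torch.randint(0, 3, (64,))
for step in range(200):
    opt.zero_grad()
    logits = label_mapper(foundation(input_program(x)))
    loss_fn(logits, y).backward()            # gradients flow only into the
    opt.step()                               # program and the label mapper
```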
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches (one common hybrid pattern is sketched after this entry).
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
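One common instance of the hybrid pattern described above is a principled model plus a small learned residual correction. The sketch below uses a made-up linear "physics" model purely for illustration; it is not code from the paper.
```python
# Hybrid model: keep a classical mathematical model and learn only a
# data-driven residual correction on top of it.
import torch
import torch.nn as nn

def physics_model(x):
    """Classical model: cheap and interpretable, but approximate."""
    return 2.0 * x.sum(dim=1, keepdim=True)

residual_net = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(residual_net.parameters(), lr=1e-3)

x = torch.randn(256, 4)
y_true = 2.0 * x.sum(dim=1, keepdim=True) + torch.sin(x[:, :1])  # unmodelled term

for step in range(300):
    opt.zero_grad()
    y_pred = physics_model(x) + residual_net(x)   # model-based + learned part
    loss = (y_pred - y_true).pow(2).mean()
    loss.backward()
    opt.step()
```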
This list is automatically generated from the titles and abstracts of the papers on this site.