From drift to adaptation to the failed ML model: Transfer Learning in Industrial MLOps
- URL: http://arxiv.org/abs/2602.00957v1
- Date: Sun, 01 Feb 2026 01:32:26 GMT
- Title: From drift to adaptation to the failed ML model: Transfer Learning in Industrial MLOps
- Authors: Waqar Muhammad Ashraf, Talha Ansar, Fahad Ahmed, Jawad Hussain, Muhammad Mujtaba Abbas, Vivek Dua
- Abstract summary: The flue gas differential pressure across the air preheater unit installed in a 660 MW thermal power plant is analyzed as a case study. Updating the failed ANN model by the three transfer learning techniques reveals that ETL provides relatively higher predictive accuracy for the batch size of 5 days than LLTL and ALTL.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although model adaptation to the production environment is critical for reliable Machine Learning Operations (MLOps), less attention has been paid to developing a systematic framework for updating ML models when they fail under data drift. This paper compares transfer learning enabled model update strategies, including ensemble transfer learning (ETL), all-layers transfer learning (ALTL), and last-layer transfer learning (LLTL), for updating a failed feedforward artificial neural network (ANN) model. The flue gas differential pressure across the air preheater unit installed in a 660 MW thermal power plant is analyzed as a case study, since it mimics batch processes due to load cycling in the power plant. Updating the failed ANN model by the three transfer learning techniques reveals that ETL provides relatively higher predictive accuracy for the batch size of 5 days than LLTL and ALTL. However, ALTL is found to be suitable for effectively updating the model trained on the larger batch size (8 days). A mixed trend is observed for the computational requirements (hyperparameter tuning and model training) of the model update techniques across different batch sizes. These fundamental and empirical insights, obtained from the batch process-based industrial case study, can assist MLOps practitioners in adapting failed models to data drift for accurate monitoring of industrial processes.
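The abstract does not include implementation details, so the following is only a rough sketch of what last-layer (LLTL) versus all-layers (ALTL) transfer learning could look like for a feedforward ANN, assuming a Keras model and a hypothetical drift batch (`X_batch`, `y_batch`); the function name, optimizer, loss, and hyperparameters are assumptions, not the authors' settings.

```python
import numpy as np
from tensorflow import keras

def update_failed_ann(failed_model: keras.Model,
                      X_batch: np.ndarray,
                      y_batch: np.ndarray,
                      strategy: str = "LLTL") -> keras.Model:
    """Fine-tune a failed feedforward ANN on a new drift batch (sketch only).

    strategy="LLTL": retrain only the output layer (earlier layers frozen).
    strategy="ALTL": start from the failed model's weights and retrain all layers.
    """
    # Copy the architecture and the failed model's weights so the
    # source-domain knowledge is the starting point for the update.
    model = keras.models.clone_model(failed_model)
    model.set_weights(failed_model.get_weights())

    if strategy == "LLTL":
        for layer in model.layers[:-1]:
            layer.trainable = False  # freeze all but the output layer
    # ALTL: leave every layer trainable and fine-tune the whole network.

    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss="mse")
    model.fit(X_batch, y_batch, epochs=50, batch_size=32, verbose=0)
    return model
```

ETL would presumably combine several such updated models into an ensemble (for example by averaging their predictions), but the abstract does not specify the ensembling scheme, so it is not sketched here.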
Related papers
- A Multi-Criteria Automated MLOps Pipeline for Cost-Effective Cloud-Based Classifier Retraining in Response to Data Distribution Shifts [0.0]
The performance of machine learning (ML) models often deteriorates when the underlying data distribution changes over time. ML Operations (MLOps) is often manual, i.e., humans trigger the process of model retraining and redeployment. We present an automated MLOps pipeline designed to address neural network retraining in response to significant data distribution changes (a minimal drift-trigger sketch is given after this list).
arXiv Detail & Related papers (2025-12-12T13:22:14Z) - Adapting to Change: A Comparison of Continual and Transfer Learning for Modeling Building Thermal Dynamics under Concept Drifts [0.0]
Transfer Learning is the most effective approach for modeling building thermal dynamics when only limited data are available. It remains unclear how to proceed after initial fine-tuning, as more operational measurement data are collected over time. This study compares several CL and TL strategies, as well as a model trained from scratch, for thermal dynamics modeling during building operation.
arXiv Detail & Related papers (2025-08-29T13:29:54Z) - Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z) - Exploring and Enhancing the Transfer of Distribution in Knowledge Distillation for Autoregressive Language Models [62.5501109475725]
Knowledge distillation (KD) is a technique that compresses large teacher models by training smaller student models to mimic them.
This paper introduces Online Knowledge Distillation (OKD), where the teacher network integrates small online modules to concurrently train with the student model.
OKD achieves or exceeds the performance of leading methods in various model architectures and sizes, reducing training time by up to fourfold.
arXiv Detail & Related papers (2024-09-19T07:05:26Z) - Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review [50.78587571704713]
Learn-Focus-Review (LFR) is a dynamic training approach that adapts to the model's learning progress. LFR tracks the model's learning performance across data blocks (sequences of tokens) and prioritizes revisiting challenging regions of the dataset. Compared to baseline models trained on the full datasets, LFR consistently achieved lower perplexity and higher accuracy.
arXiv Detail & Related papers (2024-09-10T00:59:18Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - Transfer Learning in Deep Learning Models for Building Load Forecasting: Case of Limited Data [0.0]
This paper proposes a Building-to-Building Transfer Learning framework to overcome the problem and enhance the performance of Deep Learning models.
The proposed approach improved the forecasting accuracy by 56.8% compared to the case of conventional deep learning where training from scratch is used.
arXiv Detail & Related papers (2023-01-25T16:05:47Z) - Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation [91.05073136215886]
"Actor-Learner Distillation" transfers learning progress from a large capacity learner model to a small capacity actor model.
We demonstrate in several challenging memory environments that using Actor-Learner Distillation recovers the clear sample-efficiency gains of the transformer learner model.
arXiv Detail & Related papers (2021-04-04T17:56:34Z) - Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
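Relating the first related paper (drift-triggered retraining) back to the model-update sketch above: the following is a hypothetical monitoring loop, assuming a two-sample Kolmogorov-Smirnov test per feature as the drift detector and a caller-supplied `retrain` callback; neither the test nor the threshold comes from that paper.

```python
import numpy as np
from scipy.stats import ks_2samp

def drift_detected(reference: np.ndarray, incoming: np.ndarray,
                   alpha: float = 0.05) -> bool:
    """Flag drift if any feature's distribution differs significantly
    between the reference (training) data and the incoming batch."""
    for col in range(reference.shape[1]):
        _, p_value = ks_2samp(reference[:, col], incoming[:, col])
        if p_value < alpha:
            return True
    return False

def monitor_and_retrain(reference, X_batch, y_batch, model, retrain):
    """Retrain the model only when drift is detected on the incoming batch."""
    if drift_detected(reference, X_batch):
        # retrain could be, e.g., the update_failed_ann sketch given earlier
        model = retrain(model, X_batch, y_batch)
    return model
```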