Lambda Learner: Fast Incremental Learning on Data Streams
- URL: http://arxiv.org/abs/2010.05154v2
- Date: Mon, 28 Jun 2021 14:27:01 GMT
- Title: Lambda Learner: Fast Incremental Learning on Data Streams
- Authors: Rohan Ramanath, Konstantin Salomatin, Jeffrey D. Gee, Kirill Talanine,
Onkar Dalal, Gungor Polatkan, Sara Smoot, Deepak Kumar
- Abstract summary: We propose a new framework for training models by incremental updates in response to mini-batches from data streams.
We show that the resulting model of our framework closely estimates a periodically updated model trained on offline data and outperforms it when model updates are time-sensitive.
We present a large-scale deployment on the sponsored content platform for a large social network.
- Score: 5.543723668681475
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the most well-established applications of machine learning is in
deciding what content to show website visitors. When observation data comes
from high-velocity, user-generated data streams, machine learning methods
perform a balancing act between model complexity, training time, and
computational costs. Furthermore, when model freshness is critical, the
training of models becomes time-constrained. Parallelized batch offline
training, although horizontally scalable, is often neither timely enough nor
cost-effective. In this paper, we propose Lambda Learner, a new framework for
training models by incremental updates in response to mini-batches from data
streams. We show that the resulting model of our framework closely estimates a
periodically updated model trained on offline data and outperforms it when
model updates are time-sensitive. We provide theoretical proof that the
incremental learning updates reduce the loss relative to a stale batch
model. We present a large-scale deployment on the sponsored content platform
for a large social network, serving hundreds of millions of users across
different channels (e.g., desktop, mobile). We address challenges and
complexities from both algorithms and infrastructure perspectives, and
illustrate the system details for computation, storage, and streaming
production of training data.
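The abstract describes incremental updates on streaming mini-batches that keep an online model close to a periodically retrained offline model. The following is a minimal Python sketch of that idea under assumptions of my own: a logistic-regression model whose weights are warm-started from the stale batch model and regularized toward it on each mini-batch. The class name, hyperparameters, and the pull-toward-previous-weights heuristic are illustrative, not the paper's exact algorithm.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class IncrementalLogisticModel:
    """Warm-started incremental updates on streaming mini-batches.

    Illustrative sketch only: coefficients start from a stale offline
    (batch-trained) model and are nudged by a few gradient steps per
    mini-batch, with an L2 pull toward the previous solution so the
    online model stays close to the offline one.
    """

    def __init__(self, batch_weights, lr=0.05, reg_to_prev=1.0, steps=5):
        self.w = np.asarray(batch_weights, dtype=float)  # stale offline weights
        self.lr = lr                    # step size for mini-batch updates
        self.reg_to_prev = reg_to_prev  # strength of pull toward previous weights
        self.steps = steps              # gradient steps per mini-batch

    def update(self, X, y):
        """Apply one incremental update from a mini-batch (X: n x d, y in {0, 1})."""
        w_prev = self.w.copy()
        for _ in range(self.steps):
            p = sigmoid(X @ self.w)
            # logistic-loss gradient plus a pull toward the previous weights
            grad = X.T @ (p - y) / len(y) + self.reg_to_prev * (self.w - w_prev)
            self.w = self.w - self.lr * grad
        return self.w

    def predict_proba(self, X):
        return sigmoid(X @ self.w)

# Example: stream mini-batches into a model warm-started from batch weights.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    w_true = rng.normal(size=d)
    model = IncrementalLogisticModel(batch_weights=np.zeros(d))
    for _ in range(100):  # 100 mini-batches arriving from the stream
        X = rng.normal(size=(64, d))
        y = (sigmoid(X @ w_true) > rng.uniform(size=64)).astype(float)
        model.update(X, y)
    print(model.predict_proba(X[:3]))
```

The regularization toward the previous weights plays the role a prior centered on the offline model would play; the paper's actual update rule and deployment details are not reproduced here.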
Related papers
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network).
After training this network on a small base model using demonstrations, this network can be seamlessly integrated with other pre-trained models during inference.
We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
arXiv Detail & Related papers (2024-10-28T13:48:43Z) - Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review [50.78587571704713]
Large Language Model (LLM) pretraining traditionally relies on autoregressive language modeling on randomly sampled data blocks from web-scale datasets.
Taking inspiration from human learning techniques such as spaced repetition, we hypothesize that random data sampling for LLMs leads to high training costs and low-quality models that tend to forget data.
In order to effectively commit web-scale information to long-term memory, we propose the LFR (Learn, Focus, and Review) pedagogy.
arXiv Detail & Related papers (2024-09-10T00:59:18Z) - EdgeSync: Faster Edge-model Updating via Adaptive Continuous Learning for Video Data Drift [7.165359653719119]
Real-time video analytics systems typically place models with fewer weights on edge devices to reduce latency.
The distribution of video content features may change over time, leading to accuracy degradation of existing models.
Recent work proposes a framework that uses a remote server to continually train and adapt the lightweight model at the edge with the help of a complex model.
arXiv Detail & Related papers (2024-06-05T07:06:26Z) - A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization.
Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
arXiv Detail & Related papers (2024-02-02T01:41:38Z) - PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z) - On the Costs and Benefits of Adopting Lifelong Learning for Software
Analytics -- Empirical Study on Brown Build and Risk Prediction [17.502553991799832]
This paper evaluates the use of lifelong learning (LL) for industrial use cases at Ubisoft.
LL is used to continuously build and maintain ML-based software analytics tools using an incremental learner that progressively updates the old model using new data.
arXiv Detail & Related papers (2023-05-16T21:57:16Z) - Online Evolutionary Neural Architecture Search for Multivariate
Non-Stationary Time Series Forecasting [72.89994745876086]
This work presents the Online Neuro-Evolution-based Neural Architecture Search (ONE-NAS) algorithm.
ONE-NAS is a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks.
Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods.
arXiv Detail & Related papers (2023-02-20T22:25:47Z) - Continual Learning with Transformers for Image Classification [12.028617058465333]
In computer vision, neural network models struggle to continually learn new concepts without forgetting what has been learnt in the past.
We develop a solution called Adaptive Distillation of Adapters (ADA) to perform continual learning.
We empirically demonstrate on different classification tasks that this method maintains good predictive performance without retraining the model.
arXiv Detail & Related papers (2022-06-28T15:30:10Z) - SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Existing approaches, however, do not supply the procedures and pipelines needed to deploy machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor frameworks and script-language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z) - Incremental Learning for Personalized Recommender Systems [8.020546404087922]
We present an incremental learning solution that provides both training efficiency and model quality.
The solution is deployed at LinkedIn and is directly applicable to industrial-scale recommender systems.
arXiv Detail & Related papers (2021-08-13T04:21:21Z) - A Practical Incremental Method to Train Deep CTR Models [37.54660958085938]
We introduce a practical incremental method to train deep CTR models, which consists of three decoupled modules.
Our method achieves performance comparable to conventional batch-mode training with much better training efficiency.
arXiv Detail & Related papers (2020-09-04T12:35:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.