Extreme Learning Machines for Fast Training of Click-Through Rate Prediction Models
- URL: http://arxiv.org/abs/2406.17828v1
- Date: Tue, 25 Jun 2024 13:50:00 GMT
- Title: Extreme Learning Machines for Fast Training of Click-Through Rate Prediction Models
- Authors: Ergun Biçici
- Abstract summary: Extreme Learning Machines (ELMs) provide a fast alternative to traditional gradient-based learning in neural networks.
We explore the application of ELMs for the task of Click-Through Rate (CTR) prediction.
We introduce an ELM-based model enhanced with embedding layers to improve the performance on CTR tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Extreme Learning Machines (ELMs) provide a fast alternative to traditional gradient-based learning in neural networks, offering rapid training and robust generalization capabilities. Their theoretical basis establishes a universal approximation capability. We explore the application of ELMs to Click-Through Rate (CTR) prediction, a task largely unexplored with ELMs due to its high dimensionality. We introduce an ELM-based model enhanced with embedding layers to improve performance on CTR tasks, a novel addition to the field. Experimental results on benchmark datasets, including Avazu and Criteo, demonstrate that our proposed ELM with embeddings achieves competitive F1 results while significantly reducing training time compared to state-of-the-art models such as MaskNet. Our findings show that ELMs can be useful for CTR prediction, especially when fast training is needed.
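The abstract does not include an implementation, but the core mechanism is compact enough to sketch. The following minimal NumPy example illustrates the idea as described: categorical feature ids pass through an embedding layer and a random, untrained hidden layer, and only the output weights are fit in closed form by a ridge-regularized least-squares solve. All dimensions, the fixed random embeddings, and the synthetic data are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes -- assumptions, not the paper's configuration.
n_fields, vocab_size, emb_dim, hidden_dim = 8, 1000, 16, 256
n_train = 5000

# Embedding table mapping categorical feature ids to dense vectors.
# Kept random and fixed here, in the ELM spirit of random feature maps.
emb = rng.normal(scale=0.1, size=(vocab_size, emb_dim))

def featurize(ids):
    # ids: (n_samples, n_fields) integer feature ids per field.
    return emb[ids].reshape(ids.shape[0], -1)   # (n, n_fields * emb_dim)

# Random, never-trained hidden layer -- the defining trait of an ELM.
W = rng.normal(size=(n_fields * emb_dim, hidden_dim))
b = rng.normal(size=hidden_dim)
hidden = lambda X: np.tanh(X @ W + b)

# Synthetic CTR-style data: ids per field plus binary click labels.
ids = rng.integers(0, vocab_size, size=(n_train, n_fields))
y = rng.integers(0, 2, size=n_train).astype(float)

# "Training" is one ridge-regularized least-squares solve for the
# output weights beta: (H^T H + lam I) beta = H^T y. No backprop.
H = hidden(featurize(ids))
lam = 1e-2
beta = np.linalg.solve(H.T @ H + lam * np.eye(hidden_dim), H.T @ y)

# Predicted click probabilities via a sigmoid over the linear scores.
predict = lambda ids: 1.0 / (1.0 + np.exp(-(hidden(featurize(ids)) @ beta)))
print(predict(ids[:5]))
```

The single linear solve is where the speed advantage over gradient-based training comes from: there is no iterative optimization over the hidden or embedding weights.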
Related papers
- Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning [37.745896674964186]
Multi-task learning (MTL) aims to improve the generalization performance of a model on multiple related tasks by training it simultaneously on those tasks.
Continual learning (CL) involves adapting to new sequentially arriving tasks over time without forgetting the previously acquired knowledge.
We develop theoretical results describing the effect of various system parameters on the model's performance in an MTL setup.
Our results reveal the impact of buffer size and model capacity on the forgetting rate in a CL setup and help shed light on some of the state-of-the-art CL methods.
arXiv Detail & Related papers (2024-08-29T23:22:40Z)
- Theory on Mixture-of-Experts in Continual Learning [72.42497633220547]
Continual learning (CL) has garnered significant attention because of its ability to adapt to new tasks that arrive over time.
Catastrophic forgetting (of old tasks) has been identified as a major issue in CL, as the model adapts to new tasks.
The MoE model has recently been shown to effectively mitigate catastrophic forgetting in CL by employing a gating network that routes inputs to specialized experts (see the gating sketch after this entry).
arXiv Detail & Related papers (2024-06-24T08:29:58Z)
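The gating mechanism mentioned in the entry above is compact enough to sketch. Below is a minimal PyTorch mixture-of-experts layer with a softmax gating network; the expert and gate architectures and all sizes are illustrative assumptions, not the construction analyzed in the paper.

```python
import torch
import torch.nn as nn

class MoE(nn.Module):
    """Minimal mixture-of-experts layer: a softmax gating network
    computes a weight per expert, and the output is the weighted sum
    of the expert outputs."""
    def __init__(self, dim=32, n_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
             for _ in range(n_experts)])
        self.gate = nn.Linear(dim, n_experts)  # gating network

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)             # (batch, n_experts)
        outs = torch.stack([e(x) for e in self.experts], dim=1)   # (batch, n_experts, dim)
        return (weights.unsqueeze(-1) * outs).sum(dim=1)

x = torch.randn(8, 32)
print(MoE()(x).shape)  # torch.Size([8, 32])
```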
- SEER-MoE: Sparse Expert Efficiency through Regularization for Mixture-of-Experts [49.01990048827639]
We introduce SEER-MoE, a framework for reducing both the memory footprint and compute requirements of pre-trained MoE models.
The first stage involves pruning the total number of experts using a heavy-hitters counting guidance, while the second stage employs a regularization-based fine-tuning strategy to recover accuracy loss.
Our empirical studies demonstrate the effectiveness of our method, resulting in a sparse MoE model optimized for inference efficiency with minimal accuracy trade-offs (a sketch of the first-stage pruning idea follows this entry).
arXiv Detail & Related papers (2024-04-07T22:13:43Z)
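A rough NumPy sketch of the first stage described above, under the assumption that "heavy-hitters counting" amounts to counting how often the router selects each expert over a calibration set and keeping the most-used ones; the routing simulation and the keep threshold are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, n_tokens, keep = 16, 10_000, 8

# Simulated top-1 routing decisions over a calibration set: for each
# token, the index of the expert the gating network selected.
usage = rng.dirichlet(np.ones(n_experts))        # skewed expert usage
routed = rng.choice(n_experts, size=n_tokens, p=usage)

# Heavy-hitters counting: how often each expert is actually used.
counts = np.bincount(routed, minlength=n_experts)

# Stage 1: keep the `keep` most-used experts and prune the rest
# (stage 2 would fine-tune with regularization to recover accuracy).
kept = np.sort(np.argsort(counts)[::-1][:keep])
print("experts kept:", kept.tolist())
```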
- Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study [61.64685376882383]
Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models.
This paper investigates the robustness of existing CLTR models in complex and diverse situations.
We find that the DLA models and IPS-DCM show better robustness under various simulation settings than IPS-PBM and PRS with offline propensity estimation.
arXiv Detail & Related papers (2024-04-04T10:54:38Z)
- Fast Cerebral Blood Flow Analysis via Extreme Learning Machine [4.373558495838564]
We introduce a rapid and precise analytical approach for analyzing cerebral blood flow (CBF) using diffuse correlation spectroscopy (DCS).
We assess existing algorithms using synthetic datasets for both semi-infinite and multi-layer models.
Results demonstrate that ELM consistently achieves higher fidelity across various noise levels and optical parameters, showcasing robust generalization ability and outperforming iterative fitting algorithms.
arXiv Detail & Related papers (2024-01-10T23:01:35Z)
- Low-Frequency Load Identification using CNN-BiLSTM Attention Mechanism [0.0]
Non-intrusive Load Monitoring (NILM) is an established technique for effective and cost-efficient electricity consumption management.
This paper presents a hybrid learning approach consisting of a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network.
The CNN-BiLSTM model is adept at extracting both temporal (time-related) and spatial (location-related) features, allowing it to precisely identify energy consumption patterns at the appliance level (see the architecture sketch after this entry).
arXiv Detail & Related papers (2023-11-14T21:02:27Z)
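A minimal PyTorch sketch of the hybrid architecture described above: a 1-D convolution extracts local features from the power signal and a bidirectional LSTM models temporal context. Layer sizes are assumptions, and the attention mechanism from the title is omitted for brevity.

```python
import torch
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    """Illustrative CNN-BiLSTM for a univariate power signal: a 1-D
    convolution for local feature extraction, a bidirectional LSTM
    for temporal context, and a per-appliance output head."""
    def __init__(self, n_appliances=5):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU())
        self.bilstm = nn.LSTM(16, 32, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * 32, n_appliances)

    def forward(self, x):                      # x: (batch, seq_len) readings
        h = self.conv(x.unsqueeze(1))          # (batch, 16, seq_len)
        h, _ = self.bilstm(h.transpose(1, 2))  # (batch, seq_len, 64)
        return self.head(h[:, -1])             # predict from last time step

x = torch.randn(4, 120)  # 4 windows of 120 low-frequency samples
print(CNNBiLSTM()(x).shape)  # torch.Size([4, 5])
```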
- Pre-training Language Model as a Multi-perspective Course Learner [103.17674402415582]
This study proposes a multi-perspective course learning (MCL) method for sample-efficient pre-training.
In this study, three self-supervision courses are designed to alleviate inherent flaws of "tug-of-war" dynamics.
Our method significantly improves ELECTRA's average performance by 2.8 and 3.2 absolute percentage points on the GLUE and SQuAD 2.0 benchmarks, respectively.
arXiv Detail & Related papers (2023-05-06T09:02:10Z)
- Improving Rare Word Recognition with LM-aware MWER Training [50.241159623691885]
We introduce language models (LMs) into the training of hybrid autoregressive transducer (HAT) models within the discriminative training framework.
For the shallow fusion setup, we use LMs during both hypothesis generation and loss computation, and the LM-aware MWER-trained model achieves a 10% relative improvement.
For the rescoring setup, we learn a small neural module to generate per-token fusion weights in a data-dependent manner (a toy fusion sketch follows this entry).
arXiv Detail & Related papers (2022-04-15T17:19:41Z)
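A toy PyTorch sketch of the per-token fusion idea mentioned in the rescoring setup above: a small module maps a decoder state to a data-dependent weight that scales the LM log-probabilities before they are combined with the ASR scores. The shapes, the weight module, and the combination rule are all illustrative assumptions; the MWER training objective itself is not shown.

```python
import torch
import torch.nn as nn

vocab, hid, steps = 100, 64, 10

# Toy per-token log-probabilities from an ASR model and an external LM.
asr_logp = torch.log_softmax(torch.randn(1, steps, vocab), dim=-1)
lm_logp = torch.log_softmax(torch.randn(1, steps, vocab), dim=-1)
state = torch.randn(1, steps, hid)   # assumed per-step decoder state

# Small module predicting a data-dependent fusion weight per token.
fusion_weight = nn.Sequential(nn.Linear(hid, 1), nn.Sigmoid())

lam = fusion_weight(state)           # (1, steps, 1), each weight in (0, 1)
fused = asr_logp + lam * lm_logp     # shallow-fusion-style combination
print(fused.shape)                   # torch.Size([1, 10, 100])
```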
- Fast fluorescence lifetime imaging analysis via extreme learning machine [7.7721777809498676]
We present a fast and accurate analytical method for fluorescence lifetime imaging microscopy (FLIM) using the extreme learning machine (ELM).
Results indicate that ELM can obtain higher fidelity, even in low-photon conditions.
arXiv Detail & Related papers (2022-03-25T16:34:51Z)
- Towards Interpretable Deep Learning Models for Knowledge Tracing [62.75876617721375]
We propose to adopt a post-hoc method to tackle the interpretability issue for deep learning based knowledge tracing (DLKT) models.
Specifically, we focus on applying the layer-wise relevance propagation (LRP) method to interpret an RNN-based DLKT model.
Experimental results show the feasibility of using the LRP method to interpret the DLKT model's predictions (see the LRP sketch after this entry).
arXiv Detail & Related papers (2020-05-13T04:03:21Z)
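A generic NumPy sketch of the epsilon-rule LRP redistribution for a single linear layer, the basic building block of LRP; the paper's exact propagation rules for recurrent gates are more involved.

```python
import numpy as np

def lrp_linear(x, W, b, relevance_out, eps=1e-6):
    """Epsilon-rule LRP for one linear layer y = x @ W + b: redistribute
    the relevance of the outputs to the inputs in proportion to each
    input's contribution z_ij = x_i * W_ij."""
    z = x[:, None] * W                                # (in_dim, out_dim)
    s = z.sum(axis=0) + b
    denom = s + eps * np.sign(s)                      # stabilized denominator
    return (z / denom * relevance_out).sum(axis=1)    # relevance per input

rng = np.random.default_rng(0)
x, W, b = rng.normal(size=8), rng.normal(size=(8, 3)), rng.normal(size=3)
y = x @ W + b
r_in = lrp_linear(x, W, b, relevance_out=y)   # start from output scores
print(r_in.sum(), y.sum())                    # relevance approximately conserved
```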
- A Hybrid Residual Dilated LSTM and Exponential Smoothing Model for Mid-Term Electric Load Forecasting [1.1602089225841632]
The model combines exponential smoothing (ETS), an advanced Long Short-Term Memory (LSTM) network, and ensembling.
A simulation study performed on the monthly electricity demand time series of 35 European countries confirmed the high performance of the proposed model (a sketch of the ETS-plus-LSTM idea follows this entry).
arXiv Detail & Related papers (2020-03-29T10:53:50Z)
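A minimal Python sketch of the hybrid idea from the entry above: simple exponential smoothing tracks a slowly varying level, the series is normalized by that level, an LSTM (untrained here, for illustration) models the normalized series, and the forecast is recombined with the ETS level. The smoothing constant, sizes, and synthetic data are assumptions; the paper's residual dilated LSTM and ensembling are omitted.

```python
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
# Synthetic monthly demand: seasonal signal plus noise.
y = 100 + 10 * np.sin(np.arange(120) * 2 * np.pi / 12) + rng.normal(0, 2, 120)

# ETS part: simple exponential smoothing tracks a slowly varying level,
# and the series is normalized by it (a stand-in for the paper's richer
# ETS formulation).
alpha, level, levels = 0.3, y[0], []
for v in y:
    level = alpha * v + (1 - alpha) * level
    levels.append(level)
levels = np.array(levels)
normalized = y / levels

# LSTM part: an untrained recurrent model standing in for the paper's
# residual dilated LSTM; in practice it would be trained to forecast
# the normalized series.
lstm = nn.LSTM(input_size=1, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)
x = torch.tensor(normalized, dtype=torch.float32).view(1, -1, 1)
out, _ = lstm(x)
pred_norm = head(out[:, -1]).item()

# Recombine: forecast of the normalized series times the current level.
print("next-step forecast:", pred_norm * levels[-1])
```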
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.