Dynamic Sparse Learning: A Novel Paradigm for Efficient Recommendation
- URL: http://arxiv.org/abs/2402.02855v1
- Date: Mon, 5 Feb 2024 10:16:20 GMT
- Title: Dynamic Sparse Learning: A Novel Paradigm for Efficient Recommendation
- Authors: Shuyao Wang, Yongduo Sui, Jiancan Wu, Zhi Zheng, Hui Xiong
- Abstract summary: This paper introduces a novel learning paradigm, Dynamic Sparse Learning, tailored for recommendation models.
DSL innovatively trains a lightweight sparse model from scratch, periodically evaluating and dynamically adjusting each weight's significance.
Our experimental results underline DSL's effectiveness, significantly reducing training and inference costs while delivering comparable recommendation performance.
- Score: 20.851925464903804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the realm of deep learning-based recommendation systems, the increasing
computational demands, driven by the growing number of users and items, pose a
significant challenge to practical deployment. This challenge is primarily
twofold: reducing the model size while effectively learning user and item
representations for efficient recommendations. Despite considerable
advancements in model compression and architecture search, prevalent approaches
face notable constraints. These include substantial additional computational
costs from pre-training/re-training in model compression and an extensive
search space in architecture design. Additionally, managing complexity and
adhering to memory constraints are problematic, especially in scenarios with
strict time or space limitations. Addressing these issues, this paper
introduces a novel learning paradigm, Dynamic Sparse Learning (DSL), tailored
for recommendation models. DSL innovatively trains a lightweight sparse model
from scratch, periodically evaluating and dynamically adjusting each weight's
significance and the model's sparsity distribution during the training. This
approach ensures a consistent and minimal parameter budget throughout the full
learning lifecycle, paving the way for "end-to-end" efficiency from training to
inference. Our extensive experimental results underline DSL's effectiveness,
significantly reducing training and inference costs while delivering comparable
recommendation performance.
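
To make the mechanism concrete, below is a minimal PyTorch sketch of the general prune-and-regrow pattern the abstract describes: binary masks hold the parameter count to a fixed budget, and at intervals the lowest-magnitude active weights are dropped while an equal number of inactive positions are regrown. The class name `DynamicSparseController`, the magnitude-based drop criterion, the random regrowth, and all hyperparameters are illustrative assumptions, not the paper's exact algorithm.

```python
# Sketch of a dynamic-sparse-training controller: a fixed parameter budget is
# enforced with binary masks, and the sparsity pattern is periodically adjusted
# by dropping weak weights and regrowing elsewhere. Illustrative only.
import torch
import torch.nn as nn


class DynamicSparseController:
    def __init__(self, model: nn.Module, sparsity: float = 0.9, adjust_rate: float = 0.3):
        self.sparsity = sparsity          # fraction of each weight tensor kept at zero
        self.adjust_rate = adjust_rate    # fraction of active weights swapped per update
        self.masks = {}
        for name, p in model.named_parameters():
            if p.dim() < 2:               # skip biases / 1-D parameters
                continue
            keep = int(p.numel() * (1.0 - sparsity))
            idx = torch.randperm(p.numel(), device=p.device)[:keep]
            mask = torch.zeros_like(p, dtype=torch.bool).view(-1)
            mask[idx] = True              # random initial sparsity pattern
            self.masks[name] = mask.view_as(p)
        self.params = dict(model.named_parameters())
        self.apply_masks()

    @torch.no_grad()
    def apply_masks(self):
        # Zero out every weight that is currently outside the mask.
        for name, mask in self.masks.items():
            self.params[name].mul_(mask)

    @torch.no_grad()
    def adjust(self):
        """Drop the weakest active weights and regrow the same number elsewhere."""
        for name, mask in self.masks.items():
            w = self.params[name]
            active = mask.nonzero(as_tuple=False)
            inactive = (~mask).nonzero(as_tuple=False)
            n_swap = min(int(active.size(0) * self.adjust_rate), inactive.size(0))
            if n_swap == 0:
                continue
            # Drop: remove the smallest-magnitude active weights from the mask.
            mags = w[mask].abs()
            drop_local = torch.topk(mags, n_swap, largest=False).indices
            mask[tuple(active[drop_local].t())] = False
            # Grow: activate an equal number of random, previously inactive
            # positions (gradient-based growth is a common alternative).
            grow = inactive[torch.randperm(inactive.size(0), device=w.device)[:n_swap]]
            mask[tuple(grow.t())] = True
            w[tuple(grow.t())] = 0.0      # regrown weights start at zero
        self.apply_masks()                # zero out the freshly dropped weights
```

In a training loop, one would call `controller.apply_masks()` after each optimizer step so pruned weights stay at zero, and `controller.adjust()` every few hundred steps to update the sparsity pattern while keeping the parameter budget constant.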
Related papers
- Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation.
In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales.
Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z)
- EsaCL: Efficient Continual Learning of Sparse Models [10.227171407348326]
A key challenge in the continual learning setting is to efficiently learn a sequence of tasks without forgetting how to perform previously learned ones.
We propose a new method for efficient continual learning of sparse models (EsaCL) that can automatically prune redundant parameters without adversely impacting the model's predictive power.
arXiv Detail & Related papers (2024-01-11T04:59:44Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST).
IST is a recently proposed and highly effective technique for solving the aforementioned problems.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between the self-supervised learning (SSL) and dynamic computation (DC) paradigms.
We show that it is feasible to simultaneously learn a dense and a gated sub-network from scratch in an SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- Effective and Efficient Training for Sequential Recommendation using Recency Sampling [91.02268704681124]
We propose a novel Recency-based Sampling of Sequences training objective.
We show that models enhanced with our method can achieve performance exceeding, or very close to, that of the state-of-the-art BERT4Rec.
arXiv Detail & Related papers (2022-07-06T13:06:31Z)
- Incremental Learning for Personalized Recommender Systems [8.020546404087922]
We present an incremental learning solution that provides both training efficiency and model quality.
The solution is deployed in LinkedIn and directly applicable to industrial scale recommender systems.
arXiv Detail & Related papers (2021-08-13T04:21:21Z)
- Alternate Model Growth and Pruning for Efficient Training of Recommendation Systems [7.415129876303651]
Model pruning is an effective technique to reduce computation overhead for deep neural networks by removing redundant parameters.
Modern recommendation systems are still thirsty for model capacity due to the demand for handling big data.
We propose a dynamic training scheme, namely alternate model growth and pruning, to alternately construct and prune weights in the course of training.
arXiv Detail & Related papers (2021-05-04T03:14:30Z)
- SpaceNet: Make Free Space For Continual Learning [15.914199054779438]
We propose a novel architecture-based method, referred to as SpaceNet, for the class-incremental learning scenario.
SpaceNet trains sparse deep neural networks from scratch in an adaptive way that compresses the sparse connections of each task in a compact number of neurons.
Experimental results show the robustness of our proposed method against catastrophic forgetting of old tasks and the efficiency of SpaceNet in utilizing the available capacity of the model.
arXiv Detail & Related papers (2020-07-15T11:21:31Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.