DIET: Customized Slimming for Incompatible Networks in Sequential Recommendation
- URL: http://arxiv.org/abs/2406.08804v2
- Date: Sat, 15 Jun 2024 12:57:25 GMT
- Title: DIET: Customized Slimming for Incompatible Networks in Sequential Recommendation
- Authors: Kairui Fu, Shengyu Zhang, Zheqi Lv, Jingyuan Chen, Jiwei Li,
- Abstract summary: Recommender systems start to deploy models on edges to alleviate network congestion caused by frequent mobile requests.
Several studies have leveraged the proximity of edge-side to real-time data, fine-tuning them to create edge-specific models.
These methods require substantial on-edge computational resources and frequent network transfers to keep the model up to date.
We propose a customizeD slImming framework for incompatiblE neTworks (DIET). DIET deploys the same generic backbone (potentially incompatible for a specific edge) to all devices.
- Score: 16.44627200990594
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the continuously improving capabilities of mobile edges, recommender systems start to deploy models on edges to alleviate network congestion caused by frequent mobile requests. Several studies have leveraged the proximity of edge-side to real-time data, fine-tuning them to create edge-specific models. Despite their significant progress, these methods require substantial on-edge computational resources and frequent network transfers to keep the model up to date. The former may disrupt other processes on the edge to acquire computational resources, while the latter consumes network bandwidth, leading to a decrease in user satisfaction. In response to these challenges, we propose a customizeD slImming framework for incompatiblE neTworks (DIET). DIET deploys the same generic backbone (potentially incompatible for a specific edge) to all devices. To minimize frequent bandwidth usage and storage consumption in personalization, DIET tailors specific subnets for each edge based on its past interactions, learning to generate slimming subnets (diets) within incompatible networks for efficient transfer. It also takes the inter-layer relationships into account, empirically reducing inference time while obtaining more suitable diets. We further explore the repeated modules within networks and propose a more storage-efficient framework, DIETING, which utilizes a single layer of parameters to represent the entire network, achieving comparably excellent performance. The experiments across four state-of-the-art datasets and two widely used models demonstrate the superior accuracy in recommendation and efficiency in transmission and storage of our framework.
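The subnet idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the top-k selection rule, the random "importance scores", and the names `topk_binary_mask`, `backbone_weight`, and `edge_scores` are all assumptions made for illustration. It only shows why sending a binary mask over a shared, frozen backbone is cheaper than sending dense weights.

```python
import numpy as np


def topk_binary_mask(scores: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Return a boolean mask keeping the highest-scoring fraction of entries.

    Hypothetical selection rule; the paper learns to generate its subnets.
    """
    k = max(1, int(round(scores.size * keep_ratio)))
    # Value of the k-th largest score; everything at or above it is kept.
    threshold = np.partition(scores.ravel(), -k)[-k]
    return scores >= threshold


rng = np.random.default_rng(0)

# Shared generic backbone layer: stored once on every device and never updated.
backbone_weight = rng.normal(size=(8, 8)).astype(np.float32)

# Assumed per-edge importance scores (random here, purely for illustration).
edge_scores = rng.normal(size=(8, 8))

# The edge-specific subnet is the backbone with unselected weights zeroed out.
mask = topk_binary_mask(edge_scores, keep_ratio=0.25)
subnet_weight = backbone_weight * mask

# Inference on the edge uses only the masked weights.
x = rng.normal(size=(8,)).astype(np.float32)
y = subnet_weight @ x

# Transfer cost per personalization update: 1 bit per weight for the mask
# versus 32 bits per weight for a dense float32 layer.
mask_bits = mask.size
dense_bits = backbone_weight.size * 32
print(f"mask: {mask_bits} bits, dense weights: {dense_bits} bits")
```

Under these assumptions, each personalization update moves only the mask, so the per-layer transfer shrinks by the bit width of a weight.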
Related papers
- Digital Twin-Assisted Data-Driven Optimization for Reliable Edge Caching in Wireless Networks [60.54852710216738]
We introduce a novel digital twin-assisted optimization framework, called D-REC, to ensure reliable caching in nextG wireless networks.
By incorporating reliability modules into a constrained decision process, D-REC can adaptively adjust actions, rewards, and states to comply with advantageous constraints.
arXiv Detail & Related papers (2024-06-29T02:40:28Z)
- Edge-MultiAI: Multi-Tenancy of Latency-Sensitive Deep Learning Applications on Edge [10.067877168224337]
This research aims to overcome the memory contention challenge to meet the latency constraints of the Deep Learning applications.
We propose an efficient NN model management framework, called Edge-MultiAI, that ushers the NN models of the DL applications into the edge memory.
We show that Edge-MultiAI can stimulate the degree of multi-tenancy on the edge by at least 2X and increase the number of warm-starts by around 60% without any major loss on the inference accuracy of the applications.
arXiv Detail & Related papers (2022-11-14T06:17:32Z)
- Communication-Efficient Separable Neural Network for Distributed Inference on Edge Devices [2.28438857884398]
We propose a novel method of exploiting model parallelism to separate a neural network for distributed inferences.
Under proper specifications of devices and configurations of models, our experiments show that the inference of large neural networks on edge clusters can be distributed and accelerated.
arXiv Detail & Related papers (2021-11-03T19:30:28Z)
- Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks: specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
arXiv Detail & Related papers (2021-06-07T11:37:03Z)
- DynaComm: Accelerating Distributed CNN Training between Edges and Clouds through Dynamic Communication Scheduling [11.34309642431225]
We present DynaComm, a novel scheduler that decomposes each transmission procedure into several segments to achieve optimal communications and computations overlapping during run-time.
We verify that DynaComm manages to achieve optimal scheduling for all cases compared to competing strategies while the model accuracy remains untouched.
arXiv Detail & Related papers (2021-01-20T05:09:41Z)
- Adaptive Subcarrier, Parameter, and Power Allocation for Partitioned Edge Learning Over Broadband Channels [69.18343801164741]
Partitioned edge learning (PARTEL) implements parameter-server training, a well-known distributed learning method, in a wireless network.
We consider the case of deep neural network (DNN) models which can be trained using PARTEL by introducing some auxiliary variables.
arXiv Detail & Related papers (2020-10-08T15:27:50Z)
- LSTM Networks for Online Cross-Network Recommendations [33.17802459749589]
Cross-network recommender systems use auxiliary information from multiple source networks to create holistic user profiles and improve recommendations in a target network.
We find two major limitations in existing cross-network solutions that reduce overall recommender performance.
We propose a novel multi-layered Long Short-Term Memory (LSTM) network based online solution to mitigate these issues.
arXiv Detail & Related papers (2020-08-25T07:10:24Z)
- Now that I can see, I can improve: Enabling data-driven finetuning of CNNs on the edge [11.789983276366987]
This paper provides a first step towards enabling CNN finetuning on an edge device based on structured pruning.
It explores the performance gains and costs of doing so and presents an open-source framework that allows the deployment of such approaches.
arXiv Detail & Related papers (2020-06-15T17:16:45Z)
- A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework [56.57225686288006]
Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.
Previous pruning methods mainly focus on reducing the model size and/or improving performance without considering the privacy of user data.
We propose a privacy-preserving-oriented pruning and mobile acceleration framework that does not require the private training dataset.
arXiv Detail & Related papers (2020-03-13T23:52:03Z)
- Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of Partitioned Edge Learning [73.82875010696849]
Machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models.
This paper focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation.
arXiv Detail & Related papers (2020-03-10T05:52:15Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.