On-Device Learning with Cloud-Coordinated Data Augmentation for Extreme
Model Personalization in Recommender Systems
- URL: http://arxiv.org/abs/2201.10382v1
- Date: Mon, 24 Jan 2022 04:59:04 GMT
- Title: On-Device Learning with Cloud-Coordinated Data Augmentation for Extreme
Model Personalization in Recommender Systems
- Authors: Renjie Gu, Chaoyue Niu, Yikai Yan, Fan Wu, Shaojie Tang, Rongfeng Jia,
Chengfei Lyu, Guihai Chen
- Abstract summary: We propose a new device-cloud collaborative learning framework, called CoDA, to break the dilemma between purely cloud-based learning and on-device learning.
CoDA retrieves similar samples from the cloud's global pool to augment each user's local dataset to train the recommendation model.
Online A/B testing results show the remarkable performance improvement of CoDA over both cloud-based learning without model personalization and on-device training without data augmentation.
- Score: 39.41506296601779
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data heterogeneity is an intrinsic property of recommender
systems, making models trained over the global data on the cloud, which is
the mainstream practice in industry, suboptimal for each individual user's
local data distribution. To deal with data heterogeneity, model
personalization with on-device learning is a potential solution. However,
on-device training over a user's small number of local samples incurs
severe overfitting and undermines the model's generalization ability. In
this work, we propose a new device-cloud collaborative learning framework,
called CoDA, to break the dilemma between purely cloud-based learning and
purely on-device learning. The key principle of CoDA is to
retrieve similar samples from the cloud's global pool to augment each user's
local dataset to train the recommendation model. Specifically, after
coarse-grained sample matching on the cloud, a personalized sample
classifier is further trained on each device for fine-grained sample
filtering, which
can learn the boundary between the local data distribution and the outside data
distribution. We also build an end-to-end pipeline to support the flows of
data, model, computation, and control between the cloud and each device. We
have deployed CoDA in a recommendation scenario of Mobile Taobao. Online A/B
testing results show the remarkable performance improvement of CoDA over both
cloud-based learning without model personalization and on-device training
without data augmentation. Overhead testing on a real device demonstrates the
computation, storage, and communication efficiency of the on-device tasks in
CoDA.
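The abstract describes a two-stage augmentation pipeline: coarse-grained
sample matching on the cloud, followed by an on-device classifier that
filters the retrieved candidates. The sketch below shows one plausible way
the two stages compose; every name here (coarse_match, FilterClassifier,
the cosine metric, the logistic-regression filter) is an illustrative
assumption, not the authors' implementation.

    # Illustrative sketch of CoDA's two-stage data augmentation (assumed
    # API, not the authors' code). Stage 1 runs on the cloud: retrieve
    # pool samples similar to the user's data. Stage 2 runs on the
    # device: a small binary classifier learns the boundary between the
    # local and outside distributions and keeps only local-looking
    # candidates.
    import numpy as np

    def coarse_match(global_pool, local_profile, k):
        """Cloud side: return the k pool samples whose features are most
        cosine-similar to the user's average feature vector."""
        feats = np.asarray([s["features"] for s in global_pool])
        q = local_profile / np.linalg.norm(local_profile)
        sims = feats @ q / np.linalg.norm(feats, axis=1)
        return [global_pool[i] for i in np.argsort(-sims)[:k]]

    class FilterClassifier:
        """Device side: logistic regression separating local samples
        (label 1) from retrieved outside samples (label 0)."""
        def __init__(self, dim, lr=0.1, epochs=200):
            self.w, self.b, self.lr, self.epochs = np.zeros(dim), 0.0, lr, epochs

        def fit(self, X, y):
            for _ in range(self.epochs):
                p = 1.0 / (1.0 + np.exp(-(X @ self.w + self.b)))
                g = p - y                        # gradient of the log-loss
                self.w -= self.lr * X.T @ g / len(y)
                self.b -= self.lr * g.mean()

        def local_prob(self, X):
            return 1.0 / (1.0 + np.exp(-(X @ self.w + self.b)))

    def augment_local_dataset(local_X, candidates, threshold=0.5):
        """Train the filter on local-vs-candidate features, then keep the
        candidates the classifier still scores as 'local-like'."""
        cand_X = np.asarray([s["features"] for s in candidates])
        X = np.vstack([local_X, cand_X])
        y = np.concatenate([np.ones(len(local_X)), np.zeros(len(cand_X))])
        clf = FilterClassifier(dim=X.shape[1])
        clf.fit(X, y)
        keep = clf.local_prob(cand_X) >= threshold
        return [c for c, ok in zip(candidates, keep) if ok]

Keeping the on-device filter this small (a linear model over existing
features) is one way to stay within the computation and storage budgets
that the paper's overhead tests emphasize.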
Related papers
- Enhancing Federated Learning Convergence with Dynamic Data Queue and Data Entropy-driven Participant Selection [13.825031686864559]
Federated Learning (FL) is a decentralized approach for collaborative model training on edge devices.
We present a method to improve convergence in FL by creating a global subset of data on the server and dynamically distributing it across devices.
Our approach results in a substantial accuracy boost of approximately 5% for the MNIST dataset, around 18% for CIFAR-10, and 20% for CIFAR-100 with a 10% global subset of data, outperforming the state-of-the-art (SOTA) aggregation algorithms.
arXiv Detail & Related papers (2024-10-23T11:47:04Z)
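The entry above selects participants by data entropy. One plausible
reading, sketched below under assumptions (the exact criterion and the
dynamic data queue are not reproduced), is to prefer clients whose local
label histograms are most balanced, i.e. have the highest Shannon entropy.

    import numpy as np

    def label_entropy(label_counts):
        """Shannon entropy (bits) of a client's label histogram."""
        p = np.asarray(label_counts, dtype=float)
        p = p[p > 0] / p.sum()
        return float(-(p * np.log2(p)).sum())

    def select_participants(client_label_counts, num_selected):
        """Pick the clients with the most balanced local data."""
        scores = {cid: label_entropy(c) for cid, c in client_label_counts.items()}
        return sorted(scores, key=scores.get, reverse=True)[:num_selected]

    # Client "b" has the most uniform label histogram, so it ranks first.
    clients = {"a": [90, 5, 5], "b": [30, 35, 35], "c": [60, 20, 20]}
    print(select_participants(clients, 2))  # ['b', 'c']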
- Efficient Data Distribution Estimation for Accelerated Federated Learning [5.085889377571319]
Federated Learning (FL) is a privacy-preserving machine learning paradigm where a global model is trained in-situ across a large number of distributed edge devices.
Devices are highly heterogeneous in both their system resources and training data.
Various client selection algorithms have been developed, showing promising performance improvement in terms of model coverage and accuracy.
arXiv Detail & Related papers (2024-06-03T20:33:17Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider an FL framework with partial model pruning and personalization to overcome these challenges.
This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
arXiv Detail & Related papers (2023-09-04T21:10:45Z)
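The global/personalized split in the entry above can be pictured with a toy
two-layer model: the shared representation layer is pruned and federated,
while the head never leaves the device. The names and the magnitude-pruning
rule below are assumptions, not the paper's scheme.

    import numpy as np

    rng = np.random.default_rng(0)

    def make_model(in_dim, hidden, out_dim):
        return {
            "global": {"W1": rng.standard_normal((in_dim, hidden))},    # shared, federated
            "personal": {"W2": rng.standard_normal((hidden, out_dim))}, # stays on device
        }

    def prune_global(model, sparsity=0.5):
        """Zero the smallest-magnitude shared weights so the broadcast
        part is cheaper to transmit and run."""
        W = model["global"]["W1"]
        cut = np.quantile(np.abs(W), sparsity)
        model["global"]["W1"] = np.where(np.abs(W) >= cut, W, 0.0)
        return model

    def aggregate_global(models):
        """FedAvg-style averaging applied only to the shared part; each
        device keeps and fine-tunes its own personal head."""
        avg = np.mean([m["global"]["W1"] for m in models], axis=0)
        for m in models:
            m["global"]["W1"] = avg.copy()
        return models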
- Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computationally heterogeneous data.
The proposed aggregation algorithms are extensively analyzed from both theoretical and experimental perspectives.
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
- Cloud-Device Collaborative Adaptation to Continual Changing Environments in the Real-world [20.547119604004774]
We propose a new learning paradigm of Cloud-Device Collaborative Continual Adaptation, which encourages collaboration between cloud and device.
We also propose an Uncertainty-based Visual Prompt Adapted (U-VPA) teacher-student model to transfer the generalization capability of the large model on the cloud to the device model.
Our proposed U-VPA teacher-student framework outperforms previous state-of-the-art test time adaptation and device-cloud collaboration methods.
arXiv Detail & Related papers (2022-12-02T05:02:36Z)
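U-VPA itself relies on visual prompts and an uncertainty mechanism that the
summary above does not detail. As a rough, assumed illustration of the
cloud-device loop only: the device forwards the samples its student model
is uncertain about and distills the cloud teacher's predictions on them.

    import numpy as np

    def prediction_entropy(probs):
        p = np.clip(np.asarray(probs, dtype=float), 1e-9, 1.0)
        return float(-(p * np.log(p)).sum())

    def collect_distillation_targets(student_predict, teacher_predict, batch, tau=0.8):
        """Return (sample, teacher_soft_label) pairs for on-device
        distillation; only uncertain samples are sent to the cloud."""
        targets = []
        for x in batch:
            if prediction_entropy(student_predict(x)) > tau:
                targets.append((x, teacher_predict(x)))
        return targets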
- Online Data Selection for Federated Learning with Limited Storage [53.46789303416799]
Federated Learning (FL) has been proposed to achieve distributed machine learning among networked devices.
The impact of on-device storage on the performance of FL remains unexplored.
In this work, we take the first step to consider the online data selection for FL with limited on-device storage.
arXiv Detail & Related papers (2022-09-01T03:27:33Z)
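For the limited-storage setting above, a common pattern (used here as a
stand-in; the paper's actual selection criterion is not reproduced) is to
keep only the highest-utility samples seen so far in a fixed-size buffer.

    import heapq

    def stream_select(stream, score, capacity):
        """Keep the `capacity` highest-scoring samples from a stream.
        `score(sample)` is any per-sample utility, e.g. a loss or a
        gradient-norm proxy."""
        buffer = []  # min-heap of (score, arrival_index, sample)
        for i, sample in enumerate(stream):
            s = score(sample)
            if len(buffer) < capacity:
                heapq.heappush(buffer, (s, i, sample))
            elif s > buffer[0][0]:
                heapq.heapreplace(buffer, (s, i, sample))
        return [item[2] for item in buffer]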
- Decentralized federated learning of deep neural networks on non-iid data [0.6335848702857039]
We tackle the non-IID problem of learning a personalized deep learning model in a decentralized setting.
We propose a method named Performance-Based Neighbor Selection (PENS) where clients with similar data detect each other and cooperate.
PENS is able to achieve higher accuracies as compared to strong baselines.
arXiv Detail & Related papers (2021-07-18T19:05:44Z)
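A hedged sketch of the neighbor-selection idea behind PENS (details
assumed): each client scores peers by how well their models fit its own
local data, keeps the best-scoring peers as neighbors, and averages
parameters with them only.

    import numpy as np

    def select_neighbors(my_loss_on, peer_models, top_k):
        """`my_loss_on(model)` evaluates a peer model on this client's
        data; a low loss suggests a similar data distribution."""
        losses = {pid: my_loss_on(m) for pid, m in peer_models.items()}
        return sorted(losses, key=losses.get)[:top_k]

    def average_with_neighbors(my_params, peer_models, neighbors):
        """Average this client's parameters with its selected neighbors."""
        stack = [my_params] + [peer_models[pid] for pid in neighbors]
        return np.mean(stack, axis=0)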
- Federated Learning with Downlink Device Selection [92.14944020945846]
We study federated edge learning, where a global model is trained collaboratively using privacy-sensitive data at the edge of a wireless network.
A parameter server (PS) keeps track of the global model and shares it with the wireless edge devices for training using their private local data.
We consider device selection based on downlink channels over which the PS shares the global model with the devices.
arXiv Detail & Related papers (2021-07-07T22:42:39Z)
- Federated Visual Classification with Real-World Data Distribution [9.564468846277366]
We characterize the effect real-world data distributions have on distributed learning, using as a benchmark the standard Federated Averaging (FedAvg) algorithm.
We introduce two new large-scale datasets for species and landmark classification, with realistic per-user data splits.
We also develop two new algorithms (FedVC, FedIR) that intelligently resample and reweight over the client pool, bringing large improvements in accuracy and stability in training.
arXiv Detail & Related papers (2020-03-18T07:55:49Z)
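As I read the FedIR part of the entry above, client updates are reweighted
toward the global label distribution; the sketch below scales each
sample's loss by the ratio of global to local label priors (the exact
estimator used in the paper is an assumption here).

    import numpy as np

    def importance_weights(global_counts, client_counts):
        """Per-class weight = global label prior / client label prior."""
        p = np.asarray(global_counts, dtype=float); p /= p.sum()
        q = np.asarray(client_counts, dtype=float); q /= q.sum()
        return p / np.maximum(q, 1e-9)

    def reweighted_loss(per_sample_losses, labels, weights):
        """Scale each sample's loss by its class weight."""
        w = weights[np.asarray(labels)]
        return float((w * np.asarray(per_sample_losses)).mean())

    # A client over-representing class 0 has class-0 losses down-weighted
    # and class-1 losses up-weighted:
    print(importance_weights([50, 50], [80, 20]))  # [0.625 2.5]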