FedUD: Exploiting Unaligned Data for Cross-Platform Federated Click-Through Rate Prediction
- URL: http://arxiv.org/abs/2407.18472v1
- Date: Fri, 26 Jul 2024 02:48:32 GMT
- Title: FedUD: Exploiting Unaligned Data for Cross-Platform Federated Click-Through Rate Prediction
- Authors: Wentao Ouyang, Rui Dong, Ri Tao, Xiangzheng Liu
- Abstract summary: Click-through rate (CTR) prediction plays an important role in online advertising platforms.
Due to privacy concerns, data from different platforms cannot be uploaded to a server for centralized model training.
We propose FedUD, which is able to exploit unaligned data, in addition to aligned data, for more accurate federated CTR prediction.
- Score: 3.221675775415278
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Click-through rate (CTR) prediction plays an important role in online advertising platforms. Most existing methods use data from the advertising platform itself for CTR prediction. As user behaviors also exist on many other platforms, e.g., media platforms, it is beneficial to further exploit such complementary information for better modeling user interest and for improving CTR prediction performance. However, due to privacy concerns, data from different platforms cannot be uploaded to a server for centralized model training. Vertical federated learning (VFL) provides a possible solution which is able to keep the raw data on respective participating parties and learn a collaborative model in a privacy-preserving way. However, traditional VFL methods only utilize aligned data with common keys across parties, which strongly restricts their application scope. In this paper, we propose FedUD, which is able to exploit unaligned data, in addition to aligned data, for more accurate federated CTR prediction. FedUD contains two steps. In the first step, FedUD utilizes aligned data across parties like traditional VFL, but it additionally includes a knowledge distillation module. This module distills useful knowledge from the guest party's high-level representations and guides the learning of a representation transfer network. In the second step, FedUD applies the learned knowledge to enrich the representations of the host party's unaligned data such that both aligned and unaligned data can contribute to federated model training. Experiments on two real-world datasets demonstrate the superior performance of FedUD for federated CTR prediction.
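The abstract describes a two-step procedure: distill the guest party's high-level representations into a representation transfer network on aligned data, then use that network to synthesize pseudo-guest representations for the host party's unaligned data. A minimal PyTorch-style sketch of this idea follows; all module shapes, names, and loss weightings are illustrative assumptions, not details from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative dimensions; the paper does not specify architectures.
HOST_DIM, GUEST_DIM, REPR_DIM = 32, 32, 16

host_encoder = nn.Sequential(nn.Linear(HOST_DIM, REPR_DIM), nn.ReLU())
guest_encoder = nn.Sequential(nn.Linear(GUEST_DIM, REPR_DIM), nn.ReLU())
# Representation transfer network: predicts a guest-like representation
# from host features alone.
transfer_net = nn.Sequential(nn.Linear(HOST_DIM, REPR_DIM), nn.ReLU())
ctr_head = nn.Linear(2 * REPR_DIM, 1)

def step1_aligned_loss(x_host, x_guest, y):
    """Step 1: train on aligned records and distill the guest party's
    high-level representation into transfer_net."""
    h = host_encoder(x_host)
    g = guest_encoder(x_guest)  # computed on the guest side in real VFL
    logit = ctr_head(torch.cat([h, g], dim=1)).squeeze(1)
    ctr_loss = F.binary_cross_entropy_with_logits(logit, y.float())
    # Knowledge distillation: mimic the guest representation.
    distill_loss = F.mse_loss(transfer_net(x_host), g.detach())
    return ctr_loss + distill_loss  # loss weighting omitted for brevity

def step2_unaligned_loss(x_host, y):
    """Step 2: enrich unaligned host records with a pseudo-guest
    representation so they can also contribute to training."""
    h = host_encoder(x_host)
    g_hat = transfer_net(x_host)  # stands in for the missing guest features
    logit = ctr_head(torch.cat([h, g_hat], dim=1)).squeeze(1)
    return F.binary_cross_entropy_with_logits(logit, y.float())
```

In a real VFL deployment the guest's raw features and encoder never leave the guest party; only the cut-layer representation `g` is exchanged during step 1, which is what makes the distillation target available on the host side.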
Related papers
- Self-Comparison for Dataset-Level Membership Inference in Large (Vision-)Language Models [73.94175015918059]
We propose a dataset-level membership inference method based on Self-Comparison.
Our method does not require access to ground-truth member data or non-member data drawn from the same distribution.
arXiv Detail & Related papers (2024-10-16T23:05:59Z)
- Vertical Federated Learning Hybrid Local Pre-training [4.31644387824845]
We propose a novel Vertical Federated Learning Hybrid Local Pre-training (VFLHLP) approach.
VFLHLP first pre-trains local networks on the local data of participating parties.
It then uses these pre-trained networks to adjust the sub-model of the labeled party, or to enhance representation learning for the other parties, during downstream federated learning on aligned data.
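A minimal sketch of the local pre-training idea summarized above, assuming a reconstruction objective and illustrative dimensions (the paper's actual pre-training task may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# One party's local sub-network; dimensions are illustrative.
encoder = nn.Sequential(nn.Linear(32, 16), nn.ReLU())
decoder = nn.Linear(16, 32)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))

def local_pretrain_step(x_local):
    """Pre-train on a party's full local data, which is typically much
    larger than the aligned subset used in downstream VFL."""
    loss = F.mse_loss(decoder(encoder(x_local)), x_local)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Downstream federated learning on aligned data then starts each party's
# sub-model from its pre-trained encoder instead of from random weights.
```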
arXiv Detail & Related papers (2024-05-20T08:57:39Z)
- Benchmarking FedAvg and FedCurv for Image Classification Tasks [1.376408511310322]
This paper focuses on the problem of statistical heterogeneity of the data in the same federated network.
Several federated learning algorithms, such as FedAvg, FedProx, and Federated Curvature (FedCurv), have already been proposed.
As a side product of this work, we release non-IID versions of the datasets we used, to facilitate further comparisons within the FL community.
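For reference, FedAvg aggregates client updates by a data-size-weighted average of model parameters; a minimal sketch (the helper name and state_dict-style inputs are assumptions):

```python
import torch

def fedavg(client_states, client_sizes):
    """Data-size-weighted average of client model state_dicts (FedAvg).
    `client_states` is a list of state_dicts with identical keys;
    floating-point parameters are assumed throughout."""
    total = float(sum(client_sizes))
    return {
        key: sum(state[key] * (n / total)
                 for state, n in zip(client_states, client_sizes))
        for key in client_states[0]
    }

# Usage: global_state = fedavg([m.state_dict() for m in client_models],
#                              [len(d) for d in client_datasets])
#        global_model.load_state_dict(global_state)
```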
arXiv Detail & Related papers (2023-03-31T10:13:01Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and split learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and backpropagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
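A sketch of what online knowledge distillation with a contrastive loss can look like, assuming an InfoNCE-style objective over batch-aligned representations (the paper's exact loss may differ):

```python
import torch
import torch.nn.functional as F

def contrastive_distill_loss(z_local, z_peer, temperature=0.1):
    """InfoNCE-style objective: for each sample in the batch, the peer's
    representation of the same sample is the positive; all other peer
    representations in the batch serve as negatives."""
    z_local = F.normalize(z_local, dim=1)
    z_peer = F.normalize(z_peer, dim=1)
    logits = z_local @ z_peer.t() / temperature  # (B, B) cosine similarities
    targets = torch.arange(z_local.size(0))      # positives on the diagonal
    return F.cross_entropy(logits, targets)
```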
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- FairVFL: A Fair Vertical Federated Learning Framework with Contrastive Adversarial Learning [102.92349569788028]
We propose a fair vertical federated learning framework (FairVFL) to improve the fairness of VFL models.
The core idea of FairVFL is to learn unified and fair representations of samples based on the decentralized feature fields in a privacy-preserving way.
To protect user privacy, we propose a contrastive adversarial learning method to remove private information from the unified representation on the server.
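One common way to implement adversarial removal of private information from a representation is a gradient reversal layer; the sketch below shows that generic building block, not necessarily FairVFL's exact design:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated gradient in the backward pass.
    Training an adversary to predict private attributes through this layer
    pushes the upstream encoder to remove that information."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

# private_logits = adversary(GradReverse.apply(unified_representation))
# The adversary minimizes its prediction loss, while the reversed gradient
# trains the encoder to defeat it.
```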
arXiv Detail & Related papers (2022-06-07T11:43:32Z)
- Adversarial Representation Sharing: A Quantitative and Secure Collaborative Learning Framework [3.759936323189418]
We find that representation learning has unique advantages in collaborative learning due to its lower communication overhead and task independence.
We present ARS, a collaborative learning framework wherein users share representations of data to train models.
We demonstrate that our mechanism is effective against model inversion attacks, and achieves a balance between privacy and utility.
arXiv Detail & Related papers (2022-03-27T13:29:15Z)
- Game of Privacy: Towards Better Federated Platform Collaboration under Privacy Restriction [95.12382372267724]
Vertical federated learning (VFL) aims to train models from cross-silo data with different feature spaces stored on different platforms.
Due to the intrinsic privacy risks of federated learning, the total amount of involved data may be constrained.
We propose to incentivize different platforms through reciprocal collaboration, where all platforms can exploit multi-platform information within the VFL framework to benefit their own tasks.
arXiv Detail & Related papers (2022-02-10T16:45:40Z)
- Federated Doubly Stochastic Kernel Learning for Vertically Partitioned Data [93.76907759950608]
We propose a federated doubly stochastic kernel learning (FDSKL) algorithm for vertically partitioned data.
We show that FDSKL is significantly faster than state-of-the-art federated learning methods when dealing with kernels.
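Doubly stochastic kernel methods sample both data points and random features of the kernel at each gradient step. A sketch of the random Fourier feature map for an RBF kernel, the standard ingredient such methods build on (parameter names are illustrative):

```python
import numpy as np

def random_fourier_features(x, n_features=128, gamma=1.0, seed=0):
    """Monte-Carlo feature map phi(x) such that phi(x) @ phi(y) approximates
    the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2) in expectation."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=np.sqrt(2.0 * gamma), size=(x.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(x @ w + b)

# A "doubly stochastic" gradient step then resamples both a data mini-batch
# and a fresh block of random features (w, b) at every update.
```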
arXiv Detail & Related papers (2020-08-14T05:46:56Z)
- FedOCR: Communication-Efficient Federated Learning for Scene Text Recognition [76.26472513160425]
We study how to make use of decentralized datasets for training a robust scene text recognizer.
To make FedOCR suitable for deployment on end devices, we make two improvements: using lightweight models and hashing techniques.
arXiv Detail & Related papers (2020-07-22T14:30:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.