Federated Doubly Stochastic Kernel Learning for Vertically Partitioned
Data
- URL: http://arxiv.org/abs/2008.06197v1
- Date: Fri, 14 Aug 2020 05:46:56 GMT
- Title: Federated Doubly Stochastic Kernel Learning for Vertically Partitioned
Data
- Authors: Bin Gu, Zhiyuan Dang, Xiang Li, Heng Huang
- Abstract summary: We propose a doubly kernel learning algorithm for vertically partitioned data.
We show that FDSKL is significantly faster than state-of-the-art federated learning methods when dealing with kernels.
- Score: 93.76907759950608
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a lot of real-world data mining and machine learning applications, data
are provided by multiple providers and each maintains private records of
different feature sets about common entities. It is challenging to train these
vertically partitioned data effectively and efficiently while keeping data
privacy for traditional data mining and machine learning algorithms. In this
paper, we focus on nonlinear learning with kernels, and propose a federated
doubly stochastic kernel learning (FDSKL) algorithm for vertically partitioned
data. Specifically, we use random features to approximate the kernel mapping
function and use doubly stochastic gradients to update the solutions, which are
all computed federatedly without the disclosure of data. Importantly, we prove
that FDSKL has a sublinear convergence rate, and can guarantee the data
security under the semi-honest assumption. Extensive experimental results on a
variety of benchmark datasets show that FDSKL is significantly faster than
state-of-the-art federated learning methods when dealing with kernels, while
retaining similar generalization performance.
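As a rough illustration of the doubly stochastic idea, the following is a minimal single-machine sketch that pairs random Fourier features with functional stochastic gradient updates for squared loss. In FDSKL these updates are computed federatedly across parties holding disjoint feature subsets without disclosing data; the sketch below omits all of that coordination, and the kernel bandwidth, step-size schedule, and regularization constant are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

# Minimal single-machine sketch of doubly stochastic kernel learning
# with random Fourier features (RBF kernel). In FDSKL the same kind of
# update is computed federatedly over vertically partitioned features;
# here everything runs locally, and hyperparameters are illustrative.

def random_feature(x, seed, sigma=1.0):
    """One random Fourier feature phi_omega(x) for the RBF kernel.
    Reusing the seed regenerates the same (omega, b) at prediction
    time, so only seeds and scalar coefficients need to be stored."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(0.0, 1.0 / sigma, size=x.shape[-1])
    b = rng.uniform(0.0, 2.0 * np.pi)
    return np.sqrt(2.0) * np.cos(x @ omega + b)

def train(X, y, T=2000, lam=1e-4, eta0=1.0):
    """Doubly stochastic functional gradient descent for squared loss:
    each step samples a data point (first source of randomness) and
    draws a fresh random feature indexed by t (second source)."""
    n = len(y)
    coef = np.zeros(T)  # one coefficient per random feature
    for t in range(T):
        i = np.random.randint(n)
        # current prediction f(x_i) from the features drawn so far
        f_xi = sum(coef[s] * random_feature(X[i], s) for s in range(t))
        grad = f_xi - y[i]            # d/df of 0.5 * (f(x) - y)^2
        eta = eta0 / (1.0 + t)        # decaying step size
        coef[:t] *= 1.0 - eta * lam   # shrink old coefficients (regularization)
        coef[t] = -eta * grad * random_feature(X[i], t)
    return coef

def predict(coef, x):
    return sum(coef[s] * random_feature(x, s) for s in range(len(coef)))
```

Storing seeds instead of the sampled (omega, b) pairs keeps memory at one scalar per iteration, which is the standard trick behind doubly stochastic gradients.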
Related papers
- An Empirical Study of Efficiency and Privacy of Federated Learning
Algorithms [2.994794762377111]
In today's world, the rapid expansion of IoT networks and the proliferation of smart devices have resulted in the generation of substantial amounts of heterogeneous data.
To handle this data effectively, advanced data processing technologies are necessary to guarantee the preservation of both privacy and efficiency.
Federated learning emerged as a distributed learning method that trains models locally and aggregates them on a server to preserve data privacy.
arXiv Detail & Related papers (2023-12-24T00:13:41Z)
- Factor-Assisted Federated Learning for Personalized Optimization with Heterogeneous Data [6.024145412139383]
Federated learning is an emerging distributed machine learning framework aiming at protecting data privacy.
Data in different clients contain both common knowledge and personalized knowledge.
We develop a novel personalized federated learning framework for heterogeneous data, which we refer to as FedSplit.
arXiv Detail & Related papers (2023-12-07T13:05:47Z)
- Benchmarking FedAvg and FedCurv for Image Classification Tasks [1.376408511310322]
This paper focuses on the problem of statistical heterogeneity of the data in the same federated network.
Several federated learning algorithms, such as FedAvg, FedProx, and Federated Curvature (FedCurv), have already been proposed.
As a side product of this work, we release the non-IID versions of the datasets we used, to facilitate further comparisons from the FL community.
arXiv Detail & Related papers (2023-03-31T10:13:01Z)
- Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that the issue of data heterogeneity in current setups is not necessarily a problem; in fact, it can be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- Kernel Continual Learning [117.79080100313722]
Kernel continual learning is a simple but effective variant of continual learning that tackles catastrophic forgetting.
An episodic memory unit stores a subset of samples for each task, used to learn task-specific classifiers based on kernel ridge regression.
Variational random features are used to learn a data-driven kernel for each task.
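To make the kernel ridge regression component concrete, here is a minimal sketch of a per-task classifier fit on an episodic memory of stored samples. A fixed RBF kernel stands in for the paper's learned variational kernel, and all names and hyperparameters are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # pairwise squared distances, then the Gaussian kernel
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

class TaskClassifier:
    """Per-task classifier via kernel ridge regression on an episodic
    memory, in the spirit of kernel continual learning."""
    def __init__(self, X_mem, y_mem, lam=1e-3):
        self.X_mem = X_mem
        K = rbf_kernel(X_mem, X_mem)
        self.alpha = np.linalg.solve(K + lam * np.eye(len(X_mem)), y_mem)

    def predict(self, X):
        return rbf_kernel(X, self.X_mem) @ self.alpha
```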
arXiv Detail & Related papers (2021-07-12T22:09:30Z)
- Privacy-Preserving Asynchronous Federated Learning Algorithms for Multi-Party Vertically Collaborative Learning [151.47900584193025]
We propose an asynchronous federated SGD (AFSGD-VP) algorithm and its SVRG and SAGA variants on vertically partitioned data.
To the best of our knowledge, AFSGD-VP and its SVRG and SAGA variants are the first asynchronous federated learning algorithms for vertically partitioned data.
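The vertically partitioned structure can be illustrated with one synchronous SGD step on least squares: each party holds a feature block and its weight block, and only scalar partial predictions are exchanged. This is a simplified sketch, not AFSGD-VP itself; the asynchrony, variance reduction (SVRG/SAGA), and privacy machinery are all omitted.

```python
import numpy as np

def vp_sgd_step(parts, ws, y_i, i, eta=0.1):
    """One synchronous SGD step on vertically partitioned least squares.
    Party k holds feature block X_k and weight block w_k; only the
    scalar partial predictions w_k^T x_{k,i} are exchanged, never the
    raw features. (AFSGD-VP additionally runs asynchronously and adds
    privacy protections, both omitted here.)"""
    partials = [X_k[i] @ w_k for X_k, w_k in zip(parts, ws)]
    resid = sum(partials) - y_i      # aggregate prediction minus label
    for X_k, w_k in zip(parts, ws):
        w_k -= eta * resid * X_k[i]  # each party updates its own block
    return ws
```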
arXiv Detail & Related papers (2020-08-14T08:08:15Z)
- Multi-Center Federated Learning [62.57229809407692]
This paper proposes a novel multi-center aggregation mechanism for federated learning.
It learns multiple global models from the non-IID user data and simultaneously derives the optimal matching between users and centers.
Our experimental results on benchmark datasets show that our method outperforms several popular federated learning methods.
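A k-means-style reading of the aggregation step: alternate between assigning each client's model to its nearest center (global model) and re-averaging each center over its assigned clients. This is a hypothetical simplification of the paper's objective, with all names and constants made up for illustration.

```python
import numpy as np

def multi_center_aggregate(client_ws, K=3, iters=10, seed=0):
    """Cluster flattened client models into K global models by
    alternating nearest-center assignment and per-center averaging
    (a k-means-style sketch of multi-center aggregation)."""
    rng = np.random.default_rng(seed)
    W = np.stack(client_ws)                  # (n_clients, dim)
    centers = W[rng.choice(len(W), K, replace=False)]
    for _ in range(iters):
        dists = ((W[:, None] - centers[None]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)        # match each user to a center
        for k in range(K):
            if np.any(assign == k):
                centers[k] = W[assign == k].mean(axis=0)
    return centers, assign
```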
arXiv Detail & Related papers (2020-05-03T09:14:31Z)
- Federated Visual Classification with Real-World Data Distribution [9.564468846277366]
We characterize the effect real-world data distributions have on distributed learning, using as a benchmark the standard Federated Averaging (FedAvg) algorithm.
We introduce two new large-scale datasets for species and landmark classification, with realistic per-user data splits.
We also develop two new algorithms (FedVC, FedIR) that intelligently resample and reweight over the client pool, bringing large improvements in accuracy and stability in training.
arXiv Detail & Related papers (2020-03-18T07:55:49Z)
- DP-MERF: Differentially Private Mean Embeddings with Random Features for Practical Privacy-Preserving Data Generation [11.312036995195594]
We propose a differentially private data generation paradigm using random feature representations of kernel mean embeddings.
We exploit the random feature representations for two important benefits.
Our algorithm achieves drastically better privacy-utility trade-offs than existing methods when tested on several datasets.
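The core released statistic can be sketched as a mean of bounded random Fourier features over the private data, perturbed once with Gaussian noise. Generator training against this target is omitted, and the noise calibration shown is a simplified assumption rather than the paper's exact privacy accounting.

```python
import numpy as np

def rff(X, W, b):
    """Random Fourier features; each row has L2 norm at most sqrt(2)."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def private_mean_embedding(X, D=500, noise_mult=1.0, seed=0):
    """Release the mean random-feature embedding of private data X with
    Gaussian noise (a simplified sketch of the DP-MERF target statistic)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    mu = rff(X, W, b).mean(axis=0)
    sens = 2.0 * np.sqrt(2.0) / len(X)   # L2 sensitivity of the mean
    return mu + rng.normal(0.0, noise_mult * sens, size=D)
```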
arXiv Detail & Related papers (2020-02-26T16:41:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.