Privacy-preserving Logistic Regression with Secret Sharing
- URL: http://arxiv.org/abs/2105.06869v1
- Date: Fri, 14 May 2021 14:53:50 GMT
- Title: Privacy-preserving Logistic Regression with Secret Sharing
- Authors: Ali Reza Ghavamipour, Fatih Turkmen, Xiaoqian Jian
- Abstract summary: We propose secret sharing-based privacy-preserving logistic regression protocols using the Newton-Raphson method.
Our implementation results show that our improved method can handle the large datasets used to securely train a logistic regression model from multiple sources.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Logistic regression (LR) is a widely used classification method for modeling
binary outcomes in many medical data classification tasks. Research that
collects and combines datasets from various data custodians and jurisdictions
can benefit greatly from the increased statistical power to support its
analysis goals. However, combining data from these various sources creates
significant privacy concerns that need to be addressed. In this paper, we
propose secret sharing-based privacy-preserving logistic regression protocols
using the Newton-Raphson method. Our proposed approaches are based on secure
Multi-Party Computation (MPC) with different security settings to analyze data
owned by several data holders. We conducted experiments on both synthetic data
and real-world datasets and compared their efficiency and accuracy with
those of an ordinary logistic regression model. Experimental results
demonstrate that the proposed protocols are highly efficient and accurate. This
study introduces iterative algorithms to simplify the federated training of a
logistic regression model in a privacy-preserving manner. Our implementation
results show that our improved method can handle the large datasets used to
securely train a logistic regression model from multiple sources.
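To make the idea in the abstract more concrete, here is a minimal Python sketch of Newton-Raphson logistic regression in which the per-party gradients and Hessians are combined through additive secret sharing. This is not the authors' protocol: it is a simplified illustration in which each party computes its local statistics in the clear and only the aggregated sums are reconstructed, so those sums are revealed to the participants. The names `PRIME`, `SCALE`, `encode`, `decode`, `share`, `secure_sum`, `local_stats`, and `train`, as well as the modulus and fixed-point scale, are assumptions made for this example.

```python
import secrets
import numpy as np

PRIME = 2**61 - 1   # modulus for the shares (assumed for this sketch)
SCALE = 2**24       # fixed-point scaling factor (assumed for this sketch)

def encode(x):
    """Map a float array to integers mod PRIME (fixed point, negatives wrap)."""
    return [int(round(v * SCALE)) % PRIME for v in np.ravel(x)]

def decode(vals, shape):
    """Inverse of encode: integers mod PRIME back to a float array."""
    signed = [v - PRIME if v > PRIME // 2 else v for v in vals]
    return np.array(signed, dtype=float).reshape(shape) / SCALE

def share(vals, n):
    """Split each value into n additive shares that sum to it mod PRIME."""
    shares = [[secrets.randbelow(PRIME) for _ in vals] for _ in range(n - 1)]
    last = [(v - sum(s[i] for s in shares)) % PRIME for i, v in enumerate(vals)]
    return shares + [last]

def secure_sum(arrays):
    """Add the parties' arrays by combining shares; only the total is decoded."""
    n, shape, size = len(arrays), arrays[0].shape, arrays[0].size
    # all_shares[i][j] is the share that party i sends to party j.
    all_shares = [share(encode(a), n) for a in arrays]
    # Each party j adds up the shares it received.
    partial = [[sum(all_shares[i][j][k] for i in range(n)) % PRIME
                for k in range(size)] for j in range(n)]
    # The partial sums are combined to reconstruct the total.
    total = [sum(p[k] for p in partial) % PRIME for k in range(size)]
    return decode(total, shape)

def local_stats(X, y, beta):
    """Gradient and Hessian terms of the log-likelihood on one party's data."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (y - p)
    hess = X.T @ (X * (p * (1.0 - p))[:, None])
    return grad, hess

def train(parties, iters=10):
    """Newton-Raphson updates using securely summed gradients and Hessians."""
    beta = np.zeros(parties[0][0].shape[1])
    for _ in range(iters):
        stats = [local_stats(X, y, beta) for X, y in parties]
        grad = secure_sum([g for g, _ in stats])
        hess = secure_sum([h for _, h in stats])
        beta = beta + np.linalg.solve(hess, grad)
    return beta
```

Here `parties` is a list of `(X, y)` pairs, one per data holder, where `X` is a feature matrix (including an intercept column if desired) and `y` holds 0/1 labels. The fixed-point encoding limits the magnitude and precision of the values that can be summed, so the result matches a pooled plaintext fit only up to a small rounding error.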
Related papers
- TRIAGE: Characterizing and auditing training data for improved regression [80.11415390605215]
We introduce TRIAGE, a novel data characterization framework tailored to regression tasks and compatible with a broad class of regressors.
TRIAGE utilizes conformal predictive distributions to provide a model-agnostic scoring method, the TRIAGE score.
We show that TRIAGE's characterization is consistent and highlight its utility to improve performance via data sculpting/filtering, in multiple regression settings.
arXiv Detail & Related papers (2023-10-29T10:31:59Z)
- Online Efficient Secure Logistic Regression based on Function Secret Sharing [15.764294489590041]
We propose an online efficient protocol for privacy-preserving logistic regression based on Function Secret Sharing (FSS).
Our protocols are designed in the two non-colluding servers setting and assume the existence of a third-party dealer.
We propose accurate and MPC-friendly alternatives to the sigmoid function and encapsulate the logistic regression training process into a function secret sharing gate.
arXiv Detail & Related papers (2023-09-18T04:50:54Z)
- Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
But acquiring source data is challenging due to concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with far fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
- Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computationally heterogeneous data.
The proposed aggregation algorithms are extensively analyzed from both a theoretical and an experimental perspective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- Efficient Logistic Regression with Local Differential Privacy [0.0]
Internet of Things devices are expanding rapidly and generating huge amounts of data.
There is an increasing need to explore data collected from these devices.
Collaborative learning provides a strategic solution for the Internet of Things settings but also raises public concern over data privacy.
arXiv Detail & Related papers (2022-02-05T22:44:03Z)
- Data Fusion with Latent Map Gaussian Processes [0.0]
Multi-fidelity modeling and calibration are data fusion tasks that ubiquitously arise in engineering design.
We introduce a novel approach based on latent-map Gaussian processes (LMGPs) that enables efficient and accurate data fusion.
arXiv Detail & Related papers (2021-12-04T00:54:19Z)
- Relationship-aware Multivariate Sampling Strategy for Scientific Simulation Data [4.2855912967712815]
In this work, we propose a multivariate sampling strategy which preserves the original variable relationships.
Our proposed strategy utilizes principal component analysis to capture the variance of multivariate data and can be built on top of any existing state-of-the-art sampling algorithms for single variables.
arXiv Detail & Related papers (2020-08-31T00:52:17Z)
- Two-step penalised logistic regression for multi-omic data with an application to cardiometabolic syndrome [62.997667081978825]
We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately.
Our approach should be preferred if the goal is to select as many relevant predictors as possible.
Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level.
arXiv Detail & Related papers (2020-08-01T10:36:27Z)
- High Performance Logistic Regression for Privacy-Preserving Genome Analysis [15.078027648304117]
We present a secure logistic regression training protocol and its implementation, with a new subprotocol to securely compute the activation function.
We present the fastest existing secure Multi-Party Computation implementation for training logistic regression models on high dimensional genome data distributed across a local area network.
arXiv Detail & Related papers (2020-02-13T07:37:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.