Related papers: Privacy-preserving Logistic Regression with Secret Sharing

Privacy-preserving Logistic Regression with Secret Sharing

URL: http://arxiv.org/abs/2105.06869v1
Date: Fri, 14 May 2021 14:53:50 GMT
Title: Privacy-preserving Logistic Regression with Secret Sharing
Authors: Ali Reza Ghavamipour, Fatih Turkmen, Xiaoqian Jian
Abstract summary: We propose secret sharing-based privacy-preserving logistic regression protocols using the Newton-Raphson method. Our implementation results show that our improved method can handle large datasets used in securely training a logistic regression from multiple sources.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Logistic regression (LR) is a widely used classification method for modeling binary outcomes in many medical data classification tasks. Research that collects and combines datasets from various data custodians and jurisdictions can excessively benefit from the increased statistical power to support their analyzing goals. However, combining data from these various sources creates significant privacy concerns that need to be addressed. In this paper, we proposed secret sharing-based privacy-preserving logistic regression protocols using the Newton-Raphson method. Our proposed approaches are based on secure Multi-Party Computation (MPC) with different security settings to analyze data owned by several data holders. We conducted experiments on both synthetic data and real-world datasets and compared the efficiency and accuracy of them with those of an ordinary logistic regression model. Experimental results demonstrate that the proposed protocols are highly efficient and accurate. This study introduces iterative algorithms to simplify the federated training a logistic regression model in a privacy-preserving manner. Our implementation results show that our improved method can handle large datasets used in securely training a logistic regression from multiple sources.

Related papers

Federated Online Learning for Heterogeneous Multisource Streaming Data [0.0]
Federated learning has emerged as an essential paradigm for distributed multi-source data analysis under privacy concerns.<n>In this paper, we propose a federated online learning (FOL) method for distributed multi-source streaming data analysis.
arXiv Detail & Related papers (2025-08-08T19:08:53Z)
FedAWA: Adaptive Optimization of Aggregation Weights in Federated Learning Using Client Vectors [50.131271229165165]
Federated Learning (FL) has emerged as a promising framework for distributed machine learning. Data heterogeneity resulting from differences across user behaviors, preferences, and device characteristics poses a significant challenge for federated learning. We propose Adaptive Weight Aggregation (FedAWA), a novel method that adaptively adjusts aggregation weights based on client vectors during the learning process.
arXiv Detail & Related papers (2025-03-20T04:49:40Z)
DUPRE: Data Utility Prediction for Efficient Data Valuation [49.60564885180563]
Cooperative game theory-based data valuation, such as Data Shapley, requires evaluating the data utility and retraining the ML model for multiple data subsets.<n>Our framework, textttDUPRE, takes an alternative yet complementary approach that reduces the cost per subset evaluation by predicting data utilities instead of evaluating them by model retraining.<n>Specifically, given the evaluated data utilities of some data subsets, textttDUPRE fits a emphGaussian process (GP) regression model to predict the utility of every other data subset.
arXiv Detail & Related papers (2025-02-22T08:53:39Z)
Diversity as a Reward: Fine-Tuning LLMs on a Mixture of Domain-Undetermined Data [54.3895971080712]
Fine-tuning large language models (LLMs) using diverse datasets is crucial for enhancing their overall performance across various domains.<n>We propose a new method that gives the LLM a dual identity: an output model to cognitively probe and select data based on diversity reward, as well as an input model to be tuned with the selected data.
arXiv Detail & Related papers (2025-02-05T17:21:01Z)
A Two-Stage Federated Learning Approach for Industrial Prognostics Using Large-Scale High-Dimensional Signals [1.2277343096128712]
Industrial prognostics aims to develop data-driven methods that leverage high-dimensional degradation signals from assets to predict their failure times. In practice, individual organizations often lack sufficient data to independently train reliable prognostic models. This article proposes a statistical learning-based federated model that enables multiple organizations to jointly train a prognostic model.
arXiv Detail & Related papers (2024-10-14T21:26:22Z)
PP-GWAS: Privacy Preserving Multi-Site Genome-wide Association Studies [2.516577526761521]
We present a novel algorithm PP-GWAS designed to improve upon existing standards in terms of computational efficiency and scalability without sacrificing data privacy. Experimental evaluation with real world and synthetic data indicates that PP-GWAS can achieve computational speeds twice as fast as similar state-of-the-art algorithms. We have assessed its performance using various datasets, emphasizing its potential in facilitating more efficient and private genomic analyses.
arXiv Detail & Related papers (2024-10-10T17:07:57Z)
LFFR: Logistic Function For (multi-output) Regression [0.0]
We build upon previous work on privacy-preserving regression to address multi-output regression problems. We adapt our novel LFFR algorithm, initially designed for single-output logistic regression, to handle multiple outputs. Evaluations on multiple real-world datasets demonstrate the effectiveness of our multi-output LFFR algorithm.
arXiv Detail & Related papers (2024-07-30T20:52:38Z)
TRIAGE: Characterizing and auditing training data for improved regression [80.11415390605215]
We introduce TRIAGE, a novel data characterization framework tailored to regression tasks and compatible with a broad class of regressors. TRIAGE utilizes conformal predictive distributions to provide a model-agnostic scoring method, the TRIAGE score. We show that TRIAGE's characterization is consistent and highlight its utility to improve performance via data sculpting/filtering, in multiple regression settings.
arXiv Detail & Related papers (2023-10-29T10:31:59Z)
Online Efficient Secure Logistic Regression based on Function Secret Sharing [15.764294489590041]
We propose an online efficient protocol for privacy-preserving logistic regression based on Function Secret Sharing (FSS) Our protocols are designed in the two non-colluding servers setting and assume the existence of a third-party dealer. We propose accurate and MPC-friendly alternatives to the sigmoid function and encapsulate the logistic regression training process into a function secret sharing gate.
arXiv Detail & Related papers (2023-09-18T04:50:54Z)
Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state MRI functional (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis. Many methods have been proposed to reduce fMRI heterogeneity between source and target domains. But acquiring source data is challenging due to concerns and/or data storage burdens in multi-site studies. We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z)
Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching. Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computational heterogeneous data. Proposed aggregation algorithms are extensively analyzed from a theoretical, and an experimental prospective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data. We propose a general framework to solve the above two challenges simultaneously. We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
Efficient Logistic Regression with Local Differential Privacy [0.0]
Internet of Things devices are expanding rapidly and generating huge amount of data. There is an increasing need to explore data collected from these devices. Collaborative learning provides a strategic solution for the Internet of Things settings but also raises public concern over data privacy.
arXiv Detail & Related papers (2022-02-05T22:44:03Z)
Two-step penalised logistic regression for multi-omic data with an application to cardiometabolic syndrome [62.997667081978825]
We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately. Our approach should be preferred if the goal is to select as many relevant predictors as possible. Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level.
arXiv Detail & Related papers (2020-08-01T10:36:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.