Pre-trained Encoders in Self-Supervised Learning Improve Secure and
Privacy-preserving Supervised Learning
- URL: http://arxiv.org/abs/2212.03334v1
- Date: Tue, 6 Dec 2022 21:35:35 GMT
- Title: Pre-trained Encoders in Self-Supervised Learning Improve Secure and
Privacy-preserving Supervised Learning
- Authors: Hongbin Liu, Wenjie Qu, Jinyuan Jia, Neil Zhenqiang Gong
- Abstract summary: Self-supervised learning is an emerging technique to pre-train encoders using unlabeled data.
We perform the first systematic, principled measurement study to understand whether and when a pre-trained encoder can address the limitations of secure or privacy-preserving supervised learning algorithms.
- Score: 63.45532264721498
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Classifiers in supervised learning have various security and privacy issues,
e.g., 1) data poisoning attacks, backdoor attacks, and adversarial examples on
the security side as well as 2) inference attacks and the right to be forgotten
for the training data on the privacy side. Various secure and
privacy-preserving supervised learning algorithms with formal guarantees have
been proposed to address these issues. However, they suffer from various
limitations such as accuracy loss, small certified security guarantees, and/or
inefficiency. Self-supervised learning is an emerging technique to pre-train
encoders using unlabeled data. Given a pre-trained encoder as a feature
extractor, supervised learning can train a simple yet accurate classifier using
a small amount of labeled training data. In this work, we perform the first
systematic, principled measurement study to understand whether and when a
pre-trained encoder can address the limitations of secure or privacy-preserving
supervised learning algorithms. Our key findings are that a pre-trained encoder
substantially improves 1) both accuracy under no attacks and certified security
guarantees against data poisoning and backdoor attacks of state-of-the-art
secure learning algorithms (i.e., bagging and KNN), 2) certified security
guarantees of randomized smoothing against adversarial examples without
sacrificing its accuracy under no attacks, 3) accuracy of differentially
private classifiers, and 4) accuracy and/or efficiency of exact machine
unlearning.
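The pipeline the abstract describes, a frozen pre-trained encoder used as a feature extractor with a simple classifier trained on a small labeled set, can be illustrated with a short sketch. This is a minimal illustration and not the authors' code: the `encoder` object, the data arrays, and the choice of a bagging ensemble of k-NN base classifiers (to echo the bagging/KNN certified defenses the abstract mentions) are all assumptions made for the example.

```python
import numpy as np
import torch
from sklearn.neighbors import KNeighborsClassifier

# Assumptions: `encoder` is any frozen, self-supervised pre-trained feature extractor
# (e.g. a SimCLR/MoCo-style backbone); x_train/x_test are image tensors and
# y_train is a numpy array of integer labels. None of this is the paper's exact setup.

def extract_features(encoder, x, batch_size=256):
    # Run the frozen encoder once to turn raw inputs into feature vectors.
    encoder.eval()
    feats = []
    with torch.no_grad():
        for i in range(0, len(x), batch_size):
            feats.append(encoder(x[i:i + batch_size]).cpu().numpy())
    return np.concatenate(feats)

def bagging_knn_predict(train_feats, y_train, test_feats, n_models=20, subsample=100, seed=0):
    # Bagging-style defense flavor: each base k-NN sees only a small random subsample
    # of the labeled data, and the final label is a majority vote over the ensemble.
    rng = np.random.default_rng(seed)
    votes = np.zeros((len(test_feats), int(y_train.max()) + 1), dtype=int)
    for _ in range(n_models):
        idx = rng.choice(len(train_feats), size=subsample, replace=True)
        base = KNeighborsClassifier(n_neighbors=5).fit(train_feats[idx], y_train[idx])
        votes[np.arange(len(test_feats)), base.predict(test_feats)] += 1
    return votes.argmax(axis=1)
```

Because the encoder is frozen, only the cheap downstream classifier touches the labeled data, which is what lets the secure or private learning algorithm operate on compact features rather than raw inputs.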
Related papers
- HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption [4.164336621664897]
HETAL is an efficient Homomorphic Encryption based Transfer Learning algorithm.
We propose an encrypted matrix multiplication algorithm, which is 1.8 to 323 times faster than prior methods.
Experiments show total training times of 567-3442 seconds, which is less than an hour.
arXiv Detail & Related papers (2024-03-21T03:47:26Z)
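HETAL's own encrypted matrix multiplication is not reproduced here, but the general pattern of homomorphic-encryption-based transfer learning, encrypting client-side features and evaluating a plaintext linear layer on the ciphertexts, can be sketched with the TenSEAL library. The CKKS parameters, feature vector, and tiny weight matrix below are arbitrary illustrative choices, not HETAL's algorithm.

```python
import tenseal as ts

# Illustrative CKKS setup; parameters are placeholders, not HETAL's.
context = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40
context.generate_galois_keys()

features = [0.12, -0.53, 0.88, 0.04]                           # e.g. encoder features on the client
weights = [[0.5, -0.1], [0.2, 0.3], [-0.4, 0.7], [0.9, 0.1]]   # plaintext 4x2 classifier weights

enc_features = ts.ckks_vector(context, features)               # encrypt the client's features

# Encrypted vector x plaintext matrix, done as one encrypted dot product per output column.
enc_logits = [enc_features.dot([row[j] for row in weights]) for j in range(2)]
print([v.decrypt()[0] for v in enc_logits])                    # only the client can decrypt the logits
```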
- Roulette: A Semantic Privacy-Preserving Device-Edge Collaborative Inference Framework for Deep Learning Classification Tasks [21.05961694765183]
Roulette is a task-oriented semantic privacy-preserving collaborative inference framework for deep learning classifiers.
We develop a novel paradigm of split learning where the back-end is frozen and the front-end is retrained to be both a feature extractor and an encryptor.
arXiv Detail & Related papers (2023-09-06T08:08:12Z)
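The split-learning paradigm in the Roulette summary, a frozen server-side back-end with a retrainable device-side front-end, can be sketched as follows. The network shapes, optimizer, and loss are placeholders for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Hypothetical split of a classifier; all layer shapes are illustrative only.
front_end = nn.Sequential(                                   # runs on the device and is retrained
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(4), nn.Flatten())
back_end = nn.Sequential(nn.Linear(32 * 4 * 4, 10))          # runs on the edge server and stays frozen

for p in back_end.parameters():
    p.requires_grad = False                                  # freeze the back-end

optimizer = torch.optim.Adam(front_end.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_step(x, y):
    # Only the front-end is updated: it learns to emit features the frozen back-end
    # can still classify, which is what lets it act as both extractor and obfuscator.
    logits = back_end(front_end(x))
    loss = loss_fn(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```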
- Tight Auditing of Differentially Private Machine Learning [77.38590306275877]
For private machine learning, existing auditing mechanisms give tight privacy estimates only under implausible worst-case assumptions.
We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets.
arXiv Detail & Related papers (2023-02-15T21:40:33Z)
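A core idea behind auditing differential privacy is converting a distinguishing attack's success rates into an empirical lower bound on epsilon: any (eps, delta)-DP mechanism must satisfy TPR <= e^eps * FPR + delta. The helper below is a generic illustration of that bound (in practice one would also put confidence intervals around the measured rates); it is not the improved auditing scheme proposed in the paper.

```python
import math

def empirical_epsilon_lower_bound(tpr: float, fpr: float, delta: float = 0.0) -> float:
    # If a membership/distinguishing attack achieves these rates, the audited mechanism
    # cannot satisfy (eps, delta)-DP for any eps smaller than this value.
    return math.log((tpr - delta) / fpr)

# Example: an attack that flags 95% of inserted canaries at a 5% false positive rate.
print(empirical_epsilon_lower_bound(tpr=0.95, fpr=0.05))   # ~2.94
```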
- Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, whose goal is to delete information about a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z)
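The second idea in the unlearning summary above, using a weight-importance metric to locate parameters that carry information about the instances being deleted, can be sketched generically. The gradient-squared importance score and the hard zeroing below are simplifications for illustration, not the authors' method.

```python
import torch

def importance_scores(model, loss_fn, forget_loader):
    # Accumulate squared gradients over the forget set as a crude per-parameter importance proxy.
    scores = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in forget_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                scores[n] += p.grad.detach() ** 2
    return scores

def zero_most_important(model, scores, fraction=0.01):
    # Zero the small fraction of weights that most strongly encode the forget set.
    flat = torch.cat([s.flatten() for s in scores.values()])
    threshold = torch.quantile(flat, 1.0 - fraction)
    with torch.no_grad():
        for n, p in model.named_parameters():
            p.masked_fill_(scores[n] >= threshold, 0.0)
```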
- Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets [53.866927712193416]
We show that an adversary who can poison a training dataset can cause models trained on this dataset to leak private details belonging to other parties.
Our attacks are effective across membership inference, attribute inference, and data extraction.
Our results cast doubts on the relevance of cryptographic privacy guarantees in multiparty protocols for machine learning.
arXiv Detail & Related papers (2022-03-31T18:06:28Z)
- Robust Unlearnable Examples: Protecting Data Against Adversarial Learning [77.6015932710068]
Prior work makes data unlearnable for deep learning models by adding a type of error-minimizing noise, but that protection does not survive adversarial training.
In this paper, we design new methods to generate robust unlearnable examples that remain protected under adversarial training.
Experiments show that the unlearnability brought by robust error-minimizing noise can effectively protect data from adversarial training in various scenarios.
arXiv Detail & Related papers (2022-03-28T07:13:51Z)
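Error-minimizing noise, the building block referenced in the summary above, is the opposite of an adversarial perturbation: the noise is optimized to drive the training loss toward zero so the model finds nothing left to learn from the example. A minimal PGD-style sketch of sample-wise error-minimizing noise follows; the robust variant defended against adversarial training additionally solves a min-max problem, which is omitted here, and the step sizes and bound are illustrative.

```python
import torch

def error_minimizing_noise(model, loss_fn, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    # Unlike an adversarial attack, we *descend* the loss with respect to the
    # perturbation, making (x + delta, y) trivially easy and therefore "unlearnable".
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta - alpha * grad.sign()).clamp(-eps, eps)   # project to the L-inf ball
        delta = delta.detach().requires_grad_(True)
    return delta.detach()
```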
- One Parameter Defense -- Defending against Data Inference Attacks via Differential Privacy [26.000487178636927]
Machine learning models are vulnerable to data inference attacks, such as membership inference and model inversion attacks.
Most existing defense methods only protect against membership inference attacks.
We propose a differentially private defense method that handles both types of attacks in a time-efficient manner.
arXiv Detail & Related papers (2022-03-13T06:06:24Z)
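The defense summarized above operates on the confidence vector a model releases. As a stand-in to show the shape of such a defense, the sketch below perturbs the released scores with Laplace noise controlled by a single parameter epsilon and re-normalizes; a real mechanism must calibrate the noise to the sensitivity of the released scores to obtain a formal differential privacy guarantee, and this is not the paper's mechanism.

```python
import numpy as np

def perturb_confidences(probs, epsilon, rng=None):
    # Simplified illustration: smaller epsilon means noisier released scores.
    # The DP sensitivity analysis needed for a formal guarantee is omitted here.
    rng = rng or np.random.default_rng()
    noisy = np.log(np.clip(probs, 1e-12, 1.0)) + rng.laplace(scale=1.0 / epsilon, size=len(probs))
    noisy = np.exp(noisy - noisy.max())
    return noisy / noisy.sum()

print(perturb_confidences(np.array([0.7, 0.2, 0.1]), epsilon=1.0))
```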
- On Deep Learning with Label Differential Privacy [54.45348348861426]
We study the multi-class classification setting where the labels are considered sensitive and ought to be protected.
We propose a new algorithm for training deep neural networks with label differential privacy, and run evaluations on several datasets.
arXiv Detail & Related papers (2021-02-11T15:09:06Z)
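Label differential privacy protects only the labels, and its classical baseline is randomized response: keep the true label with probability e^eps / (e^eps + K - 1), otherwise report a uniformly random other label. The paper above builds a more sophisticated training algorithm on top of such randomization; the sketch below shows only the baseline randomization step.

```python
import math
import random

def randomized_response_label(label: int, num_classes: int, epsilon: float) -> int:
    # Classic K-ary randomized response; the released label satisfies epsilon-label-DP.
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + num_classes - 1)
    if random.random() < p_keep:
        return label
    return random.choice([c for c in range(num_classes) if c != label])

# Example: privatize a few CIFAR-10-style labels with epsilon = 1.
print([randomized_response_label(y, num_classes=10, epsilon=1.0) for y in [3, 7, 7, 0]])
```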
This list is automatically generated from the titles and abstracts of the papers in this site.