Privacy-Preserving Public Release of Datasets for Support Vector Machine
Classification
- URL: http://arxiv.org/abs/1912.12576v1
- Date: Sun, 29 Dec 2019 03:32:38 GMT
- Title: Privacy-Preserving Public Release of Datasets for Support Vector Machine
Classification
- Authors: Farhad Farokhi
- Abstract summary: We consider the problem of publicly releasing a dataset for support vector machine classification while not infringing on the privacy of data subjects.
The dataset is systematically obfuscated using additive noise for privacy protection.
Conditions are established for ensuring that the classifier extracted from the original dataset and the obfuscated one are close to each other.
- Score: 14.095523601311374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of publicly releasing a dataset for support vector
machine classification while not infringing on the privacy of data subjects
(i.e., individuals whose private information is stored in the dataset). The
dataset is systematically obfuscated using additive noise for privacy
protection. Motivated by the Cramér-Rao bound, the inverse of the trace of the
Fisher information matrix is used as a measure of privacy. Conditions are
established for ensuring that the classifier extracted from the original
dataset and the obfuscated one are close to each other (capturing the utility).
The optimal noise distribution is determined by maximizing a weighted sum of
the measures of privacy and utility. The optimal privacy-preserving noise is
proved to achieve local differential privacy. The results are generalized to a
broader class of optimization-based supervised machine learning algorithms.
Applicability of the methodology is demonstrated on multiple datasets.
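The pipeline lends itself to a compact sketch. The example below assumes isotropic Gaussian noise, a synthetic dataset, and scikit-learn's LinearSVC, none of which are the paper's actual experimental setup; it obfuscates the features, trains an SVM on both versions of the data, and reports the Fisher-information-based privacy measure alongside the drift between the two classifiers as a utility proxy.

```python
# Sketch: obfuscate a dataset with additive Gaussian noise, then check how
# far the SVM trained on the released data drifts from the SVM trained on
# the original data. Dataset, noise, and solver are illustrative choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

sigma = 0.3                                  # noise scale (the privacy knob)
X_released = X + rng.normal(0.0, sigma, size=X.shape)

# For noise N(0, sigma^2 I), the Fisher information about a record is
# (1/sigma^2) I, so the paper's privacy measure (the inverse of the trace
# of the Fisher information matrix) is sigma^2 / d: more noise, more privacy.
privacy = sigma**2 / X.shape[1]

svm_orig = LinearSVC(dual=False).fit(X, y)
svm_rel = LinearSVC(dual=False).fit(X_released, y)

# Utility proxy: distance between the two classifiers' parameters.
drift = np.linalg.norm(svm_orig.coef_ - svm_rel.coef_)
print(f"privacy measure: {privacy:.4f}, classifier drift: {drift:.4f}")
```

The paper goes further by optimizing the noise distribution itself against a weighted sum of these two quantities; the Gaussian here is just one admissible choice.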
Related papers
- Differentially Private Random Feature Model [52.468511541184895]
We produce a differentially private random feature model for privacy-preserving kernel machines.
We show that our method preserves privacy and derive a generalization error bound for the method.
arXiv Detail & Related papers (2024-12-06T05:31:08Z)
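As a rough illustration of the random-feature approach, the sketch below maps inputs through random Fourier features (approximating an RBF kernel), fits ridge regression in the feature space, and adds privacy by perturbing the learned weights. The output-perturbation step and its noise scale are assumptions for illustration, not the cited paper's calibrated mechanism.

```python
# Random Fourier features approximating an RBF kernel, followed by ridge
# regression and (illustrative) output perturbation for privacy.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = np.sin(X.sum(axis=1))                    # toy regression target

D, gamma = 100, 0.5                          # feature count, kernel width
W = rng.normal(scale=np.sqrt(2 * gamma), size=(X.shape[1], D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)
Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)     # random feature map

lam = 1.0                                    # ridge regularizer
w = np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ y)

# Placeholder output perturbation; a real DP guarantee would calibrate
# this scale to the solution's sensitivity and the privacy budget.
w_private = w + rng.normal(0.0, 0.05, size=D)
```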
- RASE: Efficient Privacy-preserving Data Aggregation against Disclosure Attacks for IoTs [2.1765174838950494]
We study a new paradigm for collecting and protecting the data produced by an ever-growing number of sensor devices.
Most previous studies on co-design of data aggregation and privacy preservation assume that a trusted fusion center adheres to privacy regimes.
We propose a novel paradigm (called RASE), which can be generalized into a 3-step sequential procedure: noise addition, followed by random permutation, and then parameter estimation (sketched in the toy example below).
arXiv Detail & Related papers (2024-05-31T15:21:38Z)
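A toy rendition of RASE's 3-step sequence, with hypothetical sensor readings, Laplace noise, and a simple mean estimate standing in for the paper's actual choices:

```python
# Toy 3-step pipeline: local noise addition, random permutation to unlink
# values from their senders, then parameter estimation at the aggregator.
import numpy as np

rng = np.random.default_rng(2)
readings = rng.normal(loc=20.0, scale=2.0, size=1000)  # hypothetical sensors

noisy = readings + rng.laplace(scale=1.0, size=readings.shape)  # step 1
shuffled = rng.permutation(noisy)                               # step 2
estimate = shuffled.mean()                                      # step 3
print(f"true mean 20.0, estimate {estimate:.2f}")
```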
- Privacy Amplification for the Gaussian Mechanism via Bounded Support [64.86780616066575]
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset.
We propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting.
arXiv Detail & Related papers (2024-03-07T21:22:07Z)
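The flavor of a bounded-support Gaussian mechanism can be sketched as follows; the truncation interval, noise scale, and rejection-sampling construction are illustrative stand-ins, not the paper's exact modification or its data-dependent accounting.

```python
# Release a bounded query answer with Gaussian noise truncated to a fixed
# interval by rejection sampling, so the output support is bounded.
import numpy as np

def bounded_gaussian(value, sigma, lo, hi, rng):
    """Resample Gaussian noise until the release lands inside [lo, hi]."""
    while True:
        out = value + rng.normal(0.0, sigma)
        if lo <= out <= hi:
            return out

rng = np.random.default_rng(3)
data = rng.uniform(0.0, 1.0, size=100)
release = bounded_gaussian(data.mean(), sigma=0.1, lo=0.0, hi=1.0, rng=rng)
print(f"released mean: {release:.3f}")
```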
- Privacy-Preserving Matrix Factorization for Recommendation Systems using Gaussian Mechanism [2.84279467589473]
We propose a privacy-preserving recommendation system based on the differential privacy framework and matrix factorization.
Because differential privacy is a powerful and robust mathematical framework for designing privacy-preserving machine learning algorithms, adversaries can be prevented from extracting sensitive user information.
arXiv Detail & Related papers (2023-04-11T13:50:39Z)
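A minimal sketch of the ingredient this builds on, matrix factorization trained with clipped, Gaussian-perturbed gradients; the clipping threshold and noise scale are placeholders rather than a calibrated privacy budget.

```python
# Matrix factorization by gradient descent with clipped, Gaussian-noised
# gradients; clip and sigma are placeholders, not a calibrated budget.
import numpy as np

rng = np.random.default_rng(4)
R = rng.integers(1, 6, size=(50, 40)).astype(float)   # toy rating matrix
k, lr, clip, sigma = 8, 0.01, 1.0, 0.1
U = rng.normal(scale=0.1, size=(R.shape[0], k))
V = rng.normal(scale=0.1, size=(R.shape[1], k))

for _ in range(200):
    err = R - U @ V.T
    gU, gV = -err @ V, -err.T @ U                     # squared-error gradients
    for g in (gU, gV):
        g *= min(1.0, clip / (np.linalg.norm(g) + 1e-12))  # clip in place
        g += rng.normal(0.0, sigma * clip, size=g.shape)   # Gaussian noise
    U -= lr * gU
    V -= lr * gV
print(f"final RMSE: {np.sqrt(((R - U @ V.T) ** 2).mean()):.3f}")
```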
- Privacy-preserving Non-negative Matrix Factorization with Outliers [2.84279467589473]
We focus on developing a non-negative matrix factorization algorithm within a privacy-preserving framework.
We propose a novel privacy-preserving algorithm for non-negative matrix factorization capable of operating on private data.
We demonstrate the proposed framework's performance on six real datasets.
arXiv Detail & Related papers (2022-11-02T19:42:18Z)
- Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; however, it faces challenges when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z)
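For contrast with the smooth variant, here is plain k-anonymity by suppression on tabular quasi-identifiers; the paper's smooth-k-anonymity for sparse graphs is a more involved relaxation that this sketch does not reproduce.

```python
# Plain k-anonymity by suppression: quasi-identifier combinations that
# occur fewer than k times are masked before release.
from collections import Counter

k = 3
rows = [("30-39", "NY"), ("30-39", "NY"), ("30-39", "NY"),
        ("40-49", "CA"), ("20-29", "TX")]   # (age range, state) per record

counts = Counter(rows)
released = [r if counts[r] >= k else ("*", "*") for r in rows]
print(released)   # rare combinations are suppressed
```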
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
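The intuition behind per-example accounting can be shown numerically: in DP-SGD the noise standard deviation is fixed at sigma times the clipping norm C, so an example whose gradients stay well below C is hidden behind proportionally more noise. The gradient norms below are made up for illustration.

```python
# Per-example view of DP-SGD: the added noise has stddev sigma * C, so an
# example contributing a small clipped gradient enjoys a larger effective
# noise multiplier, i.e., a stronger individual guarantee.
import numpy as np

C, sigma = 1.0, 2.0                        # clipping norm, noise multiplier
grad_norms = np.array([1.4, 0.9, 0.2])     # made-up per-example norms
clipped = np.minimum(grad_norms, C)

effective_sigma = sigma * C / clipped      # per-example noise-to-signal
print(effective_sigma)                     # smallest gradient -> ~10x sigma
```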
- Mixed Differential Privacy in Computer Vision [133.68363478737058]
AdaMix is an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data.
A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset.
arXiv Detail & Related papers (2022-03-22T06:15:43Z)
- Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
Local exchange of estimates allows inference of the private data on which those estimates are based.
Perturbations chosen independently at every agent result in a significant performance loss.
We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to remain invisible to the aggregate learning process.
arXiv Detail & Related papers (2020-10-23T10:35:35Z)
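The nullspace condition can be illustrated in a simplified, fully connected setting: draw perturbations constrained to sum to zero across agents, so each local estimate is masked while the network average is untouched. This sketches the general idea, not the paper's graph-dependent construction.

```python
# Perturbations drawn in the nullspace of the averaging operation: they
# sum to zero across agents, masking individual estimates while leaving
# the network-wide average exactly unchanged.
import numpy as np

rng = np.random.default_rng(5)
n_agents, dim = 5, 3
estimates = rng.normal(size=(n_agents, dim))   # agents' local estimates

noise = rng.normal(size=(n_agents, dim))
noise -= noise.mean(axis=0, keepdims=True)     # project onto the sum-zero space

masked = estimates + noise
assert np.allclose(masked.mean(axis=0), estimates.mean(axis=0))
```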
- Bounding, Concentrating, and Truncating: Unifying Privacy Loss Composition for Data Analytics [2.614355818010333]
We provide strong privacy loss bounds when an analyst may select pure DP, bounded-range (e.g., exponential mechanisms), or concentrated DP mechanisms in any order.
We also provide optimal privacy loss bounds that apply when an analyst can select pure DP and bounded-range mechanisms in a batch.
arXiv Detail & Related papers (2020-04-15T17:33:10Z)
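For context, the two classical baselines that such bounds improve on can be computed directly: basic composition adds the epsilons, while advanced composition trades a small delta for a square-root dependence on the number of mechanisms. The budget values below are arbitrary.

```python
# Basic vs. advanced composition for T adaptively chosen eps-DP mechanisms.
import math

T, eps, delta = 50, 0.1, 1e-6
basic = T * eps
advanced = math.sqrt(2 * T * math.log(1 / delta)) * eps \
    + T * eps * (math.exp(eps) - 1)
print(f"basic: {basic:.2f}, advanced: {advanced:.2f}")  # 5.00 vs ~4.24
```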
- Privacy-Preserving Boosting in the Local Setting [17.375582978294105]
In machine learning, boosting is one of the most popular methods, designed to combine multiple base learners into a superior one.
In the big data era, the data held by individuals and entities, such as personal images, browsing history, and census information, are more likely to contain sensitive information.
Local Differential Privacy has been proposed as an effective privacy-protection approach, offering a strong guarantee to data owners.
arXiv Detail & Related papers (2020-02-06T04:48:51Z)
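A minimal example of the local model that such schemes build on is classic randomized response: each data owner perturbs their own bit before it leaves their device, and the collector debiases the aggregate. This illustrates local differential privacy in general, not the boosting protocol itself.

```python
# Randomized response: keep the true bit with probability p = e^eps/(e^eps+1),
# flip it otherwise; the collector inverts the bias on the aggregate.
import numpy as np

rng = np.random.default_rng(6)
eps = 1.0
p = np.exp(eps) / (np.exp(eps) + 1)

bits = rng.integers(0, 2, size=10_000)       # hypothetical private bits
keep = rng.random(bits.shape) < p
reported = np.where(keep, bits, 1 - bits)

# E[reported] = (2p - 1) * mean + (1 - p), so invert for an unbiased estimate.
estimate = (reported.mean() - (1 - p)) / (2 * p - 1)
print(f"true {bits.mean():.3f}, estimated {estimate:.3f}")
```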