Related papers: Achieving differential privacy for $k$-nearest neighbors based outlier detection by data partitioning

Achieving differential privacy for $k$-nearest neighbors based outlier detection by data partitioning

URL: http://arxiv.org/abs/2104.07938v1
Date: Fri, 16 Apr 2021 07:35:26 GMT
Title: Achieving differential privacy for $k$-nearest neighbors based outlier detection by data partitioning
Authors: Jens Rauch, Iyiola E. Olatunji and Megha Khosla
Abstract summary: We develop a differentially private ($epsilon$-DP) approach for $k$-NN based outlier detection. We propose a method for $k$-NN based outlier detection by separating the procedure into a fitting step on reference inlier data and then apply the outlier classifier to new data. Our approach yields nearly optimal performance on real-world data with varying dimensions when compared to the non-private versions of $k$-NN.
Score: 0.3437656066916039
License: http://creativecommons.org/licenses/by/4.0/
Abstract: When applying outlier detection in settings where data is sensitive, mechanisms which guarantee the privacy of the underlying data are needed. The $k$-nearest neighbors ($k$-NN) algorithm is a simple and one of the most effective methods for outlier detection. So far, there have been no attempts made to develop a differentially private ($\epsilon$-DP) approach for $k$-NN based outlier detection. Existing approaches often relax the notion of $\epsilon$-DP and employ other methods than $k$-NN. We propose a method for $k$-NN based outlier detection by separating the procedure into a fitting step on reference inlier data and then apply the outlier classifier to new data. We achieve $\epsilon$-DP for both the fitting algorithm and the outlier classifier with respect to the reference data by partitioning the dataset into a uniform grid, which yields low global sensitivity. Our approach yields nearly optimal performance on real-world data with varying dimensions when compared to the non-private versions of $k$-NN.

Related papers

Benchmarking Fraud Detectors on Private Graph Data [70.4654745317714]
Currently, many types of fraud are managed in part by automated detection algorithms that operate over graphs.<n>We consider the scenario where a data holder wishes to outsource development of fraud detectors to third parties.<n>Third parties submit their fraud detectors to the data holder, who evaluates these algorithms on a private dataset and then publicly communicates the results.<n>We propose a realistic privacy attack on this system that allows an adversary to de-anonymize individuals' data based only on the evaluation results.
arXiv Detail & Related papers (2025-07-30T03:20:15Z)
Lightweight Protocols for Distributed Private Quantile Estimation [12.586899971090277]
We consider two emphadaptive algorithms for estimating one quantile when each user holds a single data point lying in a domain $[B]$. In the adaptive setting we present an $varepsilon$-LDP algorithm which can estimate any quantile within error $alpha$ only requiring $O(fraclog Bvarepsilon2alpha2)$ users.
arXiv Detail & Related papers (2025-02-05T08:39:02Z)
Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that explores the local curvature at a sample to adaptively defining the neighborhood size. Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
arXiv Detail & Related papers (2024-09-08T13:08:45Z)
Resampling methods for private statistical inference [1.8110941972682346]
We consider the task of constructing confidence intervals with differential privacy. We propose two private variants of the non-parametric bootstrap, which privately compute the median of the results of multiple "little" bootstraps run on partitions of the data. For a fixed differential privacy parameter $epsilon$, our methods enjoy the same error rates as that of the non-private bootstrap to within logarithmic factors in the sample size $n$.
arXiv Detail & Related papers (2024-02-11T08:59:02Z)
Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data. We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z)
Scalable Differentially Private Clustering via Hierarchically Separated Trees [82.69664595378869]
We show that our method computes a solution with cost at most $O(d3/2log n)cdot OPT + O(k d2 log2 n / epsilon2)$, where $epsilon$ is the privacy guarantee. Although the worst-case guarantee is worse than that of state of the art private clustering methods, the algorithm we propose is practical.
arXiv Detail & Related papers (2022-06-17T09:24:41Z)
DP-PCA: Statistically Optimal and Differentially Private PCA [44.22319983246645]
DP-PCA is a single-pass algorithm that overcomes both limitations. For sub-Gaussian data, we provide nearly optimal statistical error rates even for $n=tilde O(d)$.
arXiv Detail & Related papers (2022-05-27T02:02:17Z)
Differentially Private Federated Learning via Inexact ADMM with Multiple Local Updates [0.0]
We develop a DP inexact alternating direction method of multipliers algorithm with multiple local updates for federated learning. We show that our algorithm provides $barepsilon$-DP for every iteration, where $barepsilon$ is a privacy budget controlled by the user. We demonstrate that our algorithm reduces the testing error by at most $31%$ compared with the existing DP algorithm, while achieving the same level of data privacy.
arXiv Detail & Related papers (2022-02-18T19:58:47Z)
Differentially Private Federated Learning via Inexact ADMM [0.0]
Differential privacy (DP) techniques can be applied to the federated learning model to protect data privacy against inference attacks. We develop a DP inexact alternating direction method of multipliers algorithm that solves a sequence of trust-region subproblems. Our algorithm reduces the testing error by at most $22%$ compared with the existing DP algorithm, while achieving the same level of data privacy.
arXiv Detail & Related papers (2021-06-11T02:28:07Z)
Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints. Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution. We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy Amplification by Shuffling [49.43288037509783]
We show that random shuffling amplifies differential privacy guarantees of locally randomized data. Our result is based on a new approach that is simpler than previous work and extends to approximate differential privacy with nearly the same guarantees.
arXiv Detail & Related papers (2020-12-23T17:07:26Z)
Private Stochastic Non-Convex Optimization: Adaptive Algorithms and Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for bound non-dimensional optimization. We demonstrate two popular deep learning methods on the empirical advantages over standard gradient methods.
arXiv Detail & Related papers (2020-06-24T06:01:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.