Conformal-DP: Data Density Aware Privacy on Riemannian Manifolds via Conformal Transformation
- URL: http://arxiv.org/abs/2504.20941v2
- Date: Fri, 06 Jun 2025 16:55:17 GMT
- Title: Conformal-DP: Data Density Aware Privacy on Riemannian Manifolds via Conformal Transformation
- Authors: Peilin He, Liou Tang, M. Amin Rahimian, James Joshi
- Abstract summary: Differential Privacy (DP) enables privacy-preserving data analysis by adding calibrated noise. We propose Conformal-DP, which utilizes conformal transformations. We show through experiments on synthetic and real-world datasets that our mechanism achieves superior privacy-utility trade-offs.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differential Privacy (DP) enables privacy-preserving data analysis by adding calibrated noise. While recent works extend DP to curved manifolds (e.g., diffusion-tensor MRI, social networks) by adding geodesic noise, these approaches assume a uniform data distribution. This assumption is not always practical, so they may introduce biased noise and suboptimal privacy-utility trade-offs for non-uniform data. To address this issue, we propose Conformal-DP, which utilizes conformal transformations on Riemannian manifolds. This approach locally equalizes sample density and redefines geodesic distances while preserving intrinsic manifold geometry. Our theoretical analysis demonstrates that the conformal factor, which is derived from local kernel density estimates, is data density-aware. We show that under these conformal metrics, Conformal-DP satisfies $\varepsilon$-differential privacy on any complete Riemannian manifold and offers a closed-form expected geodesic error bound that depends only on the maximal density ratio, not on global curvature. We show through experiments on synthetic and real-world datasets that our mechanism achieves superior privacy-utility trade-offs, particularly for heterogeneous manifold data, and is also beneficial for homogeneous datasets.
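As a rough illustration of the idea in the abstract, the sketch below builds a density-aware conformal factor on the unit sphere from a kernel density estimate and uses it to rescale the length of a great-circle arc. It is not the authors' construction: the Gaussian kernel, the bandwidth, the specific choice phi = (density / mean density)^(1/dim), and all function names are assumptions made here for illustration. That choice is motivated only by the standard fact that a conformal metric phi^2 g rescales the volume element by phi^dim. The rescaled arc length is an upper bound on the conformal geodesic distance, since the shortest path under the new metric may differ from the original great circle.

```python
import math

def geodesic_distance(a, b):
    """Great-circle distance between two unit vectors on the sphere."""
    dot = sum(x * y for x, y in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))

def kde(points, query, bandwidth=0.3):
    """Unnormalized Gaussian-kernel density estimate using geodesic distance.
    Kernel and bandwidth are illustrative assumptions."""
    weights = (math.exp(-0.5 * (geodesic_distance(p, query) / bandwidth) ** 2)
               for p in points)
    return sum(weights) / len(points)

def make_conformal_factor(points, bandwidth=0.3, dim=2):
    """Hypothetical conformal factor phi(q) = (rho(q) / mean rho)^(1/dim).
    Under the metric phi^2 * g the volume element scales by phi^dim, so this
    choice roughly equalizes the estimated density."""
    ref = sum(kde(points, p, bandwidth) for p in points) / len(points)
    def phi(q):
        return (kde(points, q, bandwidth) / ref) ** (1.0 / dim)
    return phi

def conformal_length(a, b, phi, n_steps=200):
    """Length of the great-circle arc from a to b under the metric phi^2 * g,
    approximated with a midpoint rule. Assumes a and b are not antipodal."""
    theta = geodesic_distance(a, b)
    if theta < 1e-12:
        return 0.0
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) / n_steps
        # Spherical linear interpolation along the arc from a to b.
        wa = math.sin((1 - t) * theta) / math.sin(theta)
        wb = math.sin(t * theta) / math.sin(theta)
        point = tuple(wa * x + wb * y for x, y in zip(a, b))
        total += phi(point)
    return total * theta / n_steps

# Toy data: three samples clustered near the north pole, one on the equator.
s = math.sqrt(0.99)
samples = [(0.0, 0.0, 1.0), (0.1, 0.0, s), (0.0, 0.1, s), (1.0, 0.0, 0.0)]
phi = make_conformal_factor(samples)
```

With this toy data, `phi` is larger near the cluster at the north pole than at the south pole, so arcs through the dense region are stretched relative to arcs through sparse regions. A constant factor recovers the ordinary geodesic distance up to scale.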
Related papers
- What's Inside Your Diffusion Model? A Score-Based Riemannian Metric to Explore the Data Manifold [0.0]
We introduce a score-based Riemannian metric to characterize the intrinsic geometry of a data manifold.
Our approach creates a geometry where geodesics naturally follow the manifold's contours.
We show that our score-based geodesics capture meaningful perpendicular transformations that respect the underlying data distribution.
arXiv Detail & Related papers (2025-05-16T11:19:57Z)
- Learning over von Mises-Fisher Distributions via a Wasserstein-like Geometry [0.0]
We introduce a geometry-aware distance metric for the family of von Mises-Fisher (vMF) distributions.
Motivated by the theory of optimal transport, we propose a Wasserstein-like distance that decomposes the discrepancy between two vMF distributions into two interpretable components.
arXiv Detail & Related papers (2025-04-19T03:38:15Z)
- Finsler Multi-Dimensional Scaling: Manifold Learning for Asymmetric Dimensionality Reduction and Embedding [41.601022263772535]
Dimensionality reduction aims to simplify complex data by reducing its feature dimensionality while preserving essential patterns, with core applications in data analysis and visualisation.
To preserve the underlying data structure, multi-dimensional scaling (MDS) methods focus on preserving pairwise dissimilarities, such as distances.
arXiv Detail & Related papers (2025-03-23T10:03:22Z)
- The Cost of Shuffling in Private Gradient Based Optimization [40.31928071333575]
We show that data shuffling results in worse empirical excess risk for DP-ShuffleG compared to DP-SGD.
We propose Interleaved-ShuffleG, a hybrid approach that integrates public data samples in private optimization.
arXiv Detail & Related papers (2025-02-05T22:30:00Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Sampling and estimation on manifolds using the Langevin diffusion [45.57801520690309]
Two estimators of linear functionals of $\mu_\phi$ based on the discretized Markov process are considered.
Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion.
arXiv Detail & Related papers (2023-12-22T18:01:11Z)
- Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high dimensional tasks on nontrivial manifolds.
We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z)
- On the Inherent Privacy Properties of Discrete Denoising Diffusion Models [17.773335593043004]
We present the pioneering theoretical exploration of the privacy preservation inherent in discrete diffusion models.
Our framework elucidates the potential privacy leakage for each data point in a given training dataset.
Our bounds also show that training with $s$-sized data points leads to a surge in privacy leakage.
arXiv Detail & Related papers (2023-10-24T05:07:31Z)
- Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios when the response variable, denoted by $Y$, resides in a manifold, and the covariable, denoted by $X$, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z)
- Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z)
- A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction [66.21060114843202]
We propose a more general heat kernel based manifold embedding method that we call heat geodesic embeddings.
Results show that our method outperforms existing state of the art in preserving ground truth manifold distances.
We also showcase our method on single cell RNA-sequencing datasets with both continuum and cluster structure.
arXiv Detail & Related papers (2023-05-30T13:58:50Z)
- General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean Estimation [58.03500081540042]
A classical approach to private mean estimation is to compute the true mean and add unbiased, but possibly correlated, Gaussian noise to it.
We show that for every input dataset, an unbiased mean estimator satisfying concentrated differential privacy introduces approximately at least as much error.
arXiv Detail & Related papers (2023-01-31T18:47:42Z)
- Shape And Structure Preserving Differential Privacy [70.08490462870144]
We show how the gradient of the squared distance function offers better control over sensitivity than the Laplace mechanism.
arXiv Detail & Related papers (2022-09-21T18:14:38Z)
- Combating Mode Collapse in GANs via Manifold Entropy Estimation [70.06639443446545]
Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications.
We propose a novel training pipeline to address the mode collapse issue of GANs.
arXiv Detail & Related papers (2022-08-25T12:33:31Z)
- ManiFlow: Implicitly Representing Manifolds with Normalizing Flows [145.9820993054072]
Normalizing Flows (NFs) are flexible explicit generative models that have been shown to accurately model complex real-world data distributions.
We propose an optimization objective that recovers the most likely point on the manifold given a sample from the perturbed distribution.
Finally, we focus on 3D point clouds for which we utilize the explicit nature of NFs, i.e. surface normals extracted from the gradient of the log-likelihood and the log-likelihood itself.
arXiv Detail & Related papers (2022-08-18T16:07:59Z)
- Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutive pressure acts on a low-dimensional manifold despite the high-dimensionality of sequences' space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z)
- Cycle Consistent Probability Divergences Across Different Spaces [38.43511529063335]
Discrepancy measures between probability distributions are at the core of statistical inference and machine learning.
This work proposes a novel unbalanced Monge optimal transport formulation for matching, up to isometries, distributions on different spaces.
arXiv Detail & Related papers (2021-11-22T16:35:58Z)
- Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD [73.55632827932101]
We optimize the information-theoretical generalization bound by manipulating the noise structure in SGLD.
We prove that with constraint to guarantee low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance.
arXiv Detail & Related papers (2021-10-26T15:02:27Z)
- Statistical and Topological Properties of Gaussian Smoothed Sliced Probability Divergences [9.080472817672259]
We show that smoothing and slicing preserve the metric property and the weak topology.
We also provide results on the sample complexity of such divergences.
arXiv Detail & Related papers (2021-10-20T12:21:32Z)
- Uniform Interpolation Constrained Geodesic Learning on Data Manifold [28.509561636926414]
Along the learned geodesic, our method can generate high-quality interpolations between two given data samples.
We provide a theoretical analysis of our model and use image translation as an example to demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2020-02-12T07:47:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.