Beyond One-Size-Fits-All: Neural Networks for Differentially Private Tabular Data Synthesis
- URL: http://arxiv.org/abs/2511.13893v1
- Date: Mon, 17 Nov 2025 20:33:34 GMT
- Title: Beyond One-Size-Fits-All: Neural Networks for Differentially Private Tabular Data Synthesis
- Authors: Kai Chen, Chen Gong, Tianhao Wang
- Abstract summary: We propose MargNet, incorporating successful algorithmic designs of statistical models into neural networks. MargNet applies an adaptive marginal selection strategy and trains the neural networks to generate data that conforms to the selected marginals. More importantly, on densely correlated datasets, MargNet establishes a new state-of-the-art, reducing fidelity error by up to 26% compared to the previous best.
- Score: 21.22625929242754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In differentially private (DP) tabular data synthesis, the consensus is that statistical models are better than neural network (NN)-based methods. However, we argue that this conclusion is incomplete and overlooks the challenge of densely correlated datasets, where intricate dependencies can overwhelm statistical models. In such complex scenarios, neural networks are more suitable due to their capacity to fit complex distributions by learning directly from samples. Despite this potential, existing NN-based algorithms still suffer from significant limitations. We therefore propose MargNet, incorporating successful algorithmic designs of statistical models into neural networks. MargNet applies an adaptive marginal selection strategy and trains the neural networks to generate data that conforms to the selected marginals. On sparsely correlated datasets, our approach achieves utility close to the best statistical method while offering an average 7× speedup over it. More importantly, on densely correlated datasets, MargNet establishes a new state-of-the-art, reducing fidelity error by up to 26% compared to the previous best. We release our code on GitHub (https://github.com/KaiChen9909/margnet).
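To make the abstract's central loop concrete (select marginals under DP, then train a generator to reproduce them), here is a minimal sketch. It is not the authors' implementation: `MarginalGenerator`, `noisy_marginal`, and `train_step` are illustrative names, the Laplace scale ignores proper DP sensitivity accounting, and MargNet's adaptive selection strategy is omitted.
```python
# Illustrative sketch only: marginal-matching generator training.
# The Laplace scale below ignores proper DP sensitivity/accounting.
import torch
import torch.nn as nn

class MarginalGenerator(nn.Module):
    """Maps Gaussian noise to per-attribute category probabilities."""
    def __init__(self, noise_dim, domain_sizes, hidden=128):
        super().__init__()
        self.domain_sizes = domain_sizes
        self.net = nn.Sequential(
            nn.Linear(noise_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, sum(domain_sizes)),
        )

    def forward(self, z):
        logits = self.net(z).split(self.domain_sizes, dim=1)
        return [torch.softmax(l, dim=1) for l in logits]  # soft one-hots

def noisy_marginal(data, i, j, sizes, eps):
    """Two-way marginal of the real data plus Laplace noise."""
    counts = torch.zeros(sizes[i], sizes[j])
    for row in data:
        counts[row[i], row[j]] += 1
    counts += torch.distributions.Laplace(0.0, 1.0 / eps).sample(counts.shape)
    counts = counts.clamp(min=0)
    return counts / counts.sum().clamp(min=1e-8)

def train_step(gen, opt, targets, noise_dim=32, batch=512):
    """Match the generated batch's soft marginals to the noisy targets."""
    probs = gen(torch.randn(batch, noise_dim))
    loss = 0.0
    for (i, j), target in targets.items():
        model_marg = probs[i].T @ probs[j] / batch  # avg outer product
        loss = loss + (model_marg - target).abs().sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```
Here `targets` maps attribute pairs to their noisy marginals; MargNet's adaptive strategy would decide which pairs enter that dictionary, which this sketch leaves to the caller.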
Related papers
- Neural Parameter Estimation with Incomplete Data [0.0]
It is not straightforward to use neural networks with data that are incomplete for various reasons. A recently proposed approach remedies this by feeding the network an appropriately padded data vector together with a vector that encodes the missingness pattern. Here, we propose an alternative approach based on the Monte Carlo expectation-maximization (EM) algorithm.
arXiv Detail & Related papers (2025-01-08T08:05:17Z)
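As a minimal illustration of the Monte Carlo EM pattern this summary describes, the sketch below imputes missing entries by sampling (E-step) and refits on the completed data (M-step). A multivariate Gaussian stands in for the paper's neural estimator, and all function names are mine.
```python
# Illustrative MC-EM: a Gaussian model stands in for a neural estimator.
import numpy as np

def sample_missing(x, miss, mu, cov, rng):
    """Sample missing coords of x given the observed ones under N(mu, cov)."""
    obs = ~miss
    c_mo = cov[np.ix_(miss, obs)]
    c_oo_inv = np.linalg.inv(cov[np.ix_(obs, obs)])
    cond_mu = mu[miss] + c_mo @ c_oo_inv @ (x[obs] - mu[obs])
    cond_cov = cov[np.ix_(miss, miss)] - c_mo @ c_oo_inv @ c_mo.T
    return rng.multivariate_normal(cond_mu, cond_cov)

def mc_em(X, mask, n_iter=50, n_samples=20, seed=0):
    """X: (n, d) array; mask: boolean array, True where entries are missing."""
    rng = np.random.default_rng(seed)
    mu, cov = np.zeros(X.shape[1]), np.eye(X.shape[1])
    for _ in range(n_iter):
        completed = []
        for x, miss in zip(X, mask):
            draws = np.tile(x, (n_samples, 1))
            for s in range(n_samples):          # E-step: Monte Carlo imputation
                if miss.any():
                    draws[s, miss] = sample_missing(x, miss, mu, cov, rng)
            completed.append(draws)
        Z = np.concatenate(completed)           # M-step: refit on completed data
        mu, cov = Z.mean(axis=0), np.cov(Z, rowvar=False)
    return mu, cov
```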
- LinSATNet: The Positive Linear Satisfiability Neural Networks [116.65291739666303]
This paper studies how to introduce the popular positive linear satisfiability to neural networks.
We propose the first differentiable satisfiability layer based on an extension of the classic Sinkhorn algorithm for jointly encoding multiple sets of marginal distributions.
arXiv Detail & Related papers (2024-07-18T22:05:21Z)
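The Sinkhorn algorithm this summary extends is simple to state: alternately rescale rows and columns of a positive matrix until it satisfies prescribed marginals. Because each step is differentiable, it can serve as a network layer; the sketch below shows the classic single-pair case, not LinSATNet's multi-set extension.
```python
# Classic log-domain Sinkhorn projection onto prescribed marginals.
import torch

def sinkhorn(scores, row_marg, col_marg, n_iters=20, tau=0.1):
    """Return a matrix close to exp(scores/tau) with the given marginals."""
    log_p = scores / tau
    log_r, log_c = row_marg.log(), col_marg.log()
    for _ in range(n_iters):
        # alternately rescale rows then columns to match the targets
        log_p = log_p - torch.logsumexp(log_p, dim=1, keepdim=True) + log_r[:, None]
        log_p = log_p - torch.logsumexp(log_p, dim=0, keepdim=True) + log_c[None, :]
    return log_p.exp()

scores = torch.randn(4, 5, requires_grad=True)
p = sinkhorn(scores, torch.full((4,), 0.25), torch.full((5,), 0.2))
p.sum().backward()                      # gradients flow through the projection
print(p.sum(dim=1), p.sum(dim=0))       # ≈ the requested row/column marginals
```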
- Marginal Debiased Network for Fair Visual Recognition [59.05212866862219]
We propose a novel marginal debiased network (MDN) to learn debiased representations.
Our MDN can achieve remarkable performance on under-represented samples.
arXiv Detail & Related papers (2024-01-04T08:57:09Z)
- REST: Enhancing Group Robustness in DNNs through Reweighted Sparse Training [49.581884130880944]
Deep neural networks (DNNs) have proven effective in various domains.
However, they often struggle to perform well on certain minority groups during inference.
arXiv Detail & Related papers (2023-12-05T16:27:54Z)
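The summary above names the mechanism only in the title, so the following is one plausible reading of the reweighting half: maintain a running loss per group and upweight samples from groups that are doing poorly. The sparse-training component of REST is omitted here and all names are illustrative.
```python
# Illustrative group-reweighted loss; not the paper's exact procedure.
import torch
import torch.nn.functional as F

def reweighted_loss(logits, labels, groups, group_avg, momentum=0.9):
    """Cross-entropy with samples weighted by their group's running loss."""
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    for g in groups.unique().tolist():          # update running group losses
        group_avg[g] = (momentum * group_avg[g]
                        + (1 - momentum) * per_sample[groups == g].mean().item())
    weights = torch.tensor([group_avg[int(g)] for g in groups])
    weights = weights / weights.mean()          # keep the overall loss scale stable
    return (weights * per_sample).mean()

# usage: keep group_avg = {g: 1.0 for g in range(n_groups)} across training steps
```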
- GCondNet: A Novel Method for Improving Neural Networks on Small High-Dimensional Tabular Data [14.124731264553889]
We propose GCondNet to enhance neural networks by leveraging implicit structures present in data.
GCondNet exploits the data's high-dimensionality, and thus improves the performance of an underlying predictor network.
We demonstrate GCondNet's effectiveness on 12 real-world datasets, where it outperforms 14 standard and state-of-the-art methods.
arXiv Detail & Related papers (2022-11-11T16:13:34Z)
- Data Subsampling for Bayesian Neural Networks [0.0]
The Penalty Bayesian Neural Network (PBNN) is a new algorithm that allows the likelihood to be evaluated using subsampled batch data.
We show that PBNN achieves good predictive performance even for small mini-batch sizes.
arXiv Detail & Related papers (2022-10-17T14:43:35Z)
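The subsampling trick this summary refers to typically rescales the mini-batch log-likelihood by N/B so it is an unbiased estimate of the full-data term. Below is a generic SGLD-style step built on that estimator; PBNN's penalty correction is not reproduced, and `log_prior`/`log_lik_fn` are assumed user-supplied callables.
```python
# Generic SGLD-style step with an N/B-rescaled mini-batch likelihood.
# log_prior(params) and log_lik_fn(params, batch) are assumed callables.
import torch

def sgld_step(params, log_prior, log_lik_fn, batch, n_total, lr=1e-4):
    for p in params:
        p.grad = None
    scale = n_total / batch[0].shape[0]         # N / B: unbiased likelihood estimate
    loss = -(log_prior(params) + scale * log_lik_fn(params, batch))
    loss.backward()
    with torch.no_grad():
        for p in params:
            noise = torch.randn_like(p) * (2 * lr) ** 0.5
            p -= lr * p.grad - noise            # gradient step + Langevin noise
    return loss.item()
```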
- Integrating Random Effects in Deep Neural Networks [4.860671253873579]
We propose to use the mixed models framework to handle correlated data in deep neural networks.
By treating the effects underlying the correlation structure as random effects, mixed models are able to avoid overfitted parameter estimates.
Our approach, which we call LMMNN, is demonstrated to improve performance over natural competitors in various correlation scenarios.
arXiv Detail & Related papers (2022-06-07T14:02:24Z)
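The mixed-model idea can be sketched as y = f(X) + Zb with random effects b ~ N(0, sigma_b^2 I): integrating b out gives a Gaussian marginal likelihood whose covariance couples samples from the same cluster. The sketch below minimizes that negative log-likelihood directly; LMMNN's exact parameterization and scalable solvers are not reproduced, and the class name is illustrative.
```python
# Illustrative mixed-model NLL around a neural mean function.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NNWithRandomEffects(nn.Module):
    def __init__(self, in_dim, n_clusters, hidden=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                               nn.Linear(hidden, 1))
        self.log_var_b = nn.Parameter(torch.zeros(()))  # random-effect variance
        self.log_var_e = nn.Parameter(torch.zeros(()))  # residual variance
        self.n_clusters = n_clusters

    def nll(self, X, y, cluster):
        """cluster: (n,) long tensor of cluster ids."""
        Z = F.one_hot(cluster, self.n_clusters).float()
        V = (self.log_var_b.exp() * Z @ Z.T            # marginal covariance
             + self.log_var_e.exp() * torch.eye(len(y)))
        resid = y - self.f(X).squeeze(-1)
        return 0.5 * (resid @ torch.linalg.solve(V, resid) + torch.logdet(V))
```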
- Neural network relief: a pruning algorithm based on neural activity [47.57448823030151]
We propose a simple importance-score metric that deactivates unimportant connections.
We achieve comparable performance for LeNet architectures on MNIST.
The algorithm is not designed to minimize FLOPs when considering current hardware and software implementations.
arXiv Detail & Related papers (2021-09-22T15:33:49Z)
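One way to realize an activity-based importance score, offered as an illustration rather than the paper's exact metric: score each connection by the average magnitude of the signal it carries on a calibration batch, then deactivate the lowest-scoring fraction.
```python
# Illustrative activity-based pruning of a single linear layer.
import torch
import torch.nn as nn

@torch.no_grad()
def prune_by_activity(layer: nn.Linear, activations: torch.Tensor, frac=0.5):
    """Zero out the `frac` least important incoming weights of `layer`.

    activations: (n_samples, in_features) inputs seen by this layer.
    """
    # importance of connection (i, j) = |w_ij| * mean over the batch of |a_j|
    importance = layer.weight.abs() * activations.abs().mean(dim=0)
    k = max(1, int(importance.numel() * frac))
    threshold = importance.flatten().kthvalue(k).values
    mask = (importance > threshold).float()
    layer.weight.mul_(mask)          # deactivate the pruned connections
    return mask                      # reapply after each update to keep zeros

layer = nn.Linear(20, 10)
mask = prune_by_activity(layer, torch.randn(256, 20))
print(f"kept {int(mask.sum())} of {mask.numel()} connections")
```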
- A Biased Graph Neural Network Sampler with Near-Optimal Regret [57.70126763759996]
Graph neural networks (GNNs) have emerged as a vehicle for applying deep network architectures to graph and relational data.
In this paper, we build upon existing work and treat GNN neighbor sampling as a multi-armed bandit problem.
We introduce a newly designed reward function that incorporates some degree of bias to reduce variance and avoid unstable, possibly unbounded payouts.
arXiv Detail & Related papers (2021-03-01T15:55:58Z)
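To illustrate the bandit framing, the sketch below treats each neighbor as an arm, pulls arms with a standard UCB rule, and rewards a neighbor by how well it alone approximates the full mean aggregation. The paper's specific biased, variance-reducing reward is not reproduced; the added noise merely mimics embeddings drifting during training.
```python
# Illustrative UCB bandit over a node's neighbors; names are mine.
import numpy as np

def ucb_neighbor_ranking(feats, neighbors, rounds=200, c=1.0, seed=0):
    """Rank neighbors by UCB value; feats: (n, d), neighbors: list of ids."""
    rng = np.random.default_rng(seed)
    full_agg = feats[neighbors].mean(axis=0)     # the exact mean aggregation
    counts = np.zeros(len(neighbors))
    value = np.zeros(len(neighbors))             # running mean reward per arm
    for t in range(1, rounds + 1):
        ucb = value + c * np.sqrt(np.log(t + 1) / np.maximum(counts, 1))
        ucb[counts == 0] = np.inf                # pull every arm at least once
        a = int(np.argmax(ucb))
        # noisy reward: how well this neighbor alone approximates the mean
        # (noise stands in for embeddings changing during GNN training)
        reward = -np.linalg.norm(feats[neighbors[a]] - full_agg) + rng.normal(0, 0.1)
        counts[a] += 1
        value[a] += (reward - value[a]) / counts[a]
    return np.argsort(-value)                    # neighbors ranked by usefulness

feats = np.random.default_rng(1).normal(size=(12, 8))
print(ucb_neighbor_ranking(feats, list(range(12)))[:4])
```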
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity-inducing adversarial loss for learning latent variables and thereby obtain the diversity in output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
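A simple way to reward prediction diversity, standing in for the paper's adversarial latent-variable formulation: average the task loss across members and subtract a scaled mean pairwise KL divergence between their predictive distributions.
```python
# Illustrative diversity regularizer; the adversarial formulation is omitted.
import torch
import torch.nn.functional as F

def ensemble_loss(logits_list, labels, diversity_weight=0.1):
    """logits_list: one (batch, n_classes) tensor per ensemble member."""
    task = torch.stack([F.cross_entropy(l, labels) for l in logits_list]).mean()
    disagreement, n = 0.0, len(logits_list)
    for i in range(n):
        for j in range(n):
            if i != j:                           # mean pairwise KL divergence
                disagreement += F.kl_div(F.log_softmax(logits_list[i], dim=1),
                                         F.softmax(logits_list[j], dim=1),
                                         reduction="batchmean")
    # subtracting the (scaled) disagreement term rewards diverse predictions
    return task - diversity_weight * disagreement / (n * (n - 1))
```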
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
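The layer-wise fusion idea can be sketched for a single linear layer: compute an entropy-regularized transport plan between the two models' neurons from the distance of their incoming weights, soft-permute one model into the other's neuron order, and average. Real pipelines also propagate the alignment to biases and subsequent layers; names here are illustrative.
```python
# Illustrative OT fusion of one linear layer from two trained models.
import torch

def sinkhorn_plan(cost, n_iters=50, reg=0.1):
    """Entropy-regularized transport plan with unit row/column sums."""
    K = torch.exp(-cost / cost.max() / reg)     # normalize cost for stability
    u = torch.ones(cost.shape[0])
    for _ in range(n_iters):
        v = 1.0 / (K.T @ u)
        u = 1.0 / (K @ v)
    return u[:, None] * K * v[None, :]

def fuse_linear(w_a, w_b):
    """w_a, w_b: (out, in) weights of the same layer in two models."""
    plan = sinkhorn_plan(torch.cdist(w_a, w_b)) # neuron-to-neuron distances
    perm = plan / plan.sum(dim=1, keepdim=True) # row-stochastic soft permutation
    return 0.5 * (w_a + perm @ w_b)             # align B to A, then average

w_a, w_b = torch.randn(6, 4), torch.randn(6, 4)
print(fuse_linear(w_a, w_b).shape)              # torch.Size([6, 4])
```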
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.