Banded Square Root Matrix Factorization for Differentially Private Model Training
- URL: http://arxiv.org/abs/2405.13763v1
- Date: Wed, 22 May 2024 15:47:35 GMT
- Title: Banded Square Root Matrix Factorization for Differentially Private Model Training
- Authors: Nikita Kalinin, Christoph Lampert,
- Abstract summary: We present a new matrix factorization approach, BSR, which overcomes this computational bottleneck.
By exploiting properties of the standard matrix square root, BSR allows to efficiently handle also large-scale problems.
Our numerical experiments demonstrate that models trained using BSR perform on par with the best existing methods, while completely avoiding their computational overhead.
- Score: 3.6371628922281305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current state-of-the-art methods for differentially private model training are based on matrix factorization techniques. However, these methods suffer from high computational overhead because they require numerically solving a demanding optimization problem to determine an approximately optimal factorization prior to the actual model training. In this work, we present a new matrix factorization approach, BSR, which overcomes this computational bottleneck. By exploiting properties of the standard matrix square root, BSR allows to efficiently handle also large-scale problems. For the key scenario of stochastic gradient descent with momentum and weight decay, we even derive analytical expressions for BSR that render the computational overhead negligible. We prove bounds on the approximation quality that hold both in the centralized and in the federated learning setting. Our numerical experiments demonstrate that models trained using BSR perform on par with the best existing methods, while completely avoiding their computational overhead.
Related papers
- Sparse Variational Student-t Processes [8.46450148172407]
Student-t Processes are used to model heavy-tailed distributions and datasets with outliers.
We propose a sparse representation framework to allow Student-t Processes to be more flexible for real-world datasets.
We evaluate two proposed approaches on various synthetic and real-world datasets from UCI and Kaggle.
arXiv Detail & Related papers (2023-12-09T12:55:20Z) - Adaptive operator learning for infinite-dimensional Bayesian inverse
problems [8.672948020721945]
We develop an adaptive operator learning framework that can reduce modeling error gradually by forcing the surrogate to be accurate in local areas.
We present a rigorous convergence guarantee in the linear case using the UKI framework.
The numerical results show that our method can significantly reduce computational costs while maintaining inversion accuracy.
arXiv Detail & Related papers (2023-10-27T01:50:33Z) - Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS)
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noises.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z) - Large-scale gradient-based training of Mixtures of Factor Analyzers [67.21722742907981]
This article contributes both a theoretical analysis as well as a new method for efficient high-dimensional training by gradient descent.
We prove that MFA training and inference/sampling can be performed based on precision matrices, which does not require matrix inversions after training is completed.
Besides the theoretical analysis and matrices, we apply MFA to typical image datasets such as SVHN and MNIST, and demonstrate the ability to perform sample generation and outlier detection.
arXiv Detail & Related papers (2023-08-26T06:12:33Z) - An Optimization-based Deep Equilibrium Model for Hyperspectral Image
Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z) - Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z) - Loop Unrolled Shallow Equilibrium Regularizer (LUSER) -- A
Memory-Efficient Inverse Problem Solver [26.87738024952936]
In inverse problems we aim to reconstruct some underlying signal of interest from potentially corrupted and often ill-posed measurements.
We propose an LU algorithm with shallow equilibrium regularizers (L)
These implicit models are as expressive as deeper convolutional networks, but far more memory efficient during training.
arXiv Detail & Related papers (2022-10-10T19:50:37Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the single-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets on speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - Learning to Estimate Without Bias [57.82628598276623]
Gauss theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimation (MVUE) in linear models.
In this paper, we take a first step towards extending this result to non linear settings via deep learning with bias constraints.
A second motivation to BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Robust Regression via Model Based Methods [13.300549123177705]
We propose an algorithm inspired by so-called model-based optimization (MBO) [35, 36], which replaces a non-objective with a convex model function.
We apply this to robust regression, proposing SADM, a function of the Online Alternating Direction Method of Multipliers (OOADM) [50] to solve the inner optimization in MBO.
Finally, we demonstrate experimentally (a) the robustness of l_p norms to outliers and (b) the efficiency of our proposed model-based algorithms in comparison with methods on autoencoders and multi-target regression.
arXiv Detail & Related papers (2021-06-20T21:45:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.