Coupling public and private gradient provably helps optimization
- URL: http://arxiv.org/abs/2310.01304v1
- Date: Mon, 2 Oct 2023 16:08:18 GMT
- Title: Coupling public and private gradient provably helps optimization
- Authors: Ruixuan Liu, Zhiqi Bu, Yu-xiang Wang, Sheng Zha, George Karypis
- Abstract summary: The success of large neural networks is crucially determined by the availability of data.
It has been observed that training only on a small amount of public data can lead to degradation of accuracy.
- Score: 38.80873569002277
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of large neural networks is crucially determined by the
availability of data. It has been observed that training only on a small amount
of public data, or privately on the abundant private data, can lead to
undesirable degradation of accuracy. In this work, we leverage both private and
public data to improve the optimization, by coupling their gradients via a
weighted linear combination. We derive the optimal weight in the convex
setting, which indicates that the weighting coefficient should be
hyperparameter-dependent. We then prove accelerated convergence for non-convex
losses and characterize the effects of hyperparameters such as privacy budget,
number of iterations, batch size, and model size on the choice of the weighting
coefficient. We support our analysis with empirical experiments across language
and vision benchmarks, and provide a guideline for choosing the optimal weight
of the gradient coupling.
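The weighted linear combination described in the abstract can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names `dp_private_gradient` and `coupled_gradient` are hypothetical, the private gradient is privatized DP-SGD style (per-sample clipping plus Gaussian noise), and `alpha` is taken to weight the private gradient, with `1 - alpha` on the public one. The paper may parametrize the weight differently.

```python
import numpy as np


def dp_private_gradient(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Privatize a batch of per-sample gradients (assumed DP-SGD style).

    per_sample_grads: array of shape (batch_size, dim).
    Each row is clipped to L2 norm <= clip_norm, the rows are summed,
    Gaussian noise with std noise_multiplier * clip_norm is added,
    and the result is divided by the batch size.
    """
    batch_size, dim = per_sample_grads.shape
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    # Scale factor is 1 for rows already within the clipping norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped_sum = (per_sample_grads * scale).sum(axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=dim)
    return (clipped_sum + noise) / batch_size


def coupled_gradient(private_grad, public_grad, alpha):
    """Weighted linear combination of private and public gradients.

    Assumed convention: alpha weights the private gradient and
    (1 - alpha) weights the public gradient, with 0 <= alpha <= 1.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * private_grad + (1.0 - alpha) * public_grad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    private_per_sample = rng.normal(size=(8, 4))  # toy private batch
    public_per_sample = rng.normal(size=(2, 4))   # toy public batch

    g_priv = dp_private_gradient(
        private_per_sample, clip_norm=1.0, noise_multiplier=1.0, rng=rng
    )
    g_pub = public_per_sample.mean(axis=0)  # public data needs no noise
    print(coupled_gradient(g_priv, g_pub, alpha=0.5))
```

Under this convention, `alpha = 1` reduces to purely private training and `alpha = 0` to public-only training. The abstract's point is that the best intermediate value depends on hyperparameters such as the privacy budget, number of iterations, batch size, and model size, so a fixed `alpha = 0.5` is only a placeholder.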
Related papers
- Adaptive Preference Scaling for Reinforcement Learning with Human Feedback [103.36048042664768]
Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values.
We propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO).
Our method is versatile and can be readily adapted to various preference optimization frameworks.
arXiv Detail & Related papers (2024-06-04T20:33:22Z) - Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z) - Quantization Avoids Saddle Points in Distributed Optimization [1.579622195923387]
Distributed nonconvex optimization underpins key functionalities of numerous distributed systems.
The aim of this paper is to prove that quantization can effectively escape saddle points and ensure convergence to a second-order stationary point.
With an easily adjustable quantization, the approach allows the user to aggressively reduce communication overhead.
arXiv Detail & Related papers (2024-03-15T15:58:20Z) - Online Sensitivity Optimization in Differentially Private Learning [8.12606646175019]
We present a novel approach to dynamically optimize the clipping threshold.
We treat this threshold as an additional learnable parameter, establishing a clean relationship between the threshold and the cost function.
Our method is thoroughly assessed against alternative fixed and adaptive strategies across diverse datasets, tasks, model dimensions, and privacy levels.
arXiv Detail & Related papers (2023-10-02T00:30:49Z) - Efficient Graph Neural Network Inference at Large Scale [54.89457550773165]
Graph neural networks (GNNs) have demonstrated excellent performance in a wide range of applications.
Existing scalable GNNs leverage linear propagation to preprocess the features and accelerate the training and inference procedure.
We propose a novel adaptive propagation order approach that generates the personalized propagation order for each node based on its topological information.
arXiv Detail & Related papers (2022-11-01T14:38:18Z) - Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z) - Debiasing In-Sample Policy Performance for Small-Data, Large-Scale Optimization [4.554894288663752]
We propose a novel estimator of the out-of-sample performance of a policy in data-driven optimization.
Unlike cross-validation, our approach avoids sacrificing data for a test set.
We prove our estimator performs well in the small-data, large-scale regime.
arXiv Detail & Related papers (2021-07-26T19:00:51Z) - Enhanced data efficiency using deep neural networks and Gaussian processes for aerodynamic design optimization [0.0]
Adjoint-based optimization methods are attractive for aerodynamic shape design.
They can become prohibitively expensive when multiple optimization problems are being solved.
We propose a machine learning enabled, surrogate-based framework that replaces the expensive adjoint solver.
arXiv Detail & Related papers (2020-08-15T15:09:21Z) - Learnable Bernoulli Dropout for Bayesian Deep Learning [53.79615543862426]
Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters.
LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
arXiv Detail & Related papers (2020-02-12T18:57:14Z)