Alternating Direction Method of Multipliers for Quantization
- URL: http://arxiv.org/abs/2009.03482v2
- Date: Mon, 1 Mar 2021 06:22:25 GMT
- Title: Alternating Direction Method of Multipliers for Quantization
- Authors: Tianjian Huang, Prajwal Singhania, Maziar Sanjabi, Pabitra Mitra and
Meisam Razaviyayn
- Abstract summary: We study the performance of the Alternating Direction Method of Multipliers for Quantization ($\texttt{ADMM-Q}$) algorithm.
We develop a few variants of $\texttt{ADMM-Q}$ that can handle inexact update rules.
We empirically evaluate the efficacy of our proposed approaches.
- Score: 15.62692130672419
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quantization of the parameters of machine learning models, such as deep
neural networks, requires solving constrained optimization problems, where the
constraint set is formed by the Cartesian product of many simple discrete sets.
For such optimization problems, we study the performance of the Alternating
Direction Method of Multipliers for Quantization ($\texttt{ADMM-Q}$) algorithm,
which is a variant of the widely-used ADMM method applied to our discrete
optimization problem. We establish the convergence of the iterates of
$\texttt{ADMM-Q}$ to certain $\textit{stationary points}$. To the best of our
knowledge, this is the first analysis of an ADMM-type method for problems with
discrete variables/constraints. Based on our theoretical insights, we develop a
few variants of $\texttt{ADMM-Q}$ that can handle inexact update rules, and
have improved performance via the use of "soft projection" and "injecting
randomness to the algorithm". We empirically evaluate the efficacy of our
proposed approaches.
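To make the setup concrete, here is a minimal sketch of an ADMM-Q-style iteration for a quantized least-squares problem. The quadratic objective, the binary alphabet, the penalty parameter, the iteration count and the names `admm_q_sketch` and `project_to_alphabet` are illustrative assumptions; the paper's inexact, soft-projection and randomized variants are not shown.

```python
# Sketch of an ADMM-Q-style iteration for min_x 0.5*||A x - b||^2 subject to
# every coordinate of x lying in a small discrete alphabet (assumed setup).
import numpy as np

def project_to_alphabet(v, alphabet):
    """Coordinate-wise projection onto a finite set (nearest alphabet point)."""
    idx = np.argmin(np.abs(v[:, None] - alphabet[None, :]), axis=1)
    return alphabet[idx]

def admm_q_sketch(A, b, alphabet, rho=1.0, iters=200):
    n = A.shape[1]
    x = np.zeros(n)                       # continuous iterate
    z = project_to_alphabet(x, alphabet)  # quantized iterate
    lam = np.zeros(n)                     # dual variable
    AtA, Atb = A.T @ A, A.T @ b
    for _ in range(iters):
        # x-update: exact minimizer of the augmented Lagrangian in x
        x = np.linalg.solve(AtA + rho * np.eye(n), Atb + rho * z - lam)
        # z-update: projection of x + lam/rho onto the discrete constraint set
        z = project_to_alphabet(x + lam / rho, alphabet)
        # dual update
        lam = lam + rho * (x - z)
    return z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 10))
    x_true = rng.choice([-1.0, 1.0], size=10)
    b = A @ x_true + 0.01 * rng.standard_normal(50)
    z = admm_q_sketch(A, b, alphabet=np.array([-1.0, 1.0]))
    print("recovered signs match:", np.array_equal(z, x_true))
```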
Related papers
- RIGA: A Regret-Based Interactive Genetic Algorithm [14.388696798649658]
We propose an interactive genetic algorithm for solving multi-objective optimization problems under preference imprecision.
Our algorithm, called RIGA, can be applied to any multi-objective optimization problem provided that the aggregation function is linear in its parameters.
For several performance indicators (computation times, gap to optimality and number of queries), RIGA obtains better results than state-of-the-art algorithms.
arXiv Detail & Related papers (2023-11-10T13:56:15Z) - Accelerating Cutting-Plane Algorithms via Reinforcement Learning
Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z) - Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both
Worlds in Stochastic and Deterministic Environments [48.96971760679639]
We study variance-dependent regret bounds for Markov decision processes (MDPs).
We propose two new environment norms to characterize the fine-grained variance properties of the environment.
For model-based methods, we design a variant of the MVP algorithm.
In particular, this bound is simultaneously minimax optimal for both stochastic and deterministic MDPs.
arXiv Detail & Related papers (2023-01-31T06:54:06Z) - Regret Bounds for Expected Improvement Algorithms in Gaussian Process
Bandit Optimization [63.8557841188626]
The expected improvement (EI) algorithm is one of the most popular strategies for optimization under uncertainty.
We propose a variant of EI with a standard incumbent defined via the GP predictive mean.
We show that our algorithm converges and achieves a cumulative regret bound of $\mathcal{O}(\gamma_T\sqrt{T})$.
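As a rough illustration of the incumbent choice described above, the following sketch evaluates the EI acquisition with the incumbent taken as the minimum of the GP predictive mean at the observed inputs rather than the best noisy observation. The arrays stand in for a fitted GP posterior and all names are illustrative, not the paper's code.

```python
# Expected improvement (minimization) with a GP-mean incumbent (assumed setup).
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, incumbent):
    """EI = E[max(incumbent - f(x), 0)] under a N(mu, sigma^2) posterior."""
    sigma = np.maximum(sigma, 1e-12)      # guard against zero posterior variance
    z = (incumbent - mu) / sigma
    return (incumbent - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# Toy usage: posterior over 5 candidates; incumbent from the posterior mean
# at 3 already-observed inputs.
mu_candidates = np.array([0.2, -0.1, 0.5, 0.0, -0.3])
sigma_candidates = np.array([0.4, 0.1, 0.6, 0.2, 0.5])
mu_at_observed = np.array([0.3, 0.1, 0.4])
incumbent = mu_at_observed.min()          # GP-predictive-mean incumbent
scores = expected_improvement(mu_candidates, sigma_candidates, incumbent)
print("next query index:", int(np.argmax(scores)))
```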
arXiv Detail & Related papers (2022-03-15T13:17:53Z) - A Variational Inference Approach to Inverse Problems with Gamma
Hyperpriors [60.489902135153415]
This paper introduces a variational iterative alternating scheme for hierarchical inverse problems with gamma hyperpriors.
The proposed variational inference approach yields accurate reconstruction, provides meaningful uncertainty quantification, and is easy to implement.
arXiv Detail & Related papers (2021-11-26T06:33:29Z) - Parallel Surrogate-assisted Optimization Using Mesh Adaptive Direct
Search [0.0]
We present a method that employs surrogate models and concurrent computing at the search step of the mesh adaptive direct search (MADS) algorithm.
We conduct numerical experiments to assess the performance of the modified MADS algorithm with respect to available CPU resources.
arXiv Detail & Related papers (2021-07-26T18:28:56Z) - Converting ADMM to a Proximal Gradient for Convex Optimization Problems [4.56877715768796]
In sparse estimation, such as fused lasso and convex clustering, we apply either the proximal gradient method or the alternating direction method of multipliers (ADMM) to solve the problem.
This paper proposes a general method for converting the ADMM solution to the proximal gradient method, assuming that the constraints and objectives are strongly convex.
We show by numerical experiments that we can obtain a significant improvement in terms of efficiency.
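For context, a generic proximal-gradient (ISTA) step for the lasso is sketched below; it only illustrates the proximal gradient method mentioned in the summary and is not the paper's ADMM-to-proximal-gradient conversion. The step size and penalty are illustrative choices.

```python
# Proximal gradient (ISTA) for min_x 0.5*||A x - b||^2 + lam*||x||_1 (assumed example).
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (coordinate-wise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_lasso(A, b, lam=0.1, iters=500):
    step = 1.0 / np.linalg.norm(A, ord=2) ** 2   # 1/L with L = ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                 # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)
    return x
```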
arXiv Detail & Related papers (2021-04-22T07:41:12Z) - Grouped Variable Selection with Discrete Optimization: Computational and
Statistical Perspectives [9.593208961737572]
We present a new algorithmic framework for grouped variable selection that is based on discrete mathematical optimization.
Our methodology covers both high-dimensional linear regression and nonparametric sparse additive modeling.
Our exact algorithm is based on a standalone branch-and-bound (BnB) framework, which can solve the associated mixed integer programming (MIP) problem to certified optimality.
arXiv Detail & Related papers (2021-04-14T19:21:59Z) - Meta Learning Black-Box Population-Based Optimizers [0.0]
We propose the use of meta-learning to infer population-based black-box optimizers.
We show that the meta-loss function encourages a learned algorithm to alter its search behavior so that it can easily fit into a new context.
arXiv Detail & Related papers (2021-03-05T08:13:25Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove an $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
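For reference, and using standard notation not taken from the paper: for a weakly convex $\varphi$ and a suitable parameter $\lambda > 0$, the Moreau envelope is
$$\varphi_\lambda(x) \;=\; \min_{y}\Big\{\varphi(y) + \tfrac{1}{2\lambda}\|y - x\|^2\Big\},$$
and a small gradient norm $\|\nabla \varphi_\lambda(x)\|$ certifies that $x$ is close to a near-stationary point of $\varphi$.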
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
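To make "Gaussian projections" concrete, below is a hedged sketch-and-solve illustration with a Gaussian embedding; the paper's accelerated first-order methods and the subsampled Hadamard transform are not shown, and the sketch size is an arbitrary illustrative choice.

```python
# Sketch-and-solve for least squares with a Gaussian row sketch (assumed example).
import numpy as np

def gaussian_sketch_solve(A, b, sketch_size, rng):
    """Approximately solve min_x ||A x - b||_2 from a Gaussian row sketch."""
    m = A.shape[0]
    S = rng.standard_normal((sketch_size, m)) / np.sqrt(sketch_size)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 50))
b = rng.standard_normal(2000)
x_sketch = gaussian_sketch_solve(A, b, sketch_size=400, rng=rng)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
print("relative error:", np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact))
```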
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.