Related papers: Efficient differentiable quadratic programming layers: an ADMM approach

Efficient differentiable quadratic programming layers: an ADMM approach

URL: http://arxiv.org/abs/2112.07464v1
Date: Tue, 14 Dec 2021 15:25:07 GMT
Title: Efficient differentiable quadratic programming layers: an ADMM approach
Authors: Andrew Butler and Roy Kwon
Abstract summary: We present an alternative network layer architecture based on the alternating direction method of multipliers (ADMM) Backward differentiation is performed by implicit differentiation of the residual map of a modified fixed-point iteration. Simulated results demonstrate the computational advantage of the ADMM layer, which for medium scaled problems is approximately an order of magnitude faster than the OptNet quadratic programming layer.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in neural-network architecture allow for seamless integration of convex optimization problems as differentiable layers in an end-to-end trainable neural network. Integrating medium and large scale quadratic programs into a deep neural network architecture, however, is challenging as solving quadratic programs exactly by interior-point methods has worst-case cubic complexity in the number of variables. In this paper, we present an alternative network layer architecture based on the alternating direction method of multipliers (ADMM) that is capable of scaling to problems with a moderately large number of variables. Backward differentiation is performed by implicit differentiation of the residual map of a modified fixed-point iteration. Simulated results demonstrate the computational advantage of the ADMM layer, which for medium scaled problems is approximately an order of magnitude faster than the OptNet quadratic programming layer. Furthermore, our novel backward-pass routine is efficient, from both a memory and computation standpoint, in comparison to the standard approach based on unrolled differentiation or implicit differentiation of the KKT optimality conditions. We conclude with examples from portfolio optimization in the integrated prediction and optimization paradigm.

Related papers

Layer Separation Deep Learning Model with Auxiliary Variables for Partial Differential Equations [0.6215404942415159]
We propose a new framework, the layer separation (LySep) model, to improve the deep learning-based methods in partial equations.<n>We provide theoretical analyses demonstrating the advantages of LySep in minimizing loss and reducing solution error.
arXiv Detail & Related papers (2025-07-17T03:43:18Z)
BPQP: A Differentiable Convex Optimization Framework for Efficient End-to-End Learning [17.662882719189373]
We introduce BPQP, a differentiable convex optimization framework for efficient end-to-end learning. To enhance efficiency, we reformulate the backward pass as a simplified and decoupled quadratic programming problem. Extensive experiments on both simulated and real-world datasets demonstrate that BPQP achieves a significant improvement in efficiency.
arXiv Detail & Related papers (2024-11-28T17:31:15Z)
A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning. These problems are often formalized as Bi-Level optimizations (BLO) We introduce a novel perspective by turning a given BLO problem into a ii optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z)
Differentiation Through Black-Box Quadratic Programming Solvers [16.543673072027136]
We introduce dQP, a modular framework that enables plug-and-play differentiation for any quadratic programming (QP) solver. Our solution is based on the core theoretical insight that knowledge of the active constraint set at the QP optimum allows for explicit differentiation. Our implementation, which will be made publicly available, interfaces with an existing framework that supports over 15 state-of-the-art QP solvers.
arXiv Detail & Related papers (2024-10-08T20:01:39Z)
Probabilistic partition of unity networks for high-dimensional regression problems [1.0227479910430863]
We explore the partition of unity network (PPOU-Net) model in the context of high-dimensional regression problems. We propose a general framework focusing on adaptive dimensionality reduction. The PPOU-Nets consistently outperform the baseline fully-connected neural networks of comparable sizes in numerical experiments.
arXiv Detail & Related papers (2022-10-06T06:01:36Z)
A Convergent ADMM Framework for Efficient Neural Network Training [17.764095204676973]
Alternating Direction Method of Multipliers (ADMM) has achieved tremendous success in many classification and regression applications. We propose a novel framework to solve a general neural network training problem via ADMM (dlADMM) to address these challenges simultaneously. Experiments on seven benchmark datasets demonstrate the convergence, efficiency, and effectiveness of our proposed dlADMM algorithm.
arXiv Detail & Related papers (2021-12-22T01:55:24Z)
Neural Stochastic Dual Dynamic Programming [99.80617899593526]
We introduce a trainable neural model that learns to map problem instances to a piece-wise linear value function. $nu$-SDDP can significantly reduce problem solving cost without sacrificing solution quality.
arXiv Detail & Related papers (2021-12-01T22:55:23Z)
Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem. We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent. Our algorithm is applied to solve problems with one variable under the sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z)
Deep unfolding of the weighted MMSE beamforming algorithm [9.518010235273783]
We propose the novel application of deep unfolding to the WMMSE algorithm for a MISO downlink channel. Deep unfolding naturally incorporates expert knowledge, with the benefits of immediate and well-grounded architecture selection, fewer trainable parameters, and better explainability. By means of simulations, we show that, in most of the settings, the unfolded WMMSE outperforms or performs equally to the WMMSE for a fixed number of iterations.
arXiv Detail & Related papers (2020-06-15T14:51:20Z)
Iterative Algorithm Induced Deep-Unfolding Neural Networks: Precoding Design for Multiuser MIMO Systems [59.804810122136345]
We propose a framework for deep-unfolding, where a general form of iterative algorithm induced deep-unfolding neural network (IAIDNN) is developed. An efficient IAIDNN based on the structure of the classic weighted minimum mean-square error (WMMSE) iterative algorithm is developed. We show that the proposed IAIDNN efficiently achieves the performance of the iterative WMMSE algorithm with reduced computational complexity.
arXiv Detail & Related papers (2020-06-15T02:57:57Z)
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network. Our model requires a much less number of communication rounds and still a number of communication rounds in theory. Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
GACEM: Generalized Autoregressive Cross Entropy Method for Multi-Modal Black Box Constraint Satisfaction [69.94831587339539]
We present a modified Cross-Entropy Method (CEM) that uses a masked auto-regressive neural network for modeling uniform distributions over the solution space. Our algorithm is able to express complicated solution spaces, thus allowing it to track a variety of different solution regions.
arXiv Detail & Related papers (2020-02-17T20:21:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.