Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via
GDPA Linearization
- URL: http://arxiv.org/abs/2109.04697v1
- Date: Fri, 10 Sep 2021 07:01:15 GMT
- Title: Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via
GDPA Linearization
- Authors: Cheng Yang and Gene Cheung and Wai-tian Tan and Guangtao Zhai
- Abstract summary: Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
- Score: 59.87663954467815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Algorithm unfolding creates an interpretable and parsimonious neural network
architecture by implementing each iteration of a model-based algorithm as a
neural layer. However, unfolding a proximal splitting algorithm with a positive
semi-definite (PSD) cone projection operator per iteration is expensive, due to
the required full matrix eigen-decomposition. In this paper, leveraging a
recent linear algebraic theorem called Gershgorin disc perfect alignment
(GDPA), we unroll a projection-free algorithm for semi-definite programming
relaxation (SDR) of a binary graph classifier, where the PSD cone constraint is
replaced by a set of "tightest possible" linear constraints per iteration. As a
result, each iteration only requires computing a linear program (LP) and one
extreme eigenvector. Inside the unrolled network, we optimize parameters via
stochastic gradient descent (SGD) that determine graph edge weights in two
ways: i) a metric matrix that computes feature distances, and ii) a sparse
weight matrix computed via local linear embedding (LLE). Experimental results
show that our unrolled network outperformed pure model-based graph classifiers,
and achieved comparable performance to pure data-driven networks but using far
fewer parameters.
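To make the mechanism concrete, here is a minimal numerical sketch of GDPA's alignment property, which the linear constraints above rely on: after a similarity transform built from one extreme eigenvector, every Gershgorin disc left-end coincides with the smallest eigenvalue. The matrix construction and all parameters below are illustrative assumptions, not the authors' implementation, and the per-iteration LP is omitted.

```python
import numpy as np

# Illustrative sketch (assumed setup, not the paper's code): a generalized
# graph Laplacian M = L + diag(d) has non-positive off-diagonals, so its
# extreme eigenvector is strictly positive and GDPA alignment applies.
rng = np.random.default_rng(0)
n = 6
W = rng.uniform(0.5, 1.5, (n, n))
W = np.triu(W, 1)
W = W + W.T                                  # positive edge weights, zero diagonal
L = np.diag(W.sum(axis=1)) - W               # combinatorial graph Laplacian
M = L + np.diag(rng.uniform(0.1, 1.0, n))    # generalized Laplacian

# One extreme eigenpair -- the only eigen-computation an iteration needs.
lam, V = np.linalg.eigh(M)
lam_min, v = lam[0], V[:, 0]
v = v if v[0] > 0 else -v                    # Perron vector, fixed to be positive

# GDPA scalars s_i = 1/v_i define the similarity transform B = S M S^{-1};
# B has the same eigenvalues as M.
s = 1.0 / v
B = (s[:, None] * M) / s[None, :]

# Gershgorin disc left-ends of B: B_ii - sum_{j != i} |B_ij|.
left_ends = np.diag(B) - (np.abs(B).sum(axis=1) - np.abs(np.diag(B)))

# Perfect alignment: every left-end equals lambda_min(M), so requiring
# "left_ends >= 0" is the tightest linear surrogate for the PSD cone here.
print(np.allclose(left_ends, lam_min))       # True
```

Under the same caveat, the learnable edge weights mentioned above are often parameterized as w_ij = exp(-(f_i - f_j)^T M (f_i - f_j)) with the metric matrix factored as M = R^T R, so that SGD on R keeps M PSD by construction; the paper's exact parameterization may differ.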
Related papers
- Precise asymptotics of reweighted least-squares algorithms for linear diagonal networks [15.074950361970194]
We provide a unified analysis for a family of algorithms that encompasses IRLS, the recently proposed lin-RFM algorithm, and the alternating minimization algorithm on linear diagonal neural networks.
We show that, with an appropriately chosen reweighting policy, these algorithms can achieve favorable performance for a handful of sparse structures.
We also show that leveraging this structure in the reweighting scheme provably improves test error compared to coordinate-wise reweighting.
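As a rough illustration of the reweighting idea, here is generic IRLS for the sparse recovery problem min ||x||_1 s.t. Ax = b; this is not the specific family analyzed in the paper, and all parameters are illustrative.

```python
import numpy as np

# Generic IRLS sketch with coordinate-wise reweighting (illustrative only).
rng = np.random.default_rng(1)
m, n, k = 20, 50, 3
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
b = A @ x_true

x = np.linalg.lstsq(A, b, rcond=None)[0]     # minimum-norm initialization
for _ in range(50):
    d = np.abs(x) + 1e-8                     # coordinate-wise weights d_i = 1/w_i
    D = np.diag(d)
    # weighted least-squares step: argmin sum_i x_i^2 / d_i  s.t.  A x = b
    x = D @ A.T @ np.linalg.solve(A @ D @ A.T, b)

print(np.linalg.norm(x - x_true))            # approximately recovers x_true
```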
arXiv Detail & Related papers (2024-06-04T20:37:17Z)
- A GPU-Accelerated Bi-linear ADMM Algorithm for Distributed Sparse Machine Learning [4.258375398293221]
Bi-cADMM is aimed at solving large-scale regularized Sparse Machine Learning problems defined over a network of computational nodes.
Bi-cADMM is implemented within an open-source Python package called Parallel Sparse Fitting Toolbox.
arXiv Detail & Related papers (2024-05-25T15:11:34Z)
- An Alternative Graphical Lasso Algorithm for Precision Matrices [0.0]
We present a new and improved dual-primal graphical lasso (DP-GLasso) algorithm for estimating sparse precision matrices.
We show that the regularized normal log-likelihood naturally decouples into a sum of two easy-to-minimize convex functions, one of which is a Lasso regression problem.
Our algorithm has the precision matrix as its optimization target right at the outset, and retains all the favorable properties of the DP-GLasso algorithm.
arXiv Detail & Related papers (2024-03-19T02:01:01Z)
- Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape, and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
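A toy sketch of the idea (an assumed setup, not the paper's algorithm or analysis): on the bilinear min-max problem f(x, y) = xy, plain gradient descent-ascent spirals outward, while linearly interpolating toward the result of a few inner steps contracts toward the equilibrium.

```python
import numpy as np

def gda_step(z, eta=0.5):
    # gradient descent-ascent on f(x, y) = x * y
    x, y = z
    return np.array([x - eta * y, y + eta * x])

z_plain = np.array([1.0, 1.0])
z_interp = np.array([1.0, 1.0])
for _ in range(100):
    z_plain = gda_step(z_plain)              # diverges: norm grows every step
    w = z_interp
    for _ in range(5):                       # a few inner GDA steps
        w = gda_step(w)
    z_interp = 0.5 * z_interp + 0.5 * w      # linear interpolation toward T^k(z)

print(np.linalg.norm(z_plain))               # huge
print(np.linalg.norm(z_interp))              # near zero
```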
arXiv Detail & Related papers (2023-10-20T12:45:12Z)
- Deep Unrolling for Nonconvex Robust Principal Component Analysis [75.32013242448151]
We design algorithms for Robust Principal Component Analysis (RPCA).
It consists in decomposing a matrix into the sum of a low-rank matrix and a sparse matrix.
arXiv Detail & Related papers (2023-07-12T03:48:26Z)
- High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix.
Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm.
On several simulations, our method scales better than existing approaches in computation time and memory.
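The conjugate-gradient half of that recipe can be sketched generically: solve a covariance-like system through matrix-vector products only, without ever forming the matrix. The system below is an assumed stand-in; the paper's diagonal-estimation result is not reproduced here.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(2)
m, n = 200, 400
Phi = rng.standard_normal((m, n))
alpha = rng.uniform(0.5, 2.0, n)             # per-coefficient precisions

def matvec(x):
    # (Phi^T Phi + diag(alpha)) @ x, using only matrix-vector products
    return Phi.T @ (Phi @ x) + alpha * x

A_op = LinearOperator((n, n), matvec=matvec)
b = rng.standard_normal(n)
x, info = cg(A_op, b)                        # info == 0 means converged
print(info, np.linalg.norm(matvec(x) - b))   # small residual
```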
arXiv Detail & Related papers (2022-02-25T16:35:26Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling? [59.820507600960745]
We propose a new GCP meta-layer that uses SVD in the forward pass, and Padé approximants in the backward propagation to compute the gradients.
The proposed meta-layer has been integrated into different CNN models and achieves state-of-the-art performances on both large-scale and fine-grained datasets.
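For context, a common iterative approximation of the matrix square root in this setting is the Newton-Schulz iteration, sketched below for an SPD covariance-like matrix. This is a generic numerical illustration, not the paper's SVD-forward / Padé-backward meta-layer.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((64, 16))
A = X.T @ X + 1e-3 * np.eye(16)              # SPD covariance-like matrix

norm = np.trace(A)                           # pre-normalize so the iteration converges
Y, Z = A / norm, np.eye(16)
for _ in range(20):                          # coupled Newton-Schulz updates
    T = 0.5 * (3.0 * np.eye(16) - Z @ Y)
    Y, Z = Y @ T, T @ Z

sqrtA = Y * np.sqrt(norm)                    # undo the normalization
print(np.linalg.norm(sqrtA @ sqrtA - A))     # near zero
```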
arXiv Detail & Related papers (2021-05-06T08:03:45Z)
- Piecewise linear regression and classification [0.20305676256390928]
This paper proposes a method for solving multivariate regression and classification problems using piecewise linear predictors.
A Python implementation of the algorithm described in this paper is available at http://cse.lab.imtlucca.it/bemporad/parc.
arXiv Detail & Related papers (2021-03-10T17:07:57Z)
- Nonlinear system identification with regularized Tensor Network B-splines [2.817412580574242]
The TNBS-NARX model is validated through the identification of the cascaded water tanks benchmark nonlinear system.
It achieves state-of-the-art performance while identifying a 16-dimensional B-spline surface in 4 seconds on a standard desktop computer.
arXiv Detail & Related papers (2020-03-17T09:22:20Z)