Signed Graph Metric Learning via Gershgorin Disc Perfect Alignment
- URL: http://arxiv.org/abs/2006.08816v6
- Date: Fri, 11 Jun 2021 03:32:47 GMT
- Title: Signed Graph Metric Learning via Gershgorin Disc Perfect Alignment
- Authors: Cheng Yang, Gene Cheung, Wei Hu
- Abstract summary: We propose a fast general metric learning framework that is entirely projection-free.
We replace the PD cone constraint in the metric learning problem with the tightest possible linear constraints per iteration.
Experiments show that our graph metric optimization is significantly faster than cone-projection schemes.
- Score: 46.145969174332485
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given a convex and differentiable objective $Q(\mathbf{M})$ for a real symmetric
matrix $\mathbf{M}$ in the positive definite (PD) cone -- used to compute Mahalanobis
distances -- we propose a fast general metric learning framework that is
entirely projection-free. We first assume that $\mathbf{M}$ resides in a space $\mathcal{S}$ of
generalized graph Laplacian matrices corresponding to balanced signed graphs.
An $\mathbf{M} \in \mathcal{S}$ that is also PD is called a graph metric matrix. Unlike the low-rank
metric matrices common in the literature, $\mathcal{S}$ includes the important
diagonal-only matrices as a special case. The key theorem that circumvents full
eigen-decomposition and enables fast metric matrix optimization is Gershgorin
disc perfect alignment (GDPA): given $\mathbf{M} \in \mathcal{S}$ and a diagonal matrix $\mathbf{S}$,
where $S_{ii} = 1/v_i$ and $\mathbf{v}$ is $\mathbf{M}$'s first eigenvector, we prove that the
Gershgorin disc left-ends of the similarity transform $\mathbf{B} = \mathbf{S} \mathbf{M} \mathbf{S}^{-1}$ are
perfectly aligned at the smallest eigenvalue $\lambda_{\min}$. Using this
theorem, we replace the PD cone constraint in the metric learning problem with
the tightest possible linear constraints per iteration, so that the alternating
optimization of the diagonal / off-diagonal terms in $\mathbf{M}$ can be solved
efficiently as linear programs via the Frank-Wolfe method. We update $\mathbf{v}$ using
Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) with warm
start as entries in $\mathbf{M}$ are optimized successively. Experiments show that our
graph metric optimization is significantly faster than cone-projection schemes,
and produces competitive binary classification performance.
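Since GDPA is the load-bearing result, a minimal NumPy check of the theorem may help: it builds the simplest balanced case (an all-positive-weight graph with positive self-loops), forms $\mathbf{B} = \mathbf{S}\mathbf{M}\mathbf{S}^{-1}$ from the first eigenvector, and verifies that every Gershgorin disc left-end lands on $\lambda_{\min}$. The construction and names are ours, not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# A positive-weight graph is the simplest balanced signed graph.
W = np.triu(rng.uniform(0.5, 2.0, (n, n)), 1)
W = W + W.T

# Generalized graph Laplacian: degree matrix minus W, plus positive
# self-loops on the diagonal, so M lies in S and is PD.
M = np.diag(W.sum(axis=1)) - W + np.diag(rng.uniform(0.1, 1.0, n))

# First eigenpair of M (smallest eigenvalue and its eigenvector).
eigvals, eigvecs = np.linalg.eigh(M)
lam_min, v = eigvals[0], eigvecs[:, 0]
if v[0] < 0:                 # fix the arbitrary sign returned by eigh;
    v = -v                   # for this M, v is entrywise nonzero

# Similarity transform B = S M S^{-1}, with S = diag(1/v_1, ..., 1/v_n).
B = np.diag(1.0 / v) @ M @ np.diag(v)

# Gershgorin disc left-ends of B: B_ii - sum_{j != i} |B_ij|.
radii = np.abs(B).sum(axis=1) - np.abs(np.diag(B))
left_ends = np.diag(B) - radii

assert np.allclose(left_ends, lam_min)   # all discs aligned at lambda_min
```

Because every left-end equals $\lambda_{\min}$, the PD requirement $\lambda_{\min} > 0$ reduces to $n$ linear constraints on the rows of $\mathbf{B}$, which is what makes the per-iteration linear programs above possible.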
Related papers
- Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $\Omega(k^{2/p} n^{1-2/p})$ lower bound for the sparse recovery problem, which is tight up to a $\mathrm{poly}(\log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z) - Graph Unfolding and Sampling for Transitory Video Summarization via Gershgorin Disc Alignment [48.137527345353625]
User-generated videos (UGVs) uploaded from mobile phones to social media sites like YouTube and TikTok are short and non-repetitive.
We summarize a transitory UGV into several segments in linear time via fast graph sampling based on Gershgorin disc alignment (GDA).
We show that our algorithm achieves comparable or better video summarization performance compared to state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2024-08-03T20:08:02Z) - Optimal Matrix Sketching over Sliding Windows [38.09464189820894]
Matrix sketching aims to approximate a matrix $\boldsymbol{A} \in \mathbb{R}^{N \times d}$, given as a stream of $N$ row vectors, with a smaller sketch matrix $\boldsymbol{B} \in \mathbb{R}^{\ell \times d}$, $\ell \ll N$.
We introduce the DS-FD algorithm, which achieves the optimal $O\left(\frac{d}{\varepsilon}\right)$ space bound for matrix sketching over row-normalized, sequence-based sliding windows.
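The DS-FD details are not given in this summary; as background, here is a hedged sketch of the classic Frequent Directions primitive that such deterministic sketches build on (the doubled buffer and all names are our choices, not the paper's).

```python
import numpy as np

def frequent_directions(A, ell):
    """Frequent Directions: stream the rows of A into a sketch B with
    2*ell rows (at most ell - 1 nonzero after each shrink), so that
    ||A^T A - B^T B||_2 is on the order of ||A||_F^2 / ell."""
    _, d = A.shape                       # assumes d >= ell
    B = np.zeros((2 * ell, d))
    filled = 0
    for row in A:
        if filled == 2 * ell:
            # Buffer full: rotate to the right singular basis and shrink
            # every squared singular value by the ell-th largest one.
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            s = np.sqrt(np.maximum(s**2 - s[ell - 1]**2, 0.0))
            B = np.zeros((2 * ell, d))
            B[:len(s)] = s[:, None] * Vt
            filled = ell - 1             # rows ell-1 .. 2*ell-1 are now zero
        B[filled] = row
        filled += 1
    return B

A = np.random.default_rng(1).normal(size=(500, 40))
B = frequent_directions(A, ell=8)
print(np.linalg.norm(A.T @ A - B.T @ B, 2) / np.linalg.norm(A, "fro")**2)
```

The sliding-window setting adds the harder requirement that expired rows stop contributing to the sketch, which is what DS-FD addresses with its optimal space bound.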
arXiv Detail & Related papers (2024-05-13T14:38:35Z) - Solving Dense Linear Systems Faster Than via Preconditioning [1.8854491183340518]
We show that our algorithm runs in $\tilde{O}(n^2)$ time when $k = O(n^{0.729})$.
Our main algorithm can be viewed as a randomized block coordinate descent method.
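The paper's accelerated solver is more involved; the following is only a generic randomized block coordinate descent for a symmetric positive definite system, sketched to show the basic block update the summary refers to (the block size $k$ and all names are ours).

```python
import numpy as np

def block_cd_solve(A, b, k, iters, seed=0):
    """Randomized block coordinate descent for A x = b with A symmetric
    positive definite: pick a random block S of k coordinates and
    minimize f(x) = x^T A x / 2 - b^T x exactly over x_S."""
    rng = np.random.default_rng(seed)
    n = len(b)
    x = np.zeros(n)
    for _ in range(iters):
        S = rng.choice(n, size=k, replace=False)
        r = b - A @ x                          # current residual
        x[S] += np.linalg.solve(A[np.ix_(S, S)], r[S])
    return x

rng = np.random.default_rng(1)
G = rng.normal(size=(200, 200))
A = G @ G.T + 200.0 * np.eye(200)              # well-conditioned SPD matrix
b = rng.normal(size=200)
x = block_cd_solve(A, b, k=20, iters=400)
print(np.linalg.norm(A @ x - b))               # residual shrinks with iters
```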
arXiv Detail & Related papers (2023-12-14T12:53:34Z) - Fast Graph Sampling for Short Video Summarization using Gershgorin Disc Alignment [52.577757919003844]
We study the problem of efficiently summarizing a short video into several paragraphs, leveraging recent progress in fast graph sampling.
Experimental results show that our algorithm achieves video summarization comparable to state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2021-10-21T18:43:00Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e., the learner has access only to a zeroth-order oracle) where the cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
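The proposed algorithm itself (randomized online gradient descent combined with kernelized exponential weights) is not reconstructed here; as background, this is a minimal sketch of the standard one-point bandit gradient estimator that bandit gradient descent methods build on, with step sizes chosen purely for illustration.

```python
import numpy as np

def bandit_ogd(loss, d, T, eta=0.005, delta=0.1, radius=1.0, seed=0):
    """Online gradient descent with one-point bandit feedback: from a
    single scalar query, form the gradient estimate
        g_hat = (d / delta) * loss(x + delta * u) * u,   u ~ unit sphere,
    then take a projected step staying delta inside the feasible ball."""
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    for _ in range(T):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)                 # uniform on the unit sphere
        g_hat = (d / delta) * loss(x + delta * u) * u
        x = x - eta * g_hat
        nrm = np.linalg.norm(x)
        if nrm > radius - delta:
            x *= (radius - delta) / nrm
    return x

# The learner only sees scalar losses of a hidden quadratic.
target = np.array([0.3, -0.2, 0.5])
x = bandit_ogd(lambda z: np.sum((z - target) ** 2), d=3, T=20000)
print(x)                                       # drifts toward `target`
```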
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - Graph Metric Learning via Gershgorin Disc Alignment [46.145969174332485]
We propose a fast general projection-free metric learning framework, where the objective $\min_{\mathbf{M} \in \mathcal{S}} Q(\mathbf{M})$ involves a convex differentiable function $Q$ of the metric matrix $\mathbf{M}$.
We prove that the Gershgorin discs can be aligned perfectly using the first eigenvector $\mathbf{v}$ of $\mathbf{M}$.
Experiments show that our efficiently computed graph metric matrices outperform metrics learned using competing methods on classification tasks.
arXiv Detail & Related papers (2020-01-28T17:44:01Z)