Alternating Mahalanobis Distance Minimization for Stable and Accurate CP
Decomposition
- URL: http://arxiv.org/abs/2204.07208v1
- Date: Thu, 14 Apr 2022 19:56:36 GMT
- Title: Alternating Mahalanobis Distance Minimization for Stable and Accurate CP
Decomposition
- Authors: Navjot Singh, Edgar Solomonik
- Abstract summary: We introduce a new formulation for deriving singular values and vectors of a tensor by considering the critical points of a function different from what is used in the previous work.
We show that a subsweep of this algorithm can achieve a superlinear convergence rate for exact CPD with known rank.
We then view the algorithm as optimizing a Mahalanobis distance with respect to each factor with ground metric dependent on the other factors.
- Score: 4.847980206213335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: CP decomposition (CPD) is prevalent in chemometrics, signal processing, data
mining and many more fields. While many algorithms have been proposed to
compute the CPD, alternating least squares (ALS) remains one of the most widely
used algorithms for computing the decomposition. Recent works have introduced
the notion of eigenvalues and singular values of a tensor and explored
applications of eigenvectors and singular vectors in areas like signal
processing, data analytics and in various other fields. We introduce a new
formulation for deriving singular values and vectors of a tensor by considering
the critical points of a function different from what is used in the previous
work. Computing these critical points in an alternating manner motivates an
alternating optimization algorithm which corresponds to alternating least
squares algorithm in the matrix case. However, for tensors with order greater
than or equal to $3$, it minimizes an objective function which is different from
the commonly used least squares loss. Alternating optimization of this new
objective leads to simple updates to the factor matrices with the same
asymptotic computational cost as ALS. We show that a subsweep of this algorithm
can achieve a superlinear convergence rate for exact CPD with known rank and
verify it experimentally. We then view the algorithm as optimizing a
Mahalanobis distance with respect to each factor with ground metric dependent
on the other factors. This perspective allows us to generalize our approach to
interpolate between updates corresponding to the ALS and the new algorithm to
manage the tradeoff between stability and fitness of the decomposition. Our
experimental results show that for approximating synthetic and real-world
tensors, this algorithm and its variants converge to a better conditioned
decomposition with comparable and sometimes better fitness as compared to the
ALS algorithm.
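As a concrete illustration of the alternating structure described above, the following NumPy sketch implements a standard CP-ALS sweep for a third-order tensor together with one plausible pseudoinverse-based alternating update consistent with the abstract's description (simple factor updates with ALS-like cost that become exact for a known-rank CPD once the other factors are correct). All function names and the mixing parameter `t` are illustrative assumptions; the sketch is written from the abstract alone and does not reproduce the paper's exact update rule or its metric-based interpolation.
```python
import numpy as np


def unfold(T, mode):
    """Mode-`mode` unfolding: rows indexed by that mode, remaining modes flattened."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)


def khatri_rao(U, V):
    """Column-wise Kronecker (Khatri-Rao) product; columns index the CP rank."""
    r = U.shape[1]
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, r)


def als_update(T, factors, mode):
    """Classical ALS step: least-squares update of one factor with the others fixed."""
    i, j = [m for m in range(3) if m != mode]
    kr = khatri_rao(factors[i], factors[j])
    gram = (factors[i].T @ factors[i]) * (factors[j].T @ factors[j])  # Hadamard of Gram matrices
    return unfold(T, mode) @ kr @ np.linalg.pinv(gram)


def pinv_update(T, factors, mode):
    """Alternative step built from pseudoinverses of the other two factors.
    If the tensor is exactly rank R and the other two factors are already correct
    (and have full column rank), this returns the remaining true factor in one
    step -- consistent with the abstract's superlinear-rate claim, but only a
    plausible reading of it, not the paper's stated update."""
    i, j = [m for m in range(3) if m != mode]
    kr = khatri_rao(np.linalg.pinv(factors[i]).T, np.linalg.pinv(factors[j]).T)
    return unfold(T, mode) @ kr


def cp_sweep(T, factors, t=0.0):
    """One alternating sweep; t in [0, 1] naively mixes the two updates.
    (The paper interpolates through the Mahalanobis metric itself; this convex
    combination is only a stand-in for illustration.)"""
    for mode in range(3):
        factors[mode] = (1 - t) * als_update(T, factors, mode) + t * pinv_update(T, factors, mode)
    return factors


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    R, dims = 4, (10, 12, 14)
    A, B, C = (rng.standard_normal((n, R)) for n in dims)
    T = np.einsum("ir,jr,kr->ijk", A, B, C)                  # exact rank-R tensor
    factors = [rng.standard_normal((n, R)) for n in dims]
    for _ in range(25):
        factors = cp_sweep(T, factors, t=0.0)                # t=0 is plain ALS
    T_hat = np.einsum("ir,jr,kr->ijk", *factors)
    print("relative error:", np.linalg.norm(T - T_hat) / np.linalg.norm(T))
```
Setting t = 0 recovers plain ALS; raising t toward 1 leans on the pseudoinverse-based update, loosely mimicking the stability/fitness interpolation discussed in the abstract.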
Related papers
- A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we study a robust phase retrieval problem where the task is to recover an unknown signal.
Our proposed oracle avoids the need for computationally expensive spectral methods, using a simple gradient step while handling outliers.
arXiv Detail & Related papers (2024-09-07T06:37:23Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) methods have proven to have very appealing theoretical properties.
We show that TR and ARC methods can simultaneously allow for inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Variance reduction techniques for stochastic proximal point algorithms [5.374800961359305]
In this work, we propose the first unified study of variance reduction techniques for proximal point algorithms.
We introduce a generic proximal-based algorithm that can be specified to give the proximal version of SVRG, SAGA, and some of their variants.
Our experiments demonstrate the advantages of the proximal variance reduction methods over their gradient counterparts.
arXiv Detail & Related papers (2023-08-18T05:11:50Z) - Correcting Momentum with Second-order Information [50.992629498861724]
We develop a new algorithm for non-convex stochastic optimization that finds an $\epsilon$-critical point using an optimal number of stochastic gradient and Hessian-vector product evaluations.
We validate our results on a variety of large-scale deep learning benchmarks and architectures.
arXiv Detail & Related papers (2021-03-04T19:01:20Z) - New Riemannian preconditioned algorithms for tensor completion via
polyadic decomposition [10.620193291237262]
These algorithms exploit a non-Euclidean metric on the product space of the factor matrices of the low-rank tensor in the polyadic decomposition form.
We prove that the proposed gradient descent algorithm globally converges to a stationary point of the tensor completion problem.
Numerical results on synthetic and real-world data suggest that the proposed algorithms are more efficient in memory and time compared to state-of-the-art algorithms.
arXiv Detail & Related papers (2021-01-26T22:11:06Z) - Hybrid Trilinear and Bilinear Programming for Aligning Partially
Overlapping Point Sets [85.71360365315128]
In many applications, we need algorithms, such as the robust point matching (RPM) algorithm, which can align partially overlapping point sets and are invariant to the corresponding transformations.
We first show that the objective is a cubic polynomial function. We then utilize the convex envelopes of trilinear and bilinear monomials to derive its lower bound.
We next develop a branch-and-bound (BnB) algorithm which only branches over the transformation variables and runs efficiently.
arXiv Detail & Related papers (2021-01-19T04:24:23Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - Multi-kernel Passive Stochastic Gradient Algorithms and Transfer
Learning [21.796874356469644]
The passive stochastic gradient algorithm does not have control over the locations where noisy gradients of the cost function are evaluated.
The algorithm performs substantially better in high dimensional problems and incorporates variance reduction.
arXiv Detail & Related papers (2020-08-23T11:55:19Z) - Provably Convergent Working Set Algorithm for Non-Convex Regularized
Regression [0.0]
This paper proposes a working set algorithm for non-convex regularizers with convergence guarantees.
Our results demonstrate large gains compared to solving the full problem, for both block-coordinate and gradient solvers.
arXiv Detail & Related papers (2020-06-24T07:40:31Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.