NeCPD: An Online Tensor Decomposition with Optimal Stochastic Gradient
Descent
- URL: http://arxiv.org/abs/2003.08844v1
- Date: Wed, 18 Mar 2020 04:44:05 GMT
- Title: NeCPD: An Online Tensor Decomposition with Optimal Stochastic Gradient
Descent
- Authors: Ali Anaissi, Basem Suleiman, Seid Miad Zandavi
- Abstract summary: We propose a new efficient CP decomposition solver named NeCPD for non-convex problems in multi-way online data, based on stochastic gradient descent (SGD).
We further apply this method in the field of structural health monitoring using real-life structural datasets.
- Score: 1.0953917735844645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-way data analysis has become an essential tool for capturing underlying
structures in higher-order datasets stored in tensor $\mathcal{X} \in
\mathbb{R}^{I_1 \times \dots \times I_N}$. CANDECOMP/PARAFAC (CP)
decomposition has been extensively studied and applied to approximate
$\mathcal{X}$ by $N$ loading matrices $A^{(1)}, \dots, A^{(N)}$ where $N$
represents the order of the tensor. We propose a new efficient CP decomposition
solver named NeCPD for non-convex problems in multi-way online data based on
the stochastic gradient descent (SGD) algorithm. SGD is very useful in the
online setting since it allows us to update $\mathcal{X}^{(t+1)}$ in a single
step. In terms of global convergence, it is well known that SGD gets stuck in
many saddle points when it deals with non-convex problems. We study the Hessian
matrix to identify these saddle points, and then escape them using a
perturbation approach which adds a small amount of noise to the gradient update
step. We further apply Nesterov's Accelerated Gradient (NAG) method within the
SGD algorithm to optimally accelerate the convergence rate and compensate for
the Hessian computation delay time per epoch. Experimental evaluation in the
field of structural health monitoring using laboratory-based and real-life
structural datasets shows that our method provides more accurate results than
existing online tensor analysis methods.
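To make the setup concrete, here is a minimal NumPy sketch of the objective the abstract describes: approximating $\mathcal{X}$ by $N$ loading matrices and updating them with plain SGD steps. All function names and hyperparameters are illustrative assumptions, not taken from the authors' code.

```python
import numpy as np

def unfold(X, mode):
    # Mode-n unfolding: rows indexed by I_n, columns by the remaining
    # modes (C order, first remaining mode varies slowest).
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(mats):
    # Column-wise Khatri-Rao product, first matrix varying slowest,
    # matching the unfolding convention above.
    out = mats[0]
    for M in mats[1:]:
        out = (out[:, None, :] * M[None, :, :]).reshape(-1, out.shape[-1])
    return out

def sgd_cp_pass(X, A, lr=1e-3):
    # One gradient step per loading matrix for the least-squares CP
    # objective 0.5 * ||X_(n) - A[n] @ B.T||_F^2.
    N = X.ndim
    for n in range(N):
        B = khatri_rao([A[m] for m in range(N) if m != n])
        G = (A[n] @ B.T - unfold(X, n)) @ B   # gradient w.r.t. A[n]
        A[n] = A[n] - lr * G
    return A

# Example: rank-3 factors for a 10 x 8 x 6 tensor.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 8, 6))
A = [rng.standard_normal((I, 3)) for I in X.shape]
for _ in range(200):
    A = sgd_cp_pass(X, A)
```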
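And a second sketch for the two ingredients the abstract layers on top of SGD: Nesterov look-ahead momentum, and a small noise perturbation near stationary points where the Hessian shows negative curvature. The dense eigendecomposition check and all thresholds below are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def is_strict_saddle(H, tol=1e-6):
    # A stationary point is a strict saddle if the Hessian has a
    # sufficiently negative eigenvalue (dense check, illustration only).
    return np.linalg.eigvalsh(H).min() < -tol

def nag_sgd_step(w, v, grad_fn, hess_fn=None, lr=1e-3, momentum=0.9,
                 grad_tol=1e-5, noise_scale=1e-3,
                 rng=np.random.default_rng()):
    # Nesterov look-ahead: take the gradient at w + momentum * v.
    g = grad_fn(w + momentum * v)
    # Near a stationary point, add a little noise to the gradient update
    # step so the iterate can escape a strict saddle.
    if np.linalg.norm(g) < grad_tol:
        if hess_fn is None or is_strict_saddle(hess_fn(w)):
            g = g + noise_scale * rng.standard_normal(w.shape)
    v = momentum * v - lr * g
    return w + v, v

# Example on f(w) = 0.5 * ||w||^2, whose gradient is w.
w, v = np.ones(4), np.zeros(4)
for _ in range(100):
    w, v = nag_sgd_step(w, v, grad_fn=lambda x: x)
```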
Related papers
- Modified Step Size for Enhanced Stochastic Gradient Descent: Convergence
and Experiments [0.0]
This paper introduces a novel approach to improving the performance of the stochastic gradient descent (SGD) algorithm by incorporating a modified decay step size based on $\frac{1}{\sqrt{t}}$.
The proposed step size integrates a logarithmic step term, leading to the selection of smaller values in the final iterations. A generic sketch of such a schedule follows below.
To demonstrate the effectiveness of our approach, we conducted numerical experiments on image classification tasks using the FashionMNIST and CIFAR datasets.
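As a generic illustration only, a $1/\sqrt{t}$ decay damped by a logarithmic term might look like the snippet below; this particular functional form is an assumption, not the paper's exact schedule.

```python
import math

def step_size(t, base_lr=0.1):
    # Hypothetical decay: 1/sqrt(t) scaled down by a logarithmic term,
    # so late iterations take noticeably smaller steps.
    return base_lr / (math.sqrt(t) * math.log(t + math.e))

print([round(step_size(t), 4) for t in (1, 10, 100, 1000)])
```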
arXiv Detail & Related papers (2023-09-03T19:21:59Z)
- A Fast Parallel Tensor Decomposition with Optimal Stochastic Gradient
Descent: an Application in Structural Damage Identification [1.536989504296526]
We propose a novel algorithm, FP-CPD, to parallelize the CANDECOMP/PARAFAC (CP) decomposition of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \dots \times I_N}$.
arXiv Detail & Related papers (2021-11-04T05:17:07Z) - Correcting Momentum with Second-order Information [50.992629498861724]
We develop a new algorithm for non-convex stochastic optimization that finds an $\epsilon$-critical point in the optimal number of stochastic gradient and Hessian-vector product computations.
We validate our results on a variety of large-scale deep learning benchmarks and architectures.
arXiv Detail & Related papers (2021-03-04T19:01:20Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-Assisted Single-Timescale Stochastic Approximation (MSTSA) -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in its iterations caused by inexact solutions to the lower-level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z) - Sample Complexity Bounds for Two Timescale Value-based Reinforcement
Learning Algorithms [65.09383385484007]
Two-timescale stochastic approximation (SA) has been widely used in value-based reinforcement learning algorithms.
We study the non-asymptotic convergence rate of two-timescale linear and nonlinear TDC and Greedy-GQ algorithms; a sketch of the two-timescale pattern follows below.
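For orientation, a two-timescale method keeps a slow step size for the main iterate and a faster one for an auxiliary iterate. The sketch below shows linear TDC in its standard textbook form (Sutton et al.); it is illustrative, not code from this paper.

```python
import numpy as np

def tdc_update(theta, w, phi, phi_next, reward, gamma=0.99,
               alpha=0.01, beta=0.1):
    # Linear TDC with two step sizes: alpha on the slow iterate theta,
    # beta on the fast auxiliary iterate w (beta >> alpha).
    delta = reward + gamma * phi_next @ theta - phi @ theta   # TD error
    theta = theta + alpha * (delta * phi - gamma * (phi @ w) * phi_next)
    w = w + beta * (delta - phi @ w) * phi
    return theta, w

# One update on random features.
rng = np.random.default_rng(1)
theta, w = np.zeros(4), np.zeros(4)
theta, w = tdc_update(theta, w, rng.standard_normal(4),
                      rng.standard_normal(4), reward=1.0)
```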
arXiv Detail & Related papers (2020-11-10T11:36:30Z)
- Gradient-Based Empirical Risk Minimization using Local Polynomial
Regression [39.29885444997579]
A major goal of this literature has been to compare different algorithms, such as gradient descent (GD) or stochastic gradient descent (SGD).
We demonstrate that when the loss function is smooth in the data, we can learn the oracle at every iteration and beat the oracle complexities of both GD and SGD.
arXiv Detail & Related papers (2020-11-04T20:10:31Z)
- Faster Convergence of Stochastic Gradient Langevin Dynamics for
Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
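For context, the SGLD iteration this analysis concerns is standard: a stochastic gradient step on the negative log density plus Gaussian noise scaled to the step size. The sketch below is a generic illustration with an assumed target, not this paper's analysis.

```python
import numpy as np

def sgld_step(theta, grad_fn, lr=1e-3, rng=np.random.default_rng()):
    # SGLD: gradient step plus sqrt(2 * lr) Gaussian noise, so the
    # iterates approximately sample from the density exp(-f).
    noise = rng.standard_normal(theta.shape)
    return theta - lr * grad_fn(theta) + np.sqrt(2.0 * lr) * noise

# Example: sample from a standard Gaussian, f(theta) = 0.5 * ||theta||^2.
theta = np.zeros(2)
samples = []
for _ in range(10_000):
    theta = sgld_step(theta, grad_fn=lambda x: x, lr=1e-2)
    samples.append(theta.copy())
```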
arXiv Detail & Related papers (2020-10-19T15:23:18Z)
- On Riemannian Gradient-Based Methods for Minimax Problems [24.199289678553896]
We propose a class of Riemannian gradient-based methods to solve minimax problems.
We prove that our RGDA has a sample complexity of $\tilde{O}(\kappa^4\epsilon^{-3})$.
We also show that our Acc-RSGDA achieves a lower sample complexity of $\tilde{O}(\kappa^4\epsilon^{-3})$.
arXiv Detail & Related papers (2020-10-13T00:54:00Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
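As a quick illustration of the generative model in setting (ii), the snippet below draws clean samples and fits ordinary least squares; the dimensions and noise scale are assumptions for the demo, and the paper's estimators target the harder regime with corrupted samples.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
w_star = rng.standard_normal(d)

# Setting (ii): sub-Gaussian X with identity second moment,
# sub-Gaussian noise epsilon.
X = rng.standard_normal((n, d))
eps = 0.1 * rng.standard_normal(n)
Y = X @ w_star + eps

# On uncorrupted data, least squares already recovers w_star well.
w_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
print(np.linalg.norm(w_hat - w_star))
```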
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- On the Almost Sure Convergence of Stochastic Gradient Descent in
Non-Convex Problems [75.58134963501094]
This paper analyzes the trajectories of stochastic gradient descent (SGD).
We show that SGD avoids strict saddle points/manifolds with probability $1$ for a wide range of step-size policies.
arXiv Detail & Related papers (2020-06-19T14:11:26Z)