NeCPD: An Online Tensor Decomposition with Optimal Stochastic Gradient
Descent
- URL: http://arxiv.org/abs/2003.08844v1
- Date: Wed, 18 Mar 2020 04:44:05 GMT
- Title: NeCPD: An Online Tensor Decomposition with Optimal Stochastic Gradient
Descent
- Authors: Ali Anaissi, Basem Suleiman, Seid Miad Zandavi
- Abstract summary: We propose a new efficient CP decomposition solver named NeCPD for non-convex problems in multi-way online data, based on stochastic gradient descent (SGD).
We further apply this method in the field of structural health monitoring using real-life structural datasets.
- Score: 1.0953917735844645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-way data analysis has become an essential tool for capturing underlying
structures in higher-order datasets stored in tensor $\mathcal{X} \in
\mathbb{R}^{I_1 \times \dots \times I_N}$. CANDECOMP/PARAFAC (CP)
decomposition has been extensively studied and applied to approximate
$\mathcal{X}$ by $N$ loading matrices $A^{(1)}, \dots, A^{(N)}$ where $N$
represents the order of the tensor. We propose a new efficient CP decomposition
solver named NeCPD for non-convex problems in multi-way online data based on
the stochastic gradient descent (SGD) algorithm. SGD is very useful in the
online setting since it allows us to update $\mathcal{X}^{(t+1)}$ in a single
step. In terms of global convergence, it is well known that SGD gets stuck in
many saddle points when it deals with non-convex problems. We study the Hessian
matrix to identify these saddle points, and then escape them using a
perturbation approach which adds a small amount of noise to the gradient update
step. We further apply Nesterov's Accelerated Gradient (NAG) method within the
SGD algorithm to optimally accelerate the convergence rate and compensate for
the Hessian computation delay time per epoch. Experimental evaluation in the
field of structural health monitoring using laboratory-based and real-life
structural datasets shows that our method provides more accurate results than
existing online tensor analysis methods.
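To make the setup concrete, here is a minimal NumPy sketch of the objective the abstract describes: approximating $\mathcal{X}$ by $N$ loading matrices and updating them with plain SGD steps. All function names and hyperparameters are illustrative assumptions, not taken from the authors' code.

```python
import numpy as np

def unfold(X, mode):
    # Mode-n unfolding: rows indexed by I_n, columns by the remaining
    # modes (C order, first remaining mode varies slowest).
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(mats):
    # Column-wise Khatri-Rao product, first matrix varying slowest,
    # matching the unfolding convention above.
    out = mats[0]
    for M in mats[1:]:
        out = (out[:, None, :] * M[None, :, :]).reshape(-1, out.shape[-1])
    return out

def sgd_cp_pass(X, A, lr=1e-3):
    # One gradient step per loading matrix for the least-squares CP
    # objective 0.5 * ||X_(n) - A[n] @ B.T||_F^2.
    N = X.ndim
    for n in range(N):
        B = khatri_rao([A[m] for m in range(N) if m != n])
        G = (A[n] @ B.T - unfold(X, n)) @ B   # gradient w.r.t. A[n]
        A[n] = A[n] - lr * G
    return A

# Example: rank-3 factors for a 10 x 8 x 6 tensor.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 8, 6))
A = [rng.standard_normal((I, 3)) for I in X.shape]
for _ in range(200):
    A = sgd_cp_pass(X, A)
```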
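And a second sketch for the two ingredients the abstract layers on top of SGD: Nesterov look-ahead momentum, and a small noise perturbation near stationary points where the Hessian shows negative curvature. The dense eigendecomposition check and all thresholds below are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def is_strict_saddle(H, tol=1e-6):
    # A stationary point is a strict saddle if the Hessian has a
    # sufficiently negative eigenvalue (dense check, illustration only).
    return np.linalg.eigvalsh(H).min() < -tol

def nag_sgd_step(w, v, grad_fn, hess_fn=None, lr=1e-3, momentum=0.9,
                 grad_tol=1e-5, noise_scale=1e-3,
                 rng=np.random.default_rng()):
    # Nesterov look-ahead: take the gradient at w + momentum * v.
    g = grad_fn(w + momentum * v)
    # Near a stationary point, add a little noise to the gradient update
    # step so the iterate can escape a strict saddle.
    if np.linalg.norm(g) < grad_tol:
        if hess_fn is None or is_strict_saddle(hess_fn(w)):
            g = g + noise_scale * rng.standard_normal(w.shape)
    v = momentum * v - lr * g
    return w + v, v

# Example on f(w) = 0.5 * ||w||^2, whose gradient is w.
w, v = np.ones(4), np.zeros(4)
for _ in range(100):
    w, v = nag_sgd_step(w, v, grad_fn=lambda x: x)
```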
Related papers
- Modified Step Size for Enhanced Stochastic Gradient Descent: Convergence
and Experiments [0.0]
This paper introduces a novel approach to improving the performance of the stochastic gradient descent (SGD) algorithm by incorporating a modified decay step size based on $\frac{1}{\sqrt{t}}$.
The proposed step size integrates a logarithmic step term, leading to the selection of smaller values in the final iterations. A generic sketch of such a schedule follows below.
To demonstrate the effectiveness of our approach, we conducted numerical experiments on image classification tasks using the FashionMNIST and CIFAR datasets.
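As a generic illustration only, a $1/\sqrt{t}$ decay damped by a logarithmic term might look like the snippet below; this particular functional form is an assumption, not the paper's exact schedule.

```python
import math

def step_size(t, base_lr=0.1):
    # Hypothetical decay: 1/sqrt(t) scaled down by a logarithmic term,
    # so late iterations take noticeably smaller steps.
    return base_lr / (math.sqrt(t) * math.log(t + math.e))

print([round(step_size(t), 4) for t in (1, 10, 100, 1000)])
```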
arXiv Detail & Related papers (2023-09-03T19:21:59Z)
- A Fast Parallel Tensor Decomposition with Optimal Stochastic Gradient
Descent: an Application in Structural Damage Identification [1.536989504296526]
We propose a novel algorithm, FP-CPD, to parallelize the CANDECOMP/PARAFAC (CP) decomposition of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \dots \times I_N}$.
arXiv Detail & Related papers (2021-11-04T05:17:07Z) - Correcting Momentum with Second-order Information [50.992629498861724]
We develop a new algorithm for non-convex stochastic optimization that finds an $\epsilon$-critical point in the optimal number of stochastic gradient and Hessian-vector product computations.
We validate our results on a variety of large-scale deep learning benchmarks and architectures.
arXiv Detail & Related papers (2021-03-04T19:01:20Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-Assisted Single-Timescale Stochastic Approximation (MSTSA) -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in its iterations caused by inexact solutions to the lower-level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z) - Sample Complexity Bounds for Two Timescale Value-based Reinforcement
Learning Algorithms [65.09383385484007]
Two-timescale stochastic approximation (SA) has been widely used in value-based reinforcement learning algorithms.
We study the non-asymptotic convergence rate of two-timescale linear and nonlinear TDC and Greedy-GQ algorithms; a sketch of the two-timescale pattern follows below.
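For orientation, a two-timescale method keeps a slow step size for the main iterate and a faster one for an auxiliary iterate. The sketch below shows linear TDC in its standard textbook form (Sutton et al.); it is illustrative, not code from this paper.

```python
import numpy as np

def tdc_update(theta, w, phi, phi_next, reward, gamma=0.99,
               alpha=0.01, beta=0.1):
    # Linear TDC with two step sizes: alpha on the slow iterate theta,
    # beta on the fast auxiliary iterate w (beta >> alpha).
    delta = reward + gamma * phi_next @ theta - phi @ theta   # TD error
    theta = theta + alpha * (delta * phi - gamma * (phi @ w) * phi_next)
    w = w + beta * (delta - phi @ w) * phi
    return theta, w

# One update on random features.
rng = np.random.default_rng(1)
theta, w = np.zeros(4), np.zeros(4)
theta, w = tdc_update(theta, w, rng.standard_normal(4),
                      rng.standard_normal(4), reward=1.0)
```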
arXiv Detail & Related papers (2020-11-10T11:36:30Z)
- Gradient-Based Empirical Risk Minimization using Local Polynomial
Regression [39.29885444997579]
A major goal of this literature has been to compare different algorithms, such as gradient descent (GD) or stochastic gradient descent (SGD).
We demonstrate that when the loss function is smooth in the data, we can learn the oracle at every iteration and beat the oracle complexities of both GD and SGD.
arXiv Detail & Related papers (2020-11-04T20:10:31Z)
- Faster Convergence of Stochastic Gradient Langevin Dynamics for
Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
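For context, the SGLD iteration this analysis concerns is standard: a stochastic gradient step on the negative log density plus Gaussian noise scaled to the step size. The sketch below is a generic illustration with an assumed target, not this paper's analysis.

```python
import numpy as np

def sgld_step(theta, grad_fn, lr=1e-3, rng=np.random.default_rng()):
    # SGLD: gradient step plus sqrt(2 * lr) Gaussian noise, so the
    # iterates approximately sample from the density exp(-f).
    noise = rng.standard_normal(theta.shape)
    return theta - lr * grad_fn(theta) + np.sqrt(2.0 * lr) * noise

# Example: sample from a standard Gaussian, f(theta) = 0.5 * ||theta||^2.
theta = np.zeros(2)
samples = []
for _ in range(10_000):
    theta = sgld_step(theta, grad_fn=lambda x: x, lr=1e-2)
    samples.append(theta.copy())
```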
arXiv Detail & Related papers (2020-10-19T15:23:18Z)
- On Riemannian Gradient-Based Methods for Minimax Problems [24.199289678553896]
We propose a class of Riemannian gradient-based methods to solve minimax problems.
We prove that our RGDA has a sample complexity of $\tilde{O}(\kappa^4\epsilon^{-3})$.
We also show that our Acc-RSGDA achieves a lower sample complexity of $\tilde{O}(\kappa^4\epsilon^{-3})$.
arXiv Detail & Related papers (2020-10-13T00:54:00Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
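As a quick illustration of the generative model in setting (ii), the snippet below draws clean samples and fits ordinary least squares; the dimensions and noise scale are assumptions for the demo, and the paper's estimators target the harder regime with corrupted samples.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
w_star = rng.standard_normal(d)

# Setting (ii): sub-Gaussian X with identity second moment,
# sub-Gaussian noise epsilon.
X = rng.standard_normal((n, d))
eps = 0.1 * rng.standard_normal(n)
Y = X @ w_star + eps

# On uncorrupted data, least squares already recovers w_star well.
w_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
print(np.linalg.norm(w_hat - w_star))
```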
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- On the Almost Sure Convergence of Stochastic Gradient Descent in
Non-Convex Problems [75.58134963501094]
This paper analyzes the trajectories of stochastic gradient descent (SGD).
We show that SGD avoids strict saddle points/manifolds with probability $1$ for a wide range of step-size policies.
arXiv Detail & Related papers (2020-06-19T14:11:26Z)