Related papers: Accelerated Algorithms for Nonlinear Matrix Decomposition with the ReLU function

Accelerated Algorithms for Nonlinear Matrix Decomposition with the ReLU function

URL: http://arxiv.org/abs/2305.08687v1
Date: Mon, 15 May 2023 14:43:27 GMT
Title: Accelerated Algorithms for Nonlinear Matrix Decomposition with the ReLU function
Authors: Giovanni Seraghiti, Atharva Awari, Arnaud Vandaele, Margherita Porcelli, Nicolas Gillis
Abstract summary: We focus on the case where $f(cdot) = max(0, cdot)$, the rectified unit (ReLU) non-linear activation. We introduce two new algorithms: (1) aggressive accelerated NMD (A-NMD) which uses an adaptive Nesterov extrapolation to accelerate an existing algorithm, and (2) three-block NMD (3B-NMD) which parametrizes $Theta = WH$ and leads to a significant reduction in the computational cost.
Score: 17.17890385027466
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we study the following nonlinear matrix decomposition (NMD) problem: given a sparse nonnegative matrix $X$, find a low-rank matrix $\Theta$ such that $X \approx f(\Theta)$, where $f$ is an element-wise nonlinear function. We focus on the case where $f(\cdot) = \max(0, \cdot)$, the rectified unit (ReLU) non-linear activation. We refer to the corresponding problem as ReLU-NMD. We first provide a brief overview of the existing approaches that were developed to tackle ReLU-NMD. Then we introduce two new algorithms: (1) aggressive accelerated NMD (A-NMD) which uses an adaptive Nesterov extrapolation to accelerate an existing algorithm, and (2) three-block NMD (3B-NMD) which parametrizes $\Theta = WH$ and leads to a significant reduction in the computational cost. We also propose an effective initialization strategy based on the nuclear norm as a proxy for the rank function. We illustrate the effectiveness of the proposed algorithms (available on gitlab) on synthetic and real-world data sets.

Related papers

Alternating Direction Method of Multipliers for Nonlinear Matrix Decompositions [10.559775618998044]
We present an algorithm based on the alternating direction method of multipliers (ADMM) for solving nonlinear matrix decompositions (NMD)<n>We illustrate the applicability, efficiency, and adaptability of the approach on real-world datasets, highlighting its potential for a broad range of applications.
arXiv Detail & Related papers (2025-12-19T11:40:06Z)
Approaching Optimality for Solving Dense Linear Systems with Low-Rank Structure [16.324043075920564]
We provide new high-accuracy randomized algorithms for solving linear systems and regression problems.<n>Our algorithms nearly-match a natural complexity limit under dense inputs for these problems.<n>We show how to obtain these running times even under the weaker assumption that all but $k$ of the singular values have a bounded generalized mean.
arXiv Detail & Related papers (2025-07-15T20:48:30Z)
An extrapolated and provably convergent algorithm for nonlinear matrix decomposition with the ReLU function [12.059960289576154]
We show that the two formulations may yield different low-rank $Theta in particular. We also consider another alternative model, called 3B-ReLUNMD, which parameterizes $Theta$.
arXiv Detail & Related papers (2025-03-31T08:27:41Z)
Efficient Over-parameterized Matrix Sensing from Noisy Measurements via Alternating Preconditioned Gradient Descent [17.73720530889677]
Preconditioning methods have been proposed to accelerate the convergence of matrix sensing problem. We propose an alternating preconditioned descent (APGD) algorithm, which alternately updates the two factor parameter. We theoretically prove that APGD achieves near-optimal convergence at a linear rate, starting from arbitrary randoms.
arXiv Detail & Related papers (2025-02-01T15:44:39Z)
A Momentum Accelerated Algorithm for ReLU-based Nonlinear Matrix Decomposition [0.0]
We propose a Tikhonov regularized ReLU-NMD model, referred to as ReLU-NMD-T. We introduce a momentum accelerated algorithm for handling the ReLU-NMD-T model. Our numerical experiments on real-world datasets show the effectiveness of the proposed model and algorithm.
arXiv Detail & Related papers (2024-02-04T10:43:35Z)
Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems [61.85150061213987]
We study the generalized low-rank matrix bandit problem, proposed in citelu2021low under the Generalized Linear Model (GLM) framework. To overcome the computational infeasibility and theoretical restrain of existing algorithms, we first propose the G-ESTT framework. We show that G-ESTT can achieve the $tildeO(sqrt(d_1+d_2)3/2Mr3/2T)$ bound of regret while G-ESTS can achineve the $tildeO
arXiv Detail & Related papers (2024-01-14T14:14:19Z)
Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback [30.23951525723659]
We study the low-rank MDPs with adversarially changed losses in the full-information feedback setting. We propose a policy optimization-based algorithm POLO, and we prove that it attains the $widetildeO(Kfrac56Afrac12dln (1+M)/ (1-gamma)2)$ regret guarantee.
arXiv Detail & Related papers (2023-11-14T03:12:43Z)
A Doubly Robust Approach to Sparse Reinforcement Learning [19.68978899041642]
We propose a new regret algorithm for episodic sparse linear Markov decision process (SMDP) The proposed algorithm is $tildeO(sigma-1_min s_star H sqrtN)$, where $sigma_min$ denotes the restrictive minimum eigenvalue of the average Gram matrix of feature vectors.
arXiv Detail & Related papers (2023-10-23T18:52:17Z)
Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $mathbbRd$ with respect to the square loss. Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/epsilon)O(k)$, whereepsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
Refined Regret for Adversarial MDPs with Linear Function Approximation [50.00022394876222]
We consider learning in an adversarial Decision Process (MDP) where the loss functions can change arbitrarily over $K$ episodes. This paper provides two algorithms that improve the regret to $tildemathcal O(K2/3)$ in the same setting.
arXiv Detail & Related papers (2023-01-30T14:37:21Z)
Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements. We derive an information-theoretic lower bound for the minimax risk under this setting. We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z)
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping [99.59319332864129]
In this paper, we study reinforcement learning for discounted Decision (MDP) We propose a novel algorithm that makes use of the feature mapping and obtains a $tilde O(dsqrtT/ (1-gamma)2)$ regret. Our upper and lower bound results together suggest that the proposed reinforcement learning algorithm is near-optimal up to a $ (1-gamma)-0.5$ factor.
arXiv Detail & Related papers (2020-06-23T17:08:54Z)
Learning nonlinear dynamical systems from a single trajectory [102.60042167341956]
We introduce algorithms for learning nonlinear dynamical systems of the form $x_t+1=sigma(Thetastarx_t)+varepsilon_t$. We give an algorithm that recovers the weight matrix $Thetastar$ from a single trajectory with optimal sample complexity and linear running time.
arXiv Detail & Related papers (2020-04-30T10:42:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.