Related papers: Operator SVD with Neural Networks via Nested Low-Rank Approximation

URL: http://arxiv.org/abs/2402.03655v2
Date: Wed, 21 Aug 2024 05:09:53 GMT
Title: Operator SVD with Neural Networks via Nested Low-Rank Approximation
Authors: J. Jon Ryu, Xiangxiang Xu, H. S. Melihcan Erol, Yuheng Bu, Lizhong Zheng, Gregory W. Wornell
Abstract summary: This paper proposes a new optimization framework based on the low-rank approximation characterization of a truncated singular value decomposition. It is accompanied by new techniques called nesting for learning the top-$L$ singular values and singular functions in the correct order. We demonstrate the effectiveness of the proposed framework for use cases in computational physics and machine learning.
Score: 19.562492156734653
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Computing the eigenvalue decomposition (EVD) of a given linear operator, or finding its leading eigenvalues and eigenfunctions, is a fundamental task in many machine learning and scientific computing problems. For high-dimensional eigenvalue problems, training neural networks to parameterize the eigenfunctions is considered a promising alternative to classical numerical linear algebra techniques. This paper proposes a new optimization framework based on the low-rank approximation characterization of a truncated singular value decomposition, accompanied by new techniques called nesting for learning the top-$L$ singular values and singular functions in the correct order. The proposed method promotes the desired orthogonality in the learned functions implicitly and efficiently via an unconstrained optimization formulation, which is easy to solve with off-the-shelf gradient-based optimization algorithms. We demonstrate the effectiveness of the proposed optimization framework for use cases in computational physics and machine learning.
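
Illustrative sketch (not from the paper): the "low-rank approximation characterization" mentioned in the abstract can be seen in the finite-dimensional matrix case, where the Eckart-Young theorem says that the rank-L truncated SVD minimizes the squared Frobenius error over all rank-L factorizations. The NumPy snippet below only checks that fact numerically on a random matrix. It does not implement the paper's neural parameterization or its nesting techniques, and the function names truncated_svd_error and lowrank_objective are hypothetical helpers introduced for this example.

```python
import numpy as np


def truncated_svd_error(A, L):
    """Return (squared Frobenius error of the rank-L truncated SVD,
    sum of the squared trailing singular values)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Rank-L reconstruction from the top-L singular triplets.
    A_L = (U[:, :L] * s[:L]) @ Vt[:L, :]
    err = np.linalg.norm(A - A_L, "fro") ** 2
    tail = float(np.sum(s[L:] ** 2))
    return err, tail


def lowrank_objective(A, F, G):
    """Squared Frobenius error of the factorization A ~ F @ G.T.
    (Hypothetical helper: an unconstrained low-rank objective.)"""
    return np.linalg.norm(A - F @ G.T, "fro") ** 2


rng = np.random.default_rng(0)
A = rng.standard_normal((8, 6))
L = 3

err, tail = truncated_svd_error(A, L)

# Any other rank-L factorization has error at least as large.
F = rng.standard_normal((8, L))
G = rng.standard_normal((6, L))
rand_err = lowrank_objective(A, F, G)

print(f"truncated SVD error: {err:.6f}")
print(f"sum of squared trailing singular values: {tail:.6f}")
print(f"random rank-{L} factorization error: {rand_err:.6f}")
```

In this matrix setting the first two printed numbers agree up to floating-point error, and the third is never smaller. Minimizing lowrank_objective over F and G without constraints is the kind of unconstrained formulation the abstract alludes to, under the assumption that the operator is replaced by a small dense matrix.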

Related papers

Learning to optimize with guarantees: a complete characterization of linearly convergent algorithms [1.4747234049753448]
In high-stakes engineering applications, optimization algorithms must come with provable worst-case guarantees over a mathematically defined class of problems. We describe the class of algorithms that achieve linear convergence for classes of nonsmooth composite optimization problems.
arXiv Detail & Related papers (2025-08-01T16:56:42Z)
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality [52.906438147288256]
We show that our algorithm can identify the globally optimal reward and policy under certain neural network structures. This is the first IRL algorithm with a non-asymptotic convergence guarantee that provably achieves global optimality.
arXiv Detail & Related papers (2025-03-22T21:16:08Z)
A Novel Unified Parametric Assumption for Nonconvex Optimization [53.943470475510196]
Nonconvex optimization is central to machine learning, but the general framework of nonconvex optimization yields weak convergence guarantees that are too pessimistic. We introduce a novel unified assumption for nonconvex optimization algorithms.
arXiv Detail & Related papers (2025-02-17T21:25:31Z)
Explicit and Implicit Graduated Optimization in Deep Neural Networks [0.6906005491572401]
This paper experimentally evaluates the performance of an explicit graduated optimization algorithm with an optimal noise scheduling. In addition, it demonstrates its effectiveness through experiments on image classification tasks with ResNet architectures.
arXiv Detail & Related papers (2024-12-16T07:23:22Z)
Local Linear Convergence of Infeasible Optimization with Orthogonal Constraints [12.414718831844041]
An infeasible retraction-based approach was proposed as an efficient alternative. This paper establishes a novel linear convergence rate for the landing algorithm on smooth nonconvex problems using only a Riemannian PL condition. Numerical experiments demonstrate that the landing algorithm performs on par with the state-of-the-art retraction-based methods with substantially reduced computational overhead.
arXiv Detail & Related papers (2024-12-07T16:02:27Z)
Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks. A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form. This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method. A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
arXiv Detail & Related papers (2023-12-28T23:15:18Z)
Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks. We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks. One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver. This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z)
Maximum Optimality Margin: A Unified Approach for Contextual Linear Programming and Inverse Linear Programming [10.06803520598035]
We develop a new approach to the problem, called maximum optimality margin, which designs the machine learning loss function by the optimality condition of the downstream optimization.
arXiv Detail & Related papers (2023-01-26T17:53:38Z)
Transformer-Based Learned Optimization [37.84626515073609]
We propose a new approach to learned optimization where we represent the computation of an optimizer's update step using a neural network. Our innovation is a new neural network architecture inspired by the classic BFGS algorithm. We demonstrate the advantages of our approach on a benchmark composed of objective functions traditionally used for the evaluation of optimization algorithms.
arXiv Detail & Related papers (2022-12-02T09:47:08Z)
Teaching Networks to Solve Optimization Problems [13.803078209630444]
We propose to replace the iterative solvers altogether with a trainable parametric set function. We show the feasibility of learning such parametric (set) functions to solve various classic optimization problems.
arXiv Detail & Related papers (2022-02-08T19:13:13Z)
Implicit Rate-Constrained Optimization of Non-decomposable Objectives [37.43791617018009]
We consider a family of constrained optimization problems arising in machine learning. Our key idea is to formulate a rate-constrained optimization that expresses the threshold parameter as a function of the model parameters. We show how the resulting optimization problem can be solved using standard gradient based methods.
arXiv Detail & Related papers (2021-07-23T00:04:39Z)
Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded based on the 'complexity' of the fractal structure that underlies its generalization measure. We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems. We propose a novel sample-efficient stochastic gradient estimator named stocBiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning [95.85269649177336]
ZO optimization iteratively performs three major steps: gradient estimation, descent direction, and solution update. We demonstrate promising applications of ZO optimization, such as evaluating and generating explanations from black-box deep learning models, and efficient online sensor management.
arXiv Detail & Related papers (2020-06-11T06:50:35Z)
Learning Cost Functions for Optimal Transport [44.64193016158591]
Inverse optimal transport (OT) refers to the problem of learning the cost function for OT from an observed transport plan or its samples. We derive an unconstrained convex optimization formulation of the inverse OT problem, which can be further augmented by any customizable regularization.
arXiv Detail & Related papers (2020-02-22T07:27:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.