Related papers: RandONet: Shallow-Networks with Random Projections for learning linear and nonlinear operators

URL: http://arxiv.org/abs/2406.05470v1
Date: Sat, 8 Jun 2024 13:20:48 GMT
Title: RandONet: Shallow-Networks with Random Projections for learning linear and nonlinear operators
Authors: Gianluca Fabiani, Ioannis G. Kevrekidis, Constantinos Siettos, Athanasios N. Yannacopoulos
Abstract summary: We present Random Projection-based Operator Networks (RandONets). RandONets are shallow networks with random projections that learn linear and nonlinear operators. We show that, for this particular task, RandONets outperform the "vanilla" DeepONets in terms of both numerical approximation accuracy and computational cost.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep Operator Networks (DeepONets) have revolutionized the domain of scientific machine learning for the solution of the inverse problem for dynamical systems. However, their implementation necessitates optimizing a high-dimensional space of parameters and hyperparameters. This fact, along with the requirement of substantial computational resources, poses a barrier to achieving high numerical accuracy. Here, inspired by DeepONets and to address the above challenges, we present Random Projection-based Operator Networks (RandONets): shallow networks with random projections that learn linear and nonlinear operators. The implementation of RandONets involves: (a) incorporating random bases, thus enabling the use of shallow neural networks with a single hidden layer, where the only unknowns are the output weights of the network's weighted inner product; this dramatically reduces the dimensionality of the parameter space; and, based on this, (b) using established least-squares solvers (e.g., Tikhonov regularization and preconditioned QR decomposition) that offer superior numerical approximation properties compared to other optimization techniques used in deep learning. In this work, we prove the universal approximation accuracy of RandONets for approximating nonlinear operators and demonstrate their efficiency in approximating linear and nonlinear evolution operators (right-hand sides (RHS)) with a focus on PDEs. We show that, for this particular task, RandONets outperform the "vanilla" DeepONets in terms of both numerical approximation accuracy and computational cost.
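The two ingredients named in the abstract, a fixed random hidden layer and a regularized least-squares solve for the output weights, can be illustrated with a minimal sketch. This is not the authors' implementation and it omits the operator-network structure of RandONets; the function names `fit_random_feature_net` and `predict`, the `tanh` activation, the uniform weight range, and the default `n_hidden` and `reg` values are all assumptions chosen for illustration.

```python
import numpy as np

def fit_random_feature_net(X, Y, n_hidden=200, reg=1e-8, seed=0):
    """Fit a single-hidden-layer net whose hidden weights are random and fixed.

    Only the output weights W are unknown; they solve the Tikhonov-regularized
    least-squares problem  min_W ||Phi W - Y||^2 + reg * ||W||^2.
    (Hypothetical helper for illustration, not the RandONet code.)
    """
    rng = np.random.default_rng(seed)
    # Random, untrained hidden-layer parameters (assumed uniform in [-3, 3]).
    A = rng.uniform(-3.0, 3.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-3.0, 3.0, size=n_hidden)
    Phi = np.tanh(X @ A + b)  # random basis evaluated on the training inputs

    # Solve the regularized problem as an augmented ordinary least-squares
    # system, which avoids forming the normal equations explicitly.
    Phi_aug = np.vstack([Phi, np.sqrt(reg) * np.eye(n_hidden)])
    Y_aug = np.vstack([Y, np.zeros((n_hidden, Y.shape[1]))])
    W, *_ = np.linalg.lstsq(Phi_aug, Y_aug, rcond=None)
    return A, b, W

def predict(model, X):
    """Evaluate the fitted network on new inputs X of shape (n_samples, n_in)."""
    A, b, W = model
    return np.tanh(X @ A + b) @ W
```

Here `X` holds one input per row and `Y` one target per row (both 2-D arrays). Because the hidden layer is never trained, fitting reduces to a single linear solve, which is the reduction in parameter-space dimensionality the abstract describes.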

Related papers

From Overfitting to Reliability: Introducing the Hierarchical Approximate Bayesian Neural Network [3.632251954989679]
HABNN is a novel approach that uses a Gaussian-inverse-Wishart distribution as a hyperprior of the network's weights. Results indicate that HABNN not only matches but often outperforms state-of-the-art models.
arXiv Detail & Related papers (2025-12-15T09:08:42Z)
Deep Hierarchical Learning with Nested Subspace Networks [53.71337604556311]
We propose Nested Subspace Networks (NSNs) for large neural networks. NSNs enable a single model to be dynamically and granularly adjusted across a continuous spectrum of compute budgets. We show that NSNs can be surgically applied to pre-trained LLMs and unlock a smooth and predictable compute-performance frontier.
arXiv Detail & Related papers (2025-09-22T15:13:14Z)
Physics-informed neural networks for high-dimensional solutions and snaking bifurcations in nonlinear lattices [0.0]
This paper introduces a framework based on physics-informed neural networks (PINNs) for addressing key challenges in nonlinear lattices. We first employ PINNs to approximate solutions of nonlinear systems arising from lattice models, using the Levenberg-Marquardt algorithm. We then extend the method by coupling PINNs with a continuation approach to compute snaking bifurcation diagrams. For linear stability analysis, we adapt PINNs to compute eigenvectors, introducing output constraints to enforce positivity, in line with Sturm-Liouville theory.
arXiv Detail & Related papers (2025-07-13T20:41:55Z)
Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models [51.85815025140659]
Modern Machine Learning (ML) and Deep Neural Networks (DNNs) often operate on high-dimensional data. In particular, the proportional regime where the data dimension, sample size, and number of model parameters are all large gives rise to novel and sometimes counterintuitive behaviors. This paper extends traditional Random Matrix Theory (RMT) beyond eigenvalue-based analysis of linear models to address the challenges posed by nonlinear ML models.
arXiv Detail & Related papers (2025-06-16T06:54:08Z)
DeepONet Augmented by Randomized Neural Networks for Efficient Operator Learning in PDEs [5.84093922354671]
We propose RaNN-DeepONets, a hybrid architecture designed to balance accuracy and efficiency. RaNN-DeepONets achieves comparable accuracy while reducing computational costs by orders of magnitude. These results highlight the potential of RaNN-DeepONets as an efficient alternative for operator learning in PDE-based systems.
arXiv Detail & Related papers (2025-03-01T03:05:29Z)
Deep Recurrent Stochastic Configuration Networks for Modelling Nonlinear Dynamic Systems [3.8719670789415925]
This paper proposes a novel deep reservoir computing framework, termed deep recurrent stochastic configuration network (DeepRSCN). DeepRSCNs are incrementally constructed, with all reservoir nodes directly linked to the final output. Given a set of training samples, DeepRSCNs can quickly generate learning representations, which consist of random basis functions with cascaded input readout weights.
arXiv Detail & Related papers (2024-10-28T10:33:15Z)
The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are used for programming purposes. In this paper we examine the use of convex neural recovery models. We show that all the stationary non-dimensional objective can be characterized as the standard a global subsampled convex solvers program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z)
Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss with the increase in the number of learning epochs. We show that the threshold on the number of training samples increases with the increase in the network width.
arXiv Detail & Related papers (2023-09-12T13:03:47Z)
Learning in latent spaces improves the predictive accuracy of deep neural operators [0.0]
L-DeepONet is an extension of standard DeepONet, which leverages latent representations of high-dimensional PDE input and output functions identified with suitable autoencoders. We show that L-DeepONet outperforms the standard approach in terms of both accuracy and computational efficiency across diverse time-dependent PDEs.
arXiv Detail & Related papers (2023-04-15T17:13:09Z)
Bayesian Interpolation with Deep Linear Networks [92.1721532941863]
Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory. We show that linear networks make provably optimal predictions at infinite depth. We also show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth.
arXiv Detail & Related papers (2022-12-29T20:57:46Z)
Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization [4.0554893636822]
We introduce a novel approach to deploy large-scale Deep Neural Networks on constrained resources. The method speeds up inference time and aims to reduce memory demand and power consumption.
arXiv Detail & Related papers (2022-12-25T15:40:05Z)
A novel Deep Neural Network architecture for non-linear system identification [78.69776924618505]
We present a novel Deep Neural Network (DNN) architecture for non-linear system identification. Inspired by fading memory systems, we introduce inductive bias (on the architecture) and regularization (on the loss function). This architecture allows for automatic complexity selection based solely on available data.
arXiv Detail & Related papers (2021-06-06T10:06:07Z)
Physics-aware deep neural networks for surrogate modeling of turbulent natural convection [0.0]
We investigate the use of PINNs surrogate modeling for turbulent Rayleigh-Bénard convection flows. We show how it comes to play as a regularization close to the training boundaries, which are zones of poor accuracy for standard PINNs. The predictive accuracy of the surrogate over the entire half a billion DNS coordinates yields errors for all flow variables ranging between 0.3% and 4% in the relative $L_2$ norm.
arXiv Detail & Related papers (2021-03-05T09:48:57Z)
Joint Deep Reinforcement Learning and Unfolding: Beam Selection and Precoding for mmWave Multiuser MIMO with Lens Arrays [54.43962058166702]
Millimeter wave (mmWave) multiuser multiple-input multiple-output (MU-MIMO) systems with discrete lens arrays (DLA) have received great attention. In this work, we investigate the joint design of a beam precoding matrix for mmWave MU-MIMO systems with DLA.
arXiv Detail & Related papers (2021-01-05T03:55:04Z)
Learning to Beamform in Heterogeneous Massive MIMO Networks [48.62625893368218]
Finding the optimal beamformers in massive multiple-input multiple-output (MIMO) networks is a well-known problem. We propose a novel deep-learning-based algorithm to address this problem.
arXiv Detail & Related papers (2020-11-08T12:48:06Z)
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks [20.44438519046223]
We show that wide neural networks satisfy the PL$*$ condition, which explains the (S)GD convergence to a global minimum.
arXiv Detail & Related papers (2020-02-29T17:18:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.