Low-rank adaptive physics-informed HyperDeepONets for solving differential equations
- URL: http://arxiv.org/abs/2507.18346v1
- Date: Thu, 24 Jul 2025 12:19:25 GMT
- Title: Low-rank adaptive physics-informed HyperDeepONets for solving differential equations
- Authors: Etienne Zeudong, Elsa Cardoso-Bihlo, Alex Bihlo
- Abstract summary: HyperDeepONets were introduced in Lee, Cho and Hwang as an alternative architecture for operator learning. PI-LoRA-HyperDeepONets leverage low-rank adaptation (LoRA) to reduce complexity by decomposing the hypernetwork's output layer weight matrix into two smaller low-rank matrices. We show that PI-LoRA-HyperDeepONets achieve up to 70% reduction in parameters and consistently outperform regular HyperDeepONets in terms of predictive accuracy and generalization.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: HyperDeepONets were introduced in Lee, Cho and Hwang [ICLR, 2023] as an alternative architecture for operator learning, in which a hypernetwork generates the weights for the trunk net of a DeepONet. While this improves expressivity, it incurs high memory and computational costs due to the large number of output parameters required. In this work we introduce, in the physics-informed machine learning setting, a variation, PI-LoRA-HyperDeepONets, which leverage low-rank adaptation (LoRA) to reduce complexity by decomposing the hypernetwork's output layer weight matrix into two smaller low-rank matrices. This reduces the number of trainable parameters while introducing an extra regularization of the trunk networks' weights. Through extensive experiments on both ordinary and partial differential equations we show that PI-LoRA-HyperDeepONets achieve up to 70% reduction in parameters and consistently outperform regular HyperDeepONets in terms of predictive accuracy and generalization.
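For intuition: if the hypernetwork's last hidden layer has width h and the trunk net has P parameters, a dense output layer requires h·P weights, whereas a rank-r factorization requires only r·(h + P). The following is a minimal sketch of this construction in PyTorch; all layer sizes, names and initializations are illustrative assumptions rather than the authors' exact implementation, and the physics-informed (PDE residual) loss is omitted.

```python
import torch
import torch.nn as nn

# Minimal sketch of the PI-LoRA-HyperDeepONet idea (assumed sizes and names,
# not the authors' implementation). A hypernetwork maps sensor values of the
# input function u to all weights of the trunk net; its output layer is
# factored into two low-rank matrices A and B.

class LoRAHyperDeepONet(nn.Module):
    def __init__(self, n_sensors=100, hidden=64, trunk_layers=(1, 32, 32, 1), rank=8):
        super().__init__()
        self.trunk_layers = trunk_layers
        # Total number of weights and biases of the target trunk network.
        self.n_trunk = sum(trunk_layers[i] * trunk_layers[i + 1] + trunk_layers[i + 1]
                           for i in range(len(trunk_layers) - 1))
        # Hypernetwork body: sensor values of u -> hidden code.
        self.body = nn.Sequential(nn.Linear(n_sensors, hidden), nn.Tanh(),
                                  nn.Linear(hidden, hidden), nn.Tanh())
        # LoRA-style output layer: the dense (hidden x n_trunk) matrix is replaced
        # by A (hidden x r) times B (r x n_trunk), i.e. r*(hidden + n_trunk)
        # parameters instead of hidden*n_trunk.
        self.A = nn.Parameter(torch.randn(hidden, rank) / hidden ** 0.5)
        self.B = nn.Parameter(torch.zeros(rank, self.n_trunk))
        self.bias = nn.Parameter(torch.zeros(self.n_trunk))

    def trunk(self, theta, y):
        # Evaluate the trunk net at query points y with generated parameters theta.
        out, idx = y, 0
        for i in range(len(self.trunk_layers) - 1):
            n_in, n_out = self.trunk_layers[i], self.trunk_layers[i + 1]
            W = theta[idx:idx + n_in * n_out].view(n_in, n_out); idx += n_in * n_out
            b = theta[idx:idx + n_out]; idx += n_out
            out = out @ W + b
            if i < len(self.trunk_layers) - 2:
                out = torch.tanh(out)
        return out

    def forward(self, u_sensors, y):
        # u_sensors: (n_sensors,) samples of the input function
        # y: (n_query, 1) query coordinates
        theta = self.body(u_sensors) @ self.A @ self.B + self.bias
        return self.trunk(theta, y)
```

In a physics-informed setting one would differentiate the returned trunk output with respect to the query coordinates y (e.g. via torch.autograd.grad) to form the PDE residual loss, training the hypernetwork body and the low-rank factors A and B jointly.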
Related papers
- Search for Efficient Large Language Models [52.98684997131108]
Large Language Models (LLMs) have long held sway in the realms of artificial intelligence research.
Weight pruning, quantization, and distillation have been embraced to compress LLMs, targeting memory reduction and inference acceleration.
Most model compression techniques concentrate on weight optimization, overlooking the exploration of optimal architectures.
arXiv Detail & Related papers (2024-09-25T21:32:12Z) - RandONet: Shallow-Networks with Random Projections for learning linear and nonlinear operators [0.0]
We present Random Projection-based Operator Networks (RandONets).
RandONets are shallow networks with random projections that learn linear and nonlinear operators.
We show that, for this particular task, RandONets outperform the "vanilla" DeepONets both in terms of numerical approximation accuracy and computational cost.
arXiv Detail & Related papers (2024-06-08T13:20:48Z) - Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation [12.07880147193174]
We show that by leveraging the inherent low-dimensional structures of data and compressible dynamics within the model parameters, we can reap the benefits of overparameterization without the computational burdens; a generic LoRA sketch illustrating the low-rank update appears after this list.
We demonstrate the effectiveness of this approach for deep low-rank matrix completion as well as fine-tuning language models.
arXiv Detail & Related papers (2024-06-06T14:29:49Z) - "Lossless" Compression of Deep Neural Networks: A High-dimensional
Neural Tangent Kernel Approach [49.744093838327615]
We provide a novel compression approach to wide and fully-connected deep neural nets.
Experiments on both synthetic and real-world data are conducted to support the advantages of the proposed compression scheme.
arXiv Detail & Related papers (2024-03-01T03:46:28Z) - HyperDeepONet: learning operator with complex target function space
using the limited resources via hypernetwork [14.93012615797081]
This study proposes HyperDeepONet, which uses the expressive power of the hypernetwork to enable the learning of a complex operator with a smaller set of parameters.
We analyze the complexity of DeepONet and conclude that HyperDeepONet needs relatively lower complexity to obtain the desired accuracy for operator learning.
arXiv Detail & Related papers (2023-12-26T08:28:46Z) - Efficient Compression of Overparameterized Deep Models through
Low-Dimensional Learning Dynamics [10.673414267895355]
We present a novel approach for compressing overparameterized models.
Our algorithm improves the training efficiency by more than 2x, without compromising generalization.
arXiv Detail & Related papers (2023-11-08T23:57:03Z) - Multi-Grid Tensorized Fourier Neural Operator for High-Resolution PDEs [93.82811501035569]
We introduce a new data efficient and highly parallelizable operator learning approach with reduced memory requirement and better generalization.
MG-TFNO scales to large resolutions by leveraging local and global structures of full-scale, real-world phenomena.
We demonstrate superior performance on the turbulent Navier-Stokes equations where we achieve less than half the error with over 150x compression.
arXiv Detail & Related papers (2023-09-29T20:18:52Z) - Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and
Scaling Limit [48.291961660957384]
We provide experiments demonstrating that residual architectures including convolutional ResNets and Vision Transformers exhibit transfer of optimal hyperparameters across width and depth.
Using recent developments in the dynamical mean field theory (DMFT) description of neural network learning dynamics, we show that this parameterization of ResNets admits a well-defined feature learning joint infinite-width and infinite-depth limit.
arXiv Detail & Related papers (2023-09-28T17:20:50Z) - HyperLoRA for PDEs [7.898728380447954]
Physics-informed neural networks (PINNs) have been widely used to develop neural surrogates for solutions of Partial Differential Equations.
A drawback of PINNs is that they have to be retrained with every change in initial-boundary conditions and PDE coefficients.
The Hypernetwork, a model-based meta-learning technique, takes in a parameterized task embedding as input and predicts the weights of the PINN as output.
arXiv Detail & Related papers (2023-08-18T04:29:48Z) - Towards Size-Independent Generalization Bounds for Deep Operator Nets [0.28123958518740544]
This work aims to advance the theory of measuring out-of-sample error while training DeepONets. For a class of DeepONets, we prove a bound on their Rademacher complexity which does not explicitly scale with the width of the nets involved. We show how the Huber loss can be chosen so that for these DeepONet classes generalization error bounds can be obtained that have no explicit dependence on the size of the nets.
arXiv Detail & Related papers (2022-05-23T14:45:34Z) - Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution
Networks [82.18396309806577]
We propose a novel activation quantizer, referred to as Dynamic Dual Trainable Bounds (DDTB).
Our DDTB exhibits significant performance improvements in ultra-low precision.
For example, our DDTB achieves a 0.70dB PSNR increase on Urban100 benchmark when quantizing EDSR to 2-bit and scaling up output images to x4.
arXiv Detail & Related papers (2022-03-08T04:26:18Z) - Hyperparameter Tuning is All You Need for LISTA [92.7008234085887]
Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) introduces the concept of unrolling an iterative algorithm and training it like a neural network.
We show that adding momentum to intermediate variables in the LISTA network achieves a better convergence rate.
We call this new ultra-lightweight network HyperLISTA.
arXiv Detail & Related papers (2021-10-29T16:35:38Z)
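Several of the papers above, like the parent paper, build on low-rank adaptation. As referenced from the entry on deep overparameterized low-rank learning, here is a minimal, generic LoRA sketch for a single linear layer; the rank, scaling and initialization are illustrative assumptions and do not reproduce any specific paper's setup.

```python
import torch
import torch.nn as nn

# Generic low-rank adaptation (LoRA) of one linear layer: the frozen pretrained
# weight W0 is augmented by a trainable low-rank correction B @ A,
# y = x W0^T + b + (alpha/r) * x A^T B^T. Illustrative only.

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # freeze the pretrained weights
            p.requires_grad = False
        out_features, in_features = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))  # B A starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

# Usage: wrap an existing layer and train only A and B.
layer = LoRALinear(nn.Linear(512, 512), rank=8)
out = layer(torch.randn(4, 512))
```

Only A and B receive gradients, so the number of trainable parameters drops from out_features·in_features to r·(out_features + in_features).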
This list is automatically generated from the titles and abstracts of the papers on this site.