Separable Operator Networks
- URL: http://arxiv.org/abs/2407.11253v3
- Date: Thu, 05 Dec 2024 22:12:02 GMT
- Title: Separable Operator Networks
- Authors: Xinling Yu, Sean Hooten, Ziyue Liu, Yequan Zhao, Marco Fiorentino, Thomas Van Vaerenbergh, Zheng Zhang
- Abstract summary: Operator learning has become a powerful tool in machine learning for modeling complex physical systems governed by partial differential equations (PDEs).
We introduce Separable Operator Networks (SepONet), a novel framework that significantly enhances the efficiency of physics-informed operator learning.
SepONet uses independent trunk networks to learn basis functions separately for different coordinate axes, enabling faster and more memory-efficient training.
- Score: 4.688862638563124
- License:
- Abstract: Operator learning has become a powerful tool in machine learning for modeling complex physical systems governed by partial differential equations (PDEs). Although Deep Operator Networks (DeepONet) show promise, they require extensive data acquisition. Physics-informed DeepONets (PI-DeepONet) mitigate data scarcity but suffer from inefficient training processes. We introduce Separable Operator Networks (SepONet), a novel framework that significantly enhances the efficiency of physics-informed operator learning. SepONet uses independent trunk networks to learn basis functions separately for different coordinate axes, enabling faster and more memory-efficient training via forward-mode automatic differentiation. We provide a universal approximation theorem for SepONet proving the existence of a separable approximation to any nonlinear continuous operator. Then, we comprehensively benchmark its representational capacity and computational performance against PI-DeepONet. Our results demonstrate SepONet's superior performance across various nonlinear and inseparable PDEs, with SepONet's advantages increasing with problem complexity, dimension, and scale. For 1D time-dependent PDEs, SepONet achieves up to 112x faster training and 82x reduction in GPU memory usage compared to PI-DeepONet, while maintaining comparable accuracy. For the 2D time-dependent nonlinear diffusion equation, SepONet efficiently handles the complexity, achieving a 6.44% mean relative $\ell_{2}$ test error, while PI-DeepONet fails due to memory constraints. This work paves the way for extreme-scale learning of continuous mappings between infinite-dimensional function spaces. Open source code is available at https://github.com/HewlettPackard/separable-operator-networks.
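The separable structure described in the abstract can be illustrated with a minimal sketch. It assumes the prediction takes the form sum_k b_k(u) * T_k(t) * X_k(x), with one trunk per coordinate axis; that form, the rank of 4, the three sensor values, and the random untrained layers are all illustrative assumptions, not details taken from the paper or its repository. Training, the PDE residual loss, and forward-mode automatic differentiation are omitted.

```python
import math
import random

RANK = 4  # number of basis functions per axis (hypothetical choice)

def make_layer(n_in, n_out, rng):
    """Random dense layer (weights, biases) standing in for a trained network."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

def apply_layer(layer, inputs):
    """Apply one tanh-activated dense layer to a list of floats."""
    weights, biases = layer
    return [math.tanh(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

rng = random.Random(0)
branch = make_layer(3, RANK, rng)    # encodes the input function at 3 sensors
trunk_t = make_layer(1, RANK, rng)   # trunk that sees only the time axis
trunk_x = make_layer(1, RANK, rng)   # trunk that sees only the space axis

def predict_grid(u_sensors, ts, xs):
    """Evaluate sum_k b_k(u) * T_k(t) * X_k(x) on the full ts-by-xs grid."""
    coeffs = apply_layer(branch, u_sensors)
    # Each trunk runs once per 1D coordinate: len(ts) + len(xs) evaluations,
    # rather than len(ts) * len(xs) for a trunk that takes (t, x) jointly.
    basis_t = [apply_layer(trunk_t, [t]) for t in ts]
    basis_x = [apply_layer(trunk_x, [x]) for x in xs]
    return [[sum(c * bt[k] * bx[k] for k, c in enumerate(coeffs))
             for bx in basis_x] for bt in basis_t]

u = [0.2, -0.5, 0.9]
ts = [0.0, 0.5, 1.0]
xs = [0.0, 0.25, 0.5, 0.75]
grid = predict_grid(u, ts, xs)
print(len(grid), len(grid[0]))  # prints: 3 4
```

In this sketch a 3-by-4 grid needs 7 trunk evaluations instead of 12, and the gap widens as the grid is refined or more axes are added.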
Related papers
- Physics-Informed Latent Neural Operator for Real-time Predictions of Complex Physical Systems [0.0]
Deep operator network (DeepONet) has shown great promise as a surrogate model for systems governed by partial differential equations (PDEs).
This work introduces PI-Latent-NO, a physics-informed latent operator learning framework that overcomes limitations.
arXiv Detail & Related papers (2025-01-14T20:38:30Z)
- FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs).
We show that our system and method can achieve 1.45 - 9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z)
- Separable DeepONet: Breaking the Curse of Dimensionality in Physics-Informed Machine Learning [0.0]
In the absence of labeled datasets, we utilize the PDE residual loss to learn the physical system, an approach known as physics-informed DeepONet.
This method faces significant computational challenges, primarily due to the curse of dimensionality, as the computational cost increases exponentially with finer discretization.
We introduce the Separable DeepONet framework to address these challenges and improve scalability for high-dimensional PDEs.
arXiv Detail & Related papers (2024-07-21T16:33:56Z)
- Zero Coordinate Shift: Whetted Automatic Differentiation for Physics-informed Operator Learning [1.024113475677323]
We present a novel and lightweight algorithm to conduct automatic differentiation (AD) for physics-informed operator learning.
Instead of making all sampled coordinates leaf variables, ZCS introduces only one scalar-valued leaf variable for each spatial or temporal dimension.
It has led to an outstanding performance leap by avoiding the duplication of the computational graph along the dimension of functions.
arXiv Detail & Related papers (2023-11-01T21:28:24Z)
- Training Deep Surrogate Models with Large Scale Online Learning [48.7576911714538]
Deep learning algorithms have emerged as a viable alternative for obtaining fast solutions for PDEs.
Models are usually trained on synthetic data generated by solvers, stored on disk and read back for training.
This paper proposes an open-source online training framework for deep surrogate models.
arXiv Detail & Related papers (2023-06-28T12:02:27Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Solving High-Dimensional PDEs with Latent Spectral Models [74.1011309005488]
We present Latent Spectral Models (LSM) toward an efficient and precise solver for high-dimensional PDEs.
Inspired by classical spectral methods in numerical analysis, we design a neural spectral block to solve PDEs in the latent space.
LSM achieves consistent state-of-the-art and yields a relative gain of 11.5% averaged on seven benchmarks.
arXiv Detail & Related papers (2023-01-30T04:58:40Z)
- Improved architectures and training algorithms for deep operator networks [0.0]
Operator learning techniques have emerged as a powerful tool for learning maps between infinite-dimensional Banach spaces.
We analyze the training dynamics of deep operator networks (DeepONets) through the lens of Neural Tangent Kernel (NTK) theory.
arXiv Detail & Related papers (2021-10-04T18:34:41Z)
- Physics-Based Deep Learning for Fiber-Optic Communication Systems [10.630021520220653]
We propose a new machine-learning approach for fiber-optic communication systems governed by the nonlinear Schrödinger equation (NLSE).
Our main observation is that the popular split-step method (SSM) for numerically solving the NLSE has essentially the same functional form as a deep multi-layer neural network.
We exploit this connection by parameterizing the SSM and viewing the linear steps as general linear functions, similar to the weight matrices in a neural network.
arXiv Detail & Related papers (2020-10-27T12:55:23Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds in theory.
Our experiments on several datasets show the effectiveness of our method and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of Partitioned Edge Learning [73.82875010696849]
Machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models.
This paper focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation.
arXiv Detail & Related papers (2020-03-10T05:52:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.