Ensemble and Mixture-of-Experts DeepONets For Operator Learning
- URL: http://arxiv.org/abs/2405.11907v5
- Date: Sun, 16 Mar 2025 03:43:14 GMT
- Title: Ensemble and Mixture-of-Experts DeepONets For Operator Learning
- Authors: Ramansh Sharma, Varun Shankar
- Abstract summary: We present a novel deep operator network (DeepONet) architecture for operator learning. The ensemble DeepONet allows for enriching the trunk network of a single DeepONet with multiple distinct trunk networks. We also present a spatial mixture-of-experts (MoE) DeepONet trunk network architecture.
- Score: 4.604003661048267
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a novel deep operator network (DeepONet) architecture for operator learning, the ensemble DeepONet, that allows for enriching the trunk network of a single DeepONet with multiple distinct trunk networks. This trunk enrichment allows for greater expressivity and generalization capabilities over a range of operator learning problems. We also present a spatial mixture-of-experts (MoE) DeepONet trunk network architecture that utilizes a partition-of-unity (PoU) approximation to promote spatial locality and model sparsity in the operator learning problem. We first prove that both the ensemble and PoU-MoE DeepONets are universal approximators. We then demonstrate that ensemble DeepONets containing a trunk ensemble of a standard trunk, the PoU-MoE trunk, and/or a proper orthogonal decomposition (POD) trunk can achieve 2-4x lower relative $\ell_2$ errors than standard DeepONets and POD-DeepONets on both standard and challenging new operator learning problems involving partial differential equations (PDEs) in two and three dimensions. Our new PoU-MoE formulation provides a natural way to incorporate spatial locality and model sparsity into any neural network architecture, while our new ensemble DeepONet provides a powerful and general framework for incorporating basis enrichment in scientific machine learning architectures for operator learning.
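The trunk enrichment described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the ensemble is formed by concatenating the outputs of a standard trunk and a PoU-MoE trunk, with the branch output widened to match, and that the PoU weights are normalized compactly supported bumps. All names (`mlp`, `init_mlp`, `pou_weights`, `pou_moe_trunk`, `ensemble_deeponet`), the layer sizes, and the patch layout are hypothetical choices made for illustration.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random (W, b) pairs for a small fully connected network (illustrative init)."""
    return [(rng.normal(size=(m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Tanh MLP with a linear last layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def pou_weights(y, centers, radius):
    """Assumed PoU weights: bumps that vanish outside each patch, normalized to sum to one."""
    d = np.linalg.norm(y[:, None, :] - centers[None, :, :], axis=-1) / radius
    w = np.maximum(1.0 - d, 0.0) ** 2
    return w / w.sum(axis=1, keepdims=True)  # requires every point to lie in some patch

def pou_moe_trunk(experts, y, centers, radius):
    """Blend per-patch expert trunks with the PoU weights."""
    w = pou_weights(y, centers, radius)                    # (n_pts, n_experts)
    outs = np.stack([mlp(p, y) for p in experts], axis=1)  # (n_pts, n_experts, p)
    return np.einsum("nk,nkp->np", w, outs)                # (n_pts, p)

def ensemble_deeponet(branch, trunk, experts, u, y, centers, radius):
    """Concatenate trunk outputs (assumed enrichment) and contract with the branch output."""
    t = np.concatenate([mlp(trunk, y),
                        pou_moe_trunk(experts, y, centers, radius)], axis=1)
    b = mlp(branch, u)                                     # (n_funcs, 2p)
    return b @ t.T                                         # (n_funcs, n_pts)

rng = np.random.default_rng(0)
m, p = 16, 8                                   # sensor count and per-trunk basis size
u = rng.normal(size=(4, m))                    # 4 input functions sampled at m sensors
y = np.linspace(0.0, 1.0, 50)[:, None]         # 50 query points in [0, 1]
centers, radius = np.array([[0.25], [0.75]]), 0.6  # two overlapping patches covering [0, 1]
branch = init_mlp(rng, [m, 32, 2 * p])
trunk = init_mlp(rng, [1, 32, p])
experts = [init_mlp(rng, [1, 32, p]) for _ in centers]
G = ensemble_deeponet(branch, trunk, experts, u, y, centers, radius)
print(G.shape)
```

Because each weight vanishes outside its patch, a query point only draws on the experts whose patches contain it, which is one way to read the spatial locality and sparsity mentioned in the abstract. A POD trunk, if used, would be another block in the same concatenation.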
Related papers
- Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Object Detection [10.639484582036088]
We propose Dynamic-DINO, which extends Grounding DINO 1.5 Edge from a dense model to a dynamic inference framework via an efficient MoE-Tuning strategy. We also design a decomposition mechanism to decompose the Feed-Forward Network (FFN) of the base model into multiple smaller expert networks. Experiments show that, pretrained with merely 1.56M open-source data, Dynamic-DINO outperforms Grounding DINO 1.5 Edge, pretrained on the private Grounding20M dataset.
arXiv Detail & Related papers (2025-07-23T11:51:06Z) - Nonlinear model reduction for operator learning [1.0364028373854508]
We propose an efficient framework that combines neural networks with kernel principal component analysis (KPCA) for operator learning.
Our results demonstrate the superior performance of KPCA-DeepONet over POD-DeepONet.
arXiv Detail & Related papers (2024-03-27T16:24:26Z) - Depth Separation in Norm-Bounded Infinite-Width Neural Networks [55.21840159087921]
We study depth separation in infinite-width neural networks, where complexity is controlled by the overall squared $\ell_2$-norm of the weights.
We show that there are functions that are learnable with sample complexity polynomial in the input dimension by norm-controlled depth-3 ReLU networks, yet are not learnable with sub-exponential sample complexity by norm-controlled depth-2 ReLU networks.
arXiv Detail & Related papers (2024-02-13T21:26:38Z) - On the non-universality of deep learning: quantifying the cost of symmetry [24.86176236641865]
We prove computational limitations for learning with neural networks trained by noisy gradient descent (GD).
We characterize functions that fully-connected networks can weak-learn on the binary hypercube and unit sphere.
Our techniques extend to stochastic gradient descent (SGD), for which we show nontrivial results for learning with fully-connected networks.
arXiv Detail & Related papers (2022-08-05T11:54:52Z) - Multifidelity deep neural operators for efficient learning of partial differential equations with application to fast inverse design of nanoscale heat transport [2.512625172084287]
We develop a multifidelity neural operator based on a deep operator network (DeepONet).
A multifidelity DeepONet significantly reduces the required amount of high-fidelity data and achieves one order of magnitude smaller error when using the same amount of high-fidelity data.
We apply a multifidelity DeepONet to learn the phonon Boltzmann transport equation (BTE), a framework to compute nanoscale heat transport.
arXiv Detail & Related papers (2022-04-14T01:01:24Z) - Enhanced DeepONet for Modeling Partial Differential Operators Considering Multiple Input Functions [5.819397109258169]
A deep network operator (DeepONet) was proposed to model general non-linear continuous operators for partial differential equations (PDEs).
Existing DeepONet can only accept one input function, which limits its application.
We propose new Enhanced DeepONet or EDeepONet high-level neural network structure, in which two input functions are represented by two branch sub-networks.
Our numerical results on modeling two partial differential equation examples show that the proposed enhanced DeepONet is about 7X-17X, or about one order of magnitude, more accurate than the fully connected neural network.
arXiv Detail & Related papers (2022-02-17T23:58:23Z) - S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation [63.58891781246175]
Humans can infer the 3D geometry of a scene from a sketch instead of a realistic image, which indicates that the spatial structure plays a fundamental role in understanding the depth of scenes.
We are the first to explore the learning of a depth-specific structural representation, which captures the essential feature for depth estimation and ignores irrelevant style information.
Our S2R-DepthNet can be well generalized to unseen real-world data directly even though it is only trained on synthetic data.
arXiv Detail & Related papers (2021-04-02T03:55:41Z) - Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z) - Recursive Multi-model Complementary Deep Fusion for Robust Salient Object Detection via Parallel Sub Networks [62.26677215668959]
Fully convolutional networks have shown outstanding performance in the salient object detection (SOD) field.
This paper proposes a "wider" network architecture which consists of parallel sub networks with totally different network architectures.
Experiments on several famous benchmarks clearly demonstrate the superior performance, good generalization, and powerful learning ability of the proposed wider framework.
arXiv Detail & Related papers (2020-08-07T10:39:11Z) - Deep Multi-Task Learning for Cooperative NOMA: System Design and Principles [52.79089414630366]
We develop a novel deep cooperative NOMA scheme, drawing upon the recent advances in deep learning (DL).
We develop a novel hybrid-cascaded deep neural network (DNN) architecture such that the entire system can be optimized in a holistic manner.
arXiv Detail & Related papers (2020-07-27T12:38:37Z) - Multiple Expert Brainstorming for Domain Adaptive Person Re-identification [140.3998019639158]
We propose a multiple expert brainstorming network (MEB-Net) for domain adaptive person re-ID.
MEB-Net adopts a mutual learning strategy, where multiple networks with different architectures are pre-trained within a source domain.
Experiments on large-scale datasets demonstrate the superior performance of MEB-Net over state-of-the-art methods.
arXiv Detail & Related papers (2020-07-03T08:16:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.