Related papers: Self-Composing Neural Operators with Depth and Accuracy Scaling via Adaptive Train-and-Unroll Approach

URL: http://arxiv.org/abs/2508.20650v1
Date: Thu, 28 Aug 2025 10:53:00 GMT
Title: Self-Composing Neural Operators with Depth and Accuracy Scaling via Adaptive Train-and-Unroll Approach
Authors: Juncai He, Xinliang Liu, Jinchao Xu
Abstract summary: We propose a novel framework to enhance the efficiency and accuracy of neural operators through self-composition. Inspired by iterative methods in solving numerical partial differential equations (PDEs), we design a specific neural operator by repeatedly applying a single neural operator block. We introduce an adaptive train-and-unroll approach, where the depth of the neural operator is gradually increased during training.
Score: 12.718377513965912
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this work, we propose a novel framework to enhance the efficiency and accuracy of neural operators through self-composition, offering both theoretical guarantees and practical benefits. Inspired by iterative methods in solving numerical partial differential equations (PDEs), we design a specific neural operator by repeatedly applying a single neural operator block. In this way, we progressively deepen the model without explicitly adding new blocks, improving the model's capacity. To train these models efficiently, we introduce an adaptive train-and-unroll approach, where the depth of the neural operator is gradually increased during training. This approach reveals an accuracy scaling law with model depth and offers significant computational savings through our adaptive training strategy. Our architecture achieves state-of-the-art (SOTA) performance on standard benchmarks. We further demonstrate its efficacy on a challenging high-frequency ultrasound computed tomography (USCT) problem, where a multigrid-inspired backbone enables superior performance in resolving complex wave phenomena. The proposed framework provides a computationally tractable, accurate, and scalable solution for large-scale data-driven scientific machine learning applications.
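
Illustrative sketch (not from the paper): the idea of reusing one block at every depth, and of training at a small depth before unrolling further, can be shown on a toy problem. The code below is a minimal sketch under loud assumptions: the "block" is a damped Richardson-style update with a single scalar weight w, the operator is a hypothetical 3x3 diagonal matrix, the depth schedule (1, 2, 4) is fixed rather than adaptive, and gradients are taken by finite differences. The names block, unroll, loss, and train_and_unroll are invented for this example and do not correspond to the authors' code.

```python
# Toy illustration only; not the paper's architecture or training code.
# Hypothetical names: block, unroll, loss, train_and_unroll.

A = [1.0, 2.0, 4.0]  # assumed diagonal operator (its eigenvalues)
F = [1.0, 2.0, 4.0]  # right-hand side chosen so the exact solution is all ones


def block(u, w):
    """One shared block: damped Richardson-style update u + w * (F - A u)."""
    return [ui + w * (fi - ai * ui) for ui, fi, ai in zip(u, F, A)]


def unroll(w, depth):
    """Self-composition: apply the same block `depth` times, starting from zero."""
    u = [0.0] * len(A)
    for _ in range(depth):
        u = block(u, w)
    return u


def loss(w, depth):
    """Mean squared error of the unrolled output against the exact solution."""
    return sum((ui - 1.0) ** 2 for ui in unroll(w, depth)) / len(A)


def train_and_unroll(depths=(1, 2, 4), steps=200, lr=0.05, eps=1e-6):
    """Train the single weight at each depth, warm-starting from the previous one."""
    w = 0.1
    history = []
    for depth in depths:
        for _ in range(steps):
            # Central finite difference stands in for backpropagation.
            grad = (loss(w + eps, depth) - loss(w - eps, depth)) / (2 * eps)
            w -= lr * grad
        history.append((depth, loss(w, depth)))
    return w, history


if __name__ == "__main__":
    w, history = train_and_unroll()
    for depth, err in history:
        print(f"depth={depth} loss={err:.6f}")
```

In this toy setting no single w can cancel the error for all three eigenvalues at once, so the loss shrinks as the same block is unrolled to greater depth. This loosely mirrors the depth and accuracy relationship the abstract describes, without reproducing the paper's scaling law or its adaptive schedule.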

Related papers

CAMP-HiVe: Cyclic Pair Merging based Efficient DNN Pruning with Hessian-Vector Approximation for Resource-Constrained Systems [3.343542849202802]
We introduce CAMP-HiVe, a cyclic pair merging-based pruning with Hessian Vector approximation. Our experimental results demonstrate that our proposed method achieves significant reductions in computational requirements. It outperforms the existing state-of-the-art neural pruning methods.
arXiv Detail & Related papers (2025-11-09T07:58:36Z)
Predictive Coding-based Deep Neural Network Fine-tuning for Computationally Efficient Domain Adaptation [5.013248430919224]
We propose a hybrid training methodology that enables efficient on-device domain adaptation. The method begins with a deep neural network trained offline using Backpropagation to achieve high initial performance. Predictive Coding is employed for online adaptation, allowing the model to recover accuracy lost due to shifts in the input data distribution.
arXiv Detail & Related papers (2025-09-24T16:03:27Z)
PMNO: A novel physics guided multi-step neural operator predictor for partial differential equations [23.04840527974364]
We propose a novel physics guided multi-step neural operator (PMNO) architecture to address challenges in long-horizon prediction of complex physical systems. The PMNO framework replaces the single-step input with multi-step historical data in the forward pass and introduces an implicit time-stepping scheme during backpropagation. We demonstrate the superior predictive performance of PMNO predictor across a diverse range of physical systems.
arXiv Detail & Related papers (2025-06-02T12:33:50Z)
TensorGRaD: Tensor Gradient Robust Decomposition for Memory-Efficient Neural Operator Training [91.8932638236073]
We introduce TensorGRaD, a novel method that directly addresses the memory challenges associated with large-structured weights. We show that sparseGRaD reduces total memory usage by over 50% while maintaining and sometimes even improving accuracy.
arXiv Detail & Related papers (2025-01-04T20:51:51Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Efficient Training of Deep Neural Operator Networks via Randomized Sampling [0.0]
We introduce a random sampling technique to be adopted in the training of DeepONet. We demonstrate substantial reductions in training time while achieving comparable or lower overall test errors relative to the traditional training approach. Our results indicate that incorporating randomization in the trunk network inputs during training enhances the efficiency and robustness of DeepONet.
arXiv Detail & Related papers (2024-09-20T07:18:31Z)
Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling [9.20186865054847]
Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems. This work considers AD in network flows using incomplete measurements. We propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective. Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics.
arXiv Detail & Related papers (2024-09-17T19:59:57Z)
Self-STORM: Deep Unrolled Self-Supervised Learning for Super-Resolution Microscopy [55.2480439325792]
We introduce deep unrolled self-supervised learning, which alleviates the need for such data by training a sequence-specific, model-based autoencoder. Our proposed method exceeds the performance of its supervised counterparts.
arXiv Detail & Related papers (2024-03-25T17:40:32Z)
Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
Layer-wise Feedback Propagation (LFP) is a novel training principle for neural network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions. Our method then implements a greedy approach, reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures. We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.