Related papers: Axial Neural Networks for Dimension-Free Foundation Models

URL: http://arxiv.org/abs/2510.13665v2
Date: Fri, 24 Oct 2025 12:06:35 GMT
Title: Axial Neural Networks for Dimension-Free Foundation Models
Authors: Hyunsu Kim, Jonggeon Park, Joan Bruna, Hongseok Yang, Juho Lee
Abstract summary: Training foundation models on physics data poses a unique challenge due to varying dimensionalities across different systems. Traditional approaches either fix a maximum dimension or employ separate encoders for different dimensionalities, resulting in inefficiencies. We propose a dimension-agnostic neural network architecture, inspired by parameter-sharing structures such as Deep Sets and Graph Neural Networks.
Score: 48.074109255029896
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The advent of foundation models in AI has significantly advanced general-purpose learning, enabling remarkable capabilities in zero-shot inference and in-context learning. However, training such models on physics data, including solutions to partial differential equations (PDEs), poses a unique challenge due to varying dimensionalities across different systems. Traditional approaches either fix a maximum dimension or employ separate encoders for different dimensionalities, resulting in inefficiencies. To address this, we propose a dimension-agnostic neural network architecture, the Axial Neural Network (XNN), inspired by parameter-sharing structures such as Deep Sets and Graph Neural Networks. XNN generalizes across varying tensor dimensions while maintaining computational efficiency. We convert existing PDE foundation models into axial neural networks and evaluate their performance across three training scenarios: training from scratch, pretraining on multiple PDEs, and fine-tuning on a single PDE. Our experiments show that XNNs perform competitively with original models and exhibit superior generalization to unseen dimensions, highlighting the importance of multidimensional pretraining for foundation models.
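Illustration (not from the paper): the abstract does not spell out the XNN layers, so the following is only a minimal sketch of the general idea it names, Deep Sets-style parameter sharing applied across tensor axes. The function `axial_shared_layer`, its weights `w_self` and `w_axis`, and the mean-pooling choice are all assumptions made for this example. Because one weight matrix is reused for every axis, the same parameters accept 1D, 2D, or 3D inputs.

```python
import numpy as np


def axial_shared_layer(x, w_self, w_axis, bias):
    """Hypothetical dimension-agnostic linear layer (an assumption, not the paper's XNN).

    x has shape (n_1, ..., n_d, c_in) for any d >= 1; the output has
    shape (n_1, ..., n_d, c_out).
    """
    # Pointwise term: the same linear map at every grid location.
    out = x @ w_self + bias
    # Axis terms: pool along each non-channel axis in turn and reuse the
    # SAME weight matrix w_axis, so the parameter count is independent of d.
    for axis in range(x.ndim - 1):
        pooled = x.mean(axis=axis, keepdims=True)  # size 1 along `axis`
        out = out + pooled @ w_axis                # broadcasts back over `axis`
    return out


rng = np.random.default_rng(0)
c_in, c_out = 3, 4
w_self = rng.normal(size=(c_in, c_out))
w_axis = rng.normal(size=(c_in, c_out))
bias = np.zeros(c_out)

# One parameter set applied to inputs of three different dimensionalities.
x1 = rng.normal(size=(16, c_in))
x2 = rng.normal(size=(8, 8, c_in))
x3 = rng.normal(size=(4, 4, 4, c_in))
y1 = axial_shared_layer(x1, w_self, w_axis, bias)
y2 = axial_shared_layer(x2, w_self, w_axis, bias)
y3 = axial_shared_layer(x3, w_self, w_axis, bias)
print(y1.shape, y2.shape, y3.shape)
```

Since the axis terms are summed with shared weights, this sketch is also equivariant to swapping the grid axes: transposing the input transposes the output. Whether XNN has the same property is not stated in the abstract.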

Related papers

Learning Data-Efficient and Generalizable Neural Operators via Fundamental Physics Knowledge [8.269904705399474]
Recent advances in machine learning have enabled neural operators to serve as powerful surrogates for modeling the evolution of physical systems. We propose a multiphysics training framework that jointly learns from both the original PDEs and their simplified basic forms. Our framework enhances data efficiency, reduces predictive errors, and improves out-of-distribution (OOD) generalization.
arXiv Detail & Related papers (2026-02-16T20:45:10Z)
Expanding the Chaos: Neural Operator for Stochastic (Partial) Differential Equations [65.80144621950981]
We build on Wiener chaos expansions (WCE) to design neural operator (NO) architectures for SPDEs and SDEs. We show that WCE-based neural operators provide a practical and scalable way to learn SDE/SPDE solution operators.
arXiv Detail & Related papers (2026-01-03T00:59:25Z)
Deep Hierarchical Learning with Nested Subspace Networks [53.71337604556311]
We propose Nested Subspace Networks (NSNs) for large neural networks. NSNs enable a single model to be dynamically and granularly adjusted across a continuous spectrum of compute budgets. We show that NSNs can be surgically applied to pre-trained LLMs and unlock a smooth and predictable compute-performance frontier.
arXiv Detail & Related papers (2025-09-22T15:13:14Z)
Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models [51.85815025140659]
Modern Machine Learning (ML) and Deep Neural Networks (DNNs) often operate on high-dimensional data. In particular, the proportional regime where the data dimension, sample size, and number of model parameters are all large gives rise to novel and sometimes counterintuitive behaviors. This paper extends traditional Random Matrix Theory (RMT) beyond eigenvalue-based analysis of linear models to address the challenges posed by nonlinear ML models.
arXiv Detail & Related papers (2025-06-16T06:54:08Z)
Towards a Foundation Model for Physics-Informed Neural Networks: Multi-PDE Learning with Active Sampling [0.0]
Physics-Informed Neural Networks (PINNs) have emerged as a powerful framework for solving partial differential equations (PDEs) by embedding physical laws into neural network training. In this work, we explore the potential of a foundation PINN model capable of solving multiple PDEs within a unified architecture.
arXiv Detail & Related papers (2025-02-11T10:12:28Z)
Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning. Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
Deep Learning-based surrogate models for parametrized PDEs: handling geometric variability through graph neural networks [0.0]
This work explores the potential usage of graph neural networks (GNNs) for the simulation of time-dependent PDEs. We propose a systematic strategy to build surrogate models based on a data-driven time-stepping scheme. We show that GNNs can provide a valid alternative to traditional surrogate models in terms of computational efficiency and generalization to new scenarios.
arXiv Detail & Related papers (2023-08-03T08:14:28Z)
Learning Neural Constitutive Laws From Motion Observations for Generalizable PDE Dynamics [97.38308257547186]
Many NN approaches learn an end-to-end model that implicitly models both the governing PDE and material models. We argue that the governing PDEs are often well-known and should be explicitly enforced rather than learned. We introduce a new framework termed "Neural Constitutive Laws" (NCLaw) which utilizes a network architecture that strictly guarantees standard priors.
arXiv Detail & Related papers (2023-04-27T17:42:24Z)
Connections between Numerical Algorithms for PDEs and Neural Networks [8.660429288575369]
We investigate numerous structural connections between numerical algorithms for partial differential equations (PDEs) and neural networks. Our goal is to transfer the rich set of mathematical foundations from the world of PDEs to neural networks.
arXiv Detail & Related papers (2021-07-30T16:42:45Z)
SPINN: Sparse, Physics-based, and Interpretable Neural Networks for PDEs [0.0]
We introduce a class of Sparse, Physics-based, and Interpretable Neural Networks (SPINN) for solving ordinary and partial differential equations. By reinterpreting a traditional meshless representation of solutions of PDEs as a special sparse deep neural network, we develop a class of sparse neural network architectures that are interpretable.
arXiv Detail & Related papers (2021-02-25T17:45:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.