Related papers: A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks

URL: http://arxiv.org/abs/2406.02917v1
Date: Wed, 5 Jun 2024 04:10:36 GMT
Title: A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks
Authors: Khemraj Shukla, Juan Diego Toscano, Zhicheng Wang, Zongren Zou, George Em Karniadakis
Abstract summary: Kolmogorov-Arnold Networks (KANs) were recently introduced as an alternative representation model to MLP. Herein, we employ KANs to construct physics-informed machine learning models (PIKANs) and deep operator models (DeepOKANs) for solving differential equations for forward and inverse problems.
Score: 8.573300153709358
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Kolmogorov-Arnold Networks (KANs) were recently introduced as an alternative representation model to MLP. Herein, we employ KANs to construct physics-informed machine learning models (PIKANs) and deep operator models (DeepOKANs) for solving differential equations for forward and inverse problems. In particular, we compare them with physics-informed neural networks (PINNs) and deep operator networks (DeepONets), which are based on the standard MLP representation. We find that although the original KANs based on the B-splines parameterization lack accuracy and efficiency, modified versions based on low-order orthogonal polynomials have comparable performance to PINNs and DeepONet although they still lack robustness as they may diverge for different random seeds or higher order orthogonal polynomials. We visualize their corresponding loss landscapes and analyze their learning dynamics using information bottleneck theory. Our study follows the FAIR principles so that other researchers can use our benchmarks to further advance this emerging topic.
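The abstract mentions KAN variants parameterized by low-order orthogonal polynomials but does not say which family is used. As a rough illustration only, the sketch below assumes Chebyshev polynomials and a tanh input squashing; the names `chebyshev_basis` and `kan_layer` are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def chebyshev_basis(x, degree):
    """Evaluate T_0..T_degree at x (expected in [-1, 1]); returns shape x.shape + (degree + 1,)."""
    terms = [np.ones_like(x), x]
    for _ in range(2, degree + 1):
        # Three-term recurrence: T_k(x) = 2 x T_{k-1}(x) - T_{k-2}(x)
        terms.append(2.0 * x * terms[-1] - terms[-2])
    return np.stack(terms[: degree + 1], axis=-1)

def kan_layer(x, coeffs):
    """One KAN-style layer: y_j = sum_i phi_ij(x_i), each phi_ij a Chebyshev expansion.

    x:      (batch, n_in) inputs
    coeffs: (n_in, n_out, degree + 1) learnable expansion coefficients
    """
    degree = coeffs.shape[-1] - 1
    z = np.tanh(x)                      # assumption: squash inputs into [-1, 1]
    basis = chebyshev_basis(z, degree)  # (batch, n_in, degree + 1)
    # Sum over input index i and polynomial index d for every output o.
    return np.einsum("bid,iod->bo", basis, coeffs)

rng = np.random.default_rng(0)
y = kan_layer(rng.normal(size=(4, 3)), rng.normal(size=(3, 2, 4)))
print(y.shape)
```

Unlike an MLP layer, which applies a fixed nonlinearity after a learned linear map, each edge here carries its own learned univariate function; the polynomial degree controls how expressive each edge is.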

Related papers

Fourier Neural Operators for Non-Markovian Processes: Approximation Theorems and Experiments [2.84475965465923]
This paper introduces an operator-based neural network, the mirror-padded Fourier neural operator (MFNO). MFNO extends the standard Fourier neural operator (FNO) by incorporating mirror padding, enabling it to handle non-periodic inputs. We rigorously prove that MFNOs can approximate solutions of path-dependent differential equations and transformations of fractional Brownian motions to an arbitrary degree of accuracy.
arXiv Detail & Related papers (2025-07-23T19:30:34Z)
MLPs and KANs for data-driven learning in physical problems: A performance comparison [4.252092276491948]
Kolmogorov-Arnold Networks (KANs) are an alternative to traditional neural networks represented by Multi-Layer Perceptrons (MLPs). While showing promise, their performance advantages in physics-based problems remain largely unexplored. This suggests that KANs are a promising choice, offering a balance of efficiency and accuracy in applications involving physical systems.
arXiv Detail & Related papers (2025-04-15T17:13:42Z)
Physics-informed KAN PointNet: Deep learning for simultaneous solutions to inverse problems in incompressible flow on numerous irregular geometries [4.548755617115688]
Physics-informed PointNet (PIPN) was introduced to address this limitation for PINNs. PI-KAN-PointNet enables the simultaneous solution of an inverse problem over multiple irregular geometries within a single training run. Our findings indicate that a physics-informed PointNet model employing layers as the encoder and KAN layers as the decoder represents the optimal configuration among all models investigated.
arXiv Detail & Related papers (2025-04-08T12:31:57Z)
PRKAN: Parameter-Reduced Kolmogorov-Arnold Networks [47.947045173329315]
Kolmogorov-Arnold Networks (KANs) represent an innovation in neural network architectures. KANs offer a compelling alternative to Multi-Layer Perceptrons (MLPs) in models such as CNNs, Recurrent Neural Networks (RNNs) and Transformers. This paper introduces PRKANs, which employ several methods to reduce the parameter count in KAN layers, making them comparable to MLP layers.
arXiv Detail & Related papers (2025-01-13T03:07:39Z)
A preliminary study on continual learning in computer vision using Kolmogorov-Arnold Networks [43.70716358136333]
Kolmogorov-Arnold Networks (KANs) are based on a fundamentally different mathematical framework. KANs address several major issues in MLPs, such as forgetting in continual learning scenarios. We extend the investigation by evaluating the performance of KANs in continual learning tasks within computer vision.
arXiv Detail & Related papers (2024-09-20T14:49:21Z)
Component Fourier Neural Operator for Singularly Perturbed Differential Equations [3.9482103923304877]
Solving Singularly Perturbed Differential Equations (SPDEs) poses computational challenges arising from the rapid transitions in their solutions within thin regions. In this manuscript, we introduce Component Fourier Neural Operator (ComFNO), an innovative operator learning method that builds upon Fourier Neural Operator (FNO). Our approach is not limited to FNO and can be applied to other neural network frameworks, such as Deep Operator Network (DeepONet).
arXiv Detail & Related papers (2024-09-07T09:40:51Z)
A practical existence theorem for reduced order models based on convolutional autoencoders [0.4604003661048266]
Deep learning has gained increasing popularity in the fields of Partial Differential Equations (PDEs) and Reduced Order Modeling (ROM). CNN-based autoencoders have proven extremely effective, outperforming established techniques, such as the reduced basis method, when dealing with complex nonlinear problems. We provide a new practical existence theorem for CNN-based autoencoders when the parameter-to-solution map is holomorphic.
arXiv Detail & Related papers (2024-02-01T09:01:58Z)
RBF-MGN: Solving spatiotemporal PDEs with Physics-informed Graph Neural Network [4.425915683879297]
We propose a novel framework based on graph neural networks (GNNs) and radial basis function finite difference (RBF-FD). RBF-FD is used to construct a high-precision difference format of the differential equations to guide model training. We illustrate the generalizability, accuracy, and efficiency of the proposed algorithms on different PDE parameters.
arXiv Detail & Related papers (2022-12-06T10:08:02Z)
Tunable Complexity Benchmarks for Evaluating Physics-Informed Neural Networks on Coupled Ordinary Differential Equations [64.78260098263489]
In this work, we assess the ability of physics-informed neural networks (PINNs) to solve increasingly complex coupled ordinary differential equations (ODEs). We show that PINNs eventually fail to produce correct solutions to these benchmarks as their complexity increases. We identify several reasons why this may be the case, including insufficient network capacity, poor conditioning of the ODEs, and high local curvature, as measured by the Laplacian of the PINN loss.
arXiv Detail & Related papers (2022-10-14T15:01:32Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
Kernel-Based Smoothness Analysis of Residual Networks [85.20737467304994]
Residual networks (ResNets) stand out among these powerful modern architectures. In this paper, we show another distinction between ResNets and MLPs, namely, a tendency of ResNets to promote smoother interpolations than MLPs.
arXiv Detail & Related papers (2020-09-21T16:32:04Z)
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs). We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent. For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model. This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs). The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms. We also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)
A deep learning framework for solution and discovery in solid mechanics [1.4699455652461721]
We present the application of a class of deep learning, known as Physics Informed Neural Networks (PINN), to learning and discovery in solid mechanics. We explain how to incorporate the momentum balance and elasticity relations into PINN, and explore in detail the application to linear elasticity.
arXiv Detail & Related papers (2020-02-14T08:24:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.