Finite Element Neural Network Interpolation. Part I: Interpretable and Adaptive Discretization for Solving PDEs
- URL: http://arxiv.org/abs/2412.05719v1
- Date: Sat, 07 Dec 2024 18:31:17 GMT
- Title: Finite Element Neural Network Interpolation. Part I: Interpretable and Adaptive Discretization for Solving PDEs
- Authors: Kateřina Škardová, Alexandre Daby-Seesaram, Martin Genet
- Abstract summary: We present the Finite Element Neural Network Interpolation (FENNI) framework, a sparse neural network architecture extending previous work on Embedded Finite Element Neural Networks (EFENN).
Due to their mesh-based structure, EFENNs require significantly fewer trainable parameters than fully connected neural networks.
Within the EFENN setting, the FENNI framework brings several improvements to the HiDeNN approach.
- Score: 44.99833362998488
- Abstract: We present the Finite Element Neural Network Interpolation (FENNI) framework, a sparse neural network architecture extending previous work on Embedded Finite Element Neural Networks (EFENN) introduced with the Hierarchical Deep-learning Neural Networks (HiDeNN). Due to their mesh-based structure, EFENN requires significantly fewer trainable parameters than fully connected neural networks, with individual weights and biases having a clear interpretation. Our FENNI framework, within the EFENN framework, brings improvements to the HiDeNN approach. First, we propose a reference element-based architecture where shape functions are defined on a reference element, enabling variability in interpolation functions and straightforward use of Gaussian quadrature rules for evaluating the loss function. Second, we propose a pragmatic multigrid training strategy based on the framework's interpretability. Third, HiDeNN's combined rh-adaptivity is extended from 1D to 2D, with a new Jacobian-based criterion for adding nodes combining h- and r-adaptivity. From a deep learning perspective, adaptive mesh behavior through rh-adaptivity and the multigrid approach correspond to transfer learning, enabling FENNI to optimize the network's architecture dynamically during training. The framework's capabilities are demonstrated on 1D and 2D test cases, where its accuracy and computational cost are compared against an analytical solution and a classical FEM solver. On these cases, the multigrid training strategy drastically improves the training stage's efficiency and robustness. Finally, we introduce a variational loss within the EFENN framework, showing that it performs as well as energy-based losses and outperforms residual-based losses. This framework is extended to surrogate modeling over the parametric space in Part II.
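To make the EFENN idea concrete, here is a minimal, hypothetical PyTorch sketch of a 1D interpolation network whose only trainable parameters are nodal values and interior nodal coordinates, so that parameter updates correspond directly to mesh changes. The hat-function forward pass, the least-squares fit to a manufactured target, and all names are illustrative assumptions; the paper itself uses reference-element shape functions, Gaussian quadrature, and energy/variational losses.

```python
import torch

class HatInterpolationNet(torch.nn.Module):
    """Toy 1D 'finite element' network: parameters are nodal values and interior nodal positions."""
    def __init__(self, n_nodes: int):
        super().__init__()
        grid = torch.linspace(0.0, 1.0, n_nodes)
        self.interior_x = torch.nn.Parameter(grid[1:-1].clone())   # trainable node positions (r-adaptivity)
        self.values = torch.nn.Parameter(torch.zeros(n_nodes))     # trainable nodal values

    def nodes(self):
        # Endpoints are fixed; a full implementation would also keep interior nodes ordered.
        return torch.cat([torch.zeros(1), self.interior_x, torch.ones(1)])

    def forward(self, x):
        xn = self.nodes()
        u = torch.zeros_like(x)
        for e in range(len(xn) - 1):                                # loop over elements
            x0, x1 = xn[e], xn[e + 1]
            xi = (x - x0) / (x1 - x0)                               # local coordinate on the element
            inside = (x >= x0) & (x <= x1)
            u = torch.where(inside, (1 - xi) * self.values[e] + xi * self.values[e + 1], u)
        return u

# Fit a manufactured target by least squares (the paper uses energy/variational losses instead).
torch.manual_seed(0)
model = HatInterpolationNet(n_nodes=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(2000):
    x = torch.rand(256)
    loss = torch.mean((model(x) - torch.sin(torch.pi * x)) ** 2)
    opt.zero_grad(); loss.backward(); opt.step()
```

Because every parameter is a nodal value or coordinate, the trained network can be read back directly as a finite element field, which is the interpretability that the multigrid and rh-adaptive strategies described in the abstract exploit.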
Related papers
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards practical use of machine-learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z) - Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
We present Layer-wise Feedback Propagation (LFP), a novel training principle for neural network-like predictors.
LFP decomposes a reward to individual neurons based on their respective contributions to solving a given task.
Our method then implements a greedy approach reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z) - An Adaptive and Stability-Promoting Layerwise Training Approach for Sparse Deep Neural Network Architecture [0.0]
This work presents a two-stage adaptive framework for developing deep neural network (DNN) architectures that generalize well for a given training data set.
In the first stage, a layerwise training approach is adopted where a new layer is added each time and trained independently by freezing parameters in the previous layers.
We introduce an epsilon-delta stability-promoting concept as a desirable property of a learning algorithm and show that employing manifold regularization yields an epsilon-delta stability-promoting algorithm.
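As a hedged illustration of the layerwise idea summarized above (grow the network one layer at a time, freezing what is already trained), here is a minimal PyTorch sketch for a toy regression task; the stability-promoting manifold regularization and the adaptivity criteria are omitted, and every name is an assumption rather than the paper's code.

```python
import torch

def train_layerwise(x, y, hidden_dim=32, n_stages=3, steps_per_stage=500):
    """Grow a network one hidden layer at a time, freezing the layers trained in earlier stages."""
    hidden_layers = []
    for stage in range(n_stages):
        for layer in hidden_layers:                      # freeze everything trained so far
            for p in layer.parameters():
                p.requires_grad_(False)
        in_dim = x.shape[1] if not hidden_layers else hidden_dim
        new_layer = torch.nn.Sequential(torch.nn.Linear(in_dim, hidden_dim), torch.nn.Tanh())
        head = torch.nn.Linear(hidden_dim, y.shape[1])   # a fresh output head replaces the old one
        hidden_layers.append(new_layer)
        model = torch.nn.Sequential(*hidden_layers, head)
        opt = torch.optim.Adam(list(new_layer.parameters()) + list(head.parameters()), lr=1e-3)
        for _ in range(steps_per_stage):
            loss = torch.mean((model(x) - y) ** 2)
            opt.zero_grad(); loss.backward(); opt.step()
    return model

# Toy usage: progressively deepen a network fitting y = sin(3x).
x = torch.linspace(-1.0, 1.0, 200).unsqueeze(1)
y = torch.sin(3 * x)
model = train_layerwise(x, y)
```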
arXiv Detail & Related papers (2022-11-13T09:51:16Z) - Knowledge Enhanced Neural Networks for relational domains [83.9217787335878]
We focus on a specific method, KENN, a Neural-Symbolic architecture that injects prior logical knowledge into a neural network.
In this paper, we propose an extension of KENN for relational data.
arXiv Detail & Related papers (2022-05-31T13:00:34Z) - An Optimal Time Variable Learning Framework for Deep Neural Networks [0.0]
The proposed framework can be applied to existing architectures such as ResNet, DenseNet, or Fractional-DNN.
The proposed approach is applied to an ill-posed 3D Maxwell's equation.
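The blurb above is terse, so the following is only a guess at the general idea under the common ODE reading of residual networks: each block applies x + dt * f(x), and the step size dt (the "time variable") is trained alongside the weights. This is a sketch of that reading, not the paper's formulation.

```python
import torch

class TimeStepResidualBlock(torch.nn.Module):
    """Residual update x <- x + dt * f(x), with the step size dt treated as a trainable parameter."""
    def __init__(self, dim: int):
        super().__init__()
        self.f = torch.nn.Sequential(torch.nn.Linear(dim, dim), torch.nn.Tanh(), torch.nn.Linear(dim, dim))
        self.dt = torch.nn.Parameter(torch.tensor(0.1))  # learnable 'time variable' for this block

    def forward(self, x):
        return x + self.dt * self.f(x)

# Stacking blocks gives a ResNet whose effective time discretization is learned jointly with the weights.
net = torch.nn.Sequential(*[TimeStepResidualBlock(16) for _ in range(4)])
```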
arXiv Detail & Related papers (2022-04-18T19:29:03Z) - Training multi-objective/multi-task collocation physics-informed neural network with student/teachers transfer learnings [0.0]
This paper presents a PINN training framework that employs pre-training steps and a net-to-net knowledge transfer algorithm.
A multi-objective optimization algorithm may improve the performance of a physics-informed neural network with competing constraints.
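As a rough, hypothetical sketch of the pre-training and net-to-net transfer described above: a small PINN for u''(x) = f(x) on [0, 1] is trained on an easy forcing term, and its weights then warm-start a second network on a harder one. The competing objectives are reduced to fixed loss weights here, whereas the paper uses a multi-objective algorithm; all names are illustrative.

```python
import torch

def mlp():
    return torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                               torch.nn.Linear(32, 32), torch.nn.Tanh(),
                               torch.nn.Linear(32, 1))

def pinn_loss(net, f, w_pde=1.0, w_bc=10.0):
    # Collocation residual of u''(x) = f(x) plus boundary terms enforcing u(0) = u(1) = 0.
    x = torch.rand(128, 1, requires_grad=True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    pde = torch.mean((d2u - f(x)) ** 2)
    bc = net(torch.zeros(1, 1)) ** 2 + net(torch.ones(1, 1)) ** 2
    return w_pde * pde + w_bc * bc.sum()

teacher = mlp()
opt = torch.optim.Adam(teacher.parameters(), lr=1e-3)
for _ in range(2000):                               # pre-training on an easy forcing term
    loss = pinn_loss(teacher, lambda x: torch.ones_like(x))
    opt.zero_grad(); loss.backward(); opt.step()

student = mlp()
student.load_state_dict(teacher.state_dict())       # net-to-net transfer: warm start on a harder problem
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
for _ in range(2000):
    loss = pinn_loss(student, lambda x: torch.sin(4 * torch.pi * x))
    opt.zero_grad(); loss.backward(); opt.step()
```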
arXiv Detail & Related papers (2021-07-24T00:43:17Z) - Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
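A minimal sketch of the DSN-style side-branch idea mentioned above: an auxiliary classifier is forked from an intermediate feature map and trained jointly with the main head. The paper's specific hierarchical mimicking objectives are not reproduced; module names and the 0.3 auxiliary weight are illustrative assumptions.

```python
import torch

class SideBranchCNN(torch.nn.Module):
    """CNN with an auxiliary classifier forked from an intermediate layer (deep-supervision style)."""
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.stage1 = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3, padding=1), torch.nn.ReLU(),
                                          torch.nn.MaxPool2d(2))
        self.stage2 = torch.nn.Sequential(torch.nn.Conv2d(16, 32, 3, padding=1), torch.nn.ReLU(),
                                          torch.nn.AdaptiveAvgPool2d(1))
        self.main_head = torch.nn.Linear(32, n_classes)
        # Side branch attached to the intermediate feature map.
        self.side_head = torch.nn.Sequential(torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
                                             torch.nn.Linear(16, n_classes))

    def forward(self, x):
        h1 = self.stage1(x)
        h2 = self.stage2(h1)
        return self.main_head(h2.flatten(1)), self.side_head(h1)

model = SideBranchCNN()
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
main_logits, side_logits = model(x)
# Both heads share the target; the side loss supervises intermediate features directly.
loss = (torch.nn.functional.cross_entropy(main_logits, y)
        + 0.3 * torch.nn.functional.cross_entropy(side_logits, y))
```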
arXiv Detail & Related papers (2020-03-24T09:56:13Z)