Learning and Generalizing Polynomials in Simulation Metamodeling
- URL: http://arxiv.org/abs/2307.10892v1
- Date: Thu, 20 Jul 2023 14:11:29 GMT
- Title: Learning and Generalizing Polynomials in Simulation Metamodeling
- Authors: Jesper Hauch, Christoffer Riis, Francisco C. Pereira
- Abstract summary: This paper collects and proposes multiplicative neural network (MNN) architectures for approximating higher-order polynomials.
Experiments show that MNNs are better than baseline models at generalizing, and their performance in validation is true to their performance in out-of-distribution tests.
- Score: 7.41244589428771
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability to learn polynomials and generalize out-of-distribution is
essential for simulation metamodels in many disciplines of engineering, where
the time step updates are described by polynomials. While feed forward neural
networks can fit any function, they cannot generalize out-of-distribution for
higher-order polynomials. Therefore, this paper collects and proposes
multiplicative neural network (MNN) architectures that are used as recursive
building blocks for approximating higher-order polynomials. Our experiments
show that MNNs are better than baseline models at generalizing, and their
performance in validation is true to their performance in out-of-distribution
tests. In addition to MNN architectures, a simulation metamodeling approach is
proposed for simulations with polynomial time step updates. For these
simulations, simulating a time interval can be performed in fewer steps by
increasing the step size, which entails approximating higher-order polynomials.
While our approach is compatible with any simulation with polynomial time step
updates, a demonstration is shown for an epidemiology simulation model, which
also shows the inductive bias in MNNs for learning and generalizing
higher-order polynomials.
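As a rough illustration of the multiplicative building block (the paper collects several MNN variants that differ in details such as gating and skip connections; the minimal layer below is an assumption, not the paper's exact architecture): taking an elementwise product of two affine maps doubles the attainable polynomial degree at each layer, so k stacked blocks can represent polynomials of degree up to 2**k.

```python
import numpy as np

rng = np.random.default_rng(0)

class MultiplicativeBlock:
    """One multiplicative block: elementwise product of two affine maps.

    Composing k such blocks can represent polynomials of degree up to
    2**k in the input, which is the inductive bias attributed to MNNs
    for learning and generalizing higher-order polynomials.
    """

    def __init__(self, d_in, d_out):
        self.W1 = rng.normal(size=(d_out, d_in)) * 0.1
        self.b1 = np.zeros(d_out)
        self.W2 = rng.normal(size=(d_out, d_in)) * 0.1
        self.b2 = np.zeros(d_out)

    def __call__(self, x):
        return (self.W1 @ x + self.b1) * (self.W2 @ x + self.b2)

# Two stacked blocks: the output is a polynomial of degree up to 4 in x.
blocks = [MultiplicativeBlock(3, 3), MultiplicativeBlock(3, 1)]
x = np.array([1.0, 2.0, -1.0])
y = x
for block in blocks:
    y = block(y)
```

Because the output is an exact polynomial in the input (no saturating activation), extrapolation outside the training range follows the learned polynomial rather than flattening out, which is the out-of-distribution behavior the experiments probe.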
Related papers
- Exploring the Potential of Polynomial Basis Functions in Kolmogorov-Arnold Networks: A Comparative Study of Different Groups of Polynomials [0.0]
This paper presents a survey of 18 distinct polynomial families and their potential applications in Kolmogorov-Arnold Network (KAN) models.
The study aims to investigate the suitability of these polynomials as basis functions in KAN models for complex tasks like handwritten digit classification.
The performance metrics of the KAN models, including overall accuracy, Kappa, and F1 score, are evaluated and compared.
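As one example of such a basis (the survey compares 18 families; Chebyshev polynomials are used here purely for illustration), a KAN-style edge function can be built as a learned linear combination of a recursively evaluated polynomial basis:

```python
import numpy as np

def chebyshev_features(x, degree):
    """Chebyshev polynomials T_0..T_degree evaluated at x (x in [-1, 1]),
    using the recurrence T_n = 2*x*T_{n-1} - T_{n-2}."""
    feats = [np.ones_like(x), x]
    for n in range(2, degree + 1):
        feats.append(2 * x * feats[-1] - feats[-2])
    return np.stack(feats[: degree + 1], axis=-1)

# A KAN-style edge function is a learned linear combination of the basis;
# the coefficients below are placeholders standing in for trained weights.
x = np.linspace(-1, 1, 5)
Phi = chebyshev_features(x, 3)          # shape (5, 4)
coeffs = np.array([0.5, -1.0, 0.25, 0.0])
edge_out = Phi @ coeffs
```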
arXiv Detail & Related papers (2024-05-30T20:40:16Z) - Learning to Simulate: Generative Metamodeling via Quantile Regression [2.2518304637809714]
We propose a new metamodeling concept, called generative metamodeling, which aims to construct a "fast simulator of the simulator"
Once constructed, a generative metamodel can generate a large amount of random outputs as soon as the inputs are specified.
We propose a new algorithm -- quantile-regression-based generative metamodeling (QRGMM) -- and study its convergence and rate of convergence.
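The two-stage idea can be sketched as follows (the toy simulator and the binning-based quantile estimates below are illustrative simplifications standing in for the paper's fitted quantile regressions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "slow simulator": output depends on input x with heteroskedastic noise.
def simulator(x):
    return 2.0 * x + rng.normal(scale=1.0 + x)

# Offline stage: run the simulator on a design of inputs and estimate a grid
# of conditional quantiles per input bin (a crude stand-in for quantile
# regression over a grid of quantile levels).
levels = np.linspace(0.05, 0.95, 19)
xs = rng.uniform(0, 1, size=20000)
ys = np.array([simulator(x) for x in xs])
bins = np.linspace(0, 1, 11)
idx = np.clip(np.digitize(xs, bins) - 1, 0, 9)
q_table = np.array([np.quantile(ys[idx == b], levels) for b in range(10)])

# Online stage: the generative metamodel draws u ~ Uniform and inverts the
# estimated conditional quantile function by interpolation -- cheap random
# outputs without re-running the simulator.
def generate(x, n):
    b = min(int(x * 10), 9)
    u = rng.uniform(levels[0], levels[-1], size=n)
    return np.interp(u, levels, q_table[b])

samples = generate(0.5, 1000)
```

The design choice this illustrates: once the quantile table (or regression) is fitted offline, each generated output costs only a uniform draw and an interpolation, which is what makes the metamodel a "fast simulator of the simulator".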
arXiv Detail & Related papers (2023-11-29T16:46:24Z) - PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels [23.99075223506133]
We show that polynomial attention with high degree can effectively replace softmax without sacrificing model quality.
We present a block-based algorithm to apply causal masking efficiently.
We validate PolySketchFormerAttention empirically by training language models capable of handling long contexts.
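A minimal, sketch-free version of the idea (the actual method additionally sketches the polynomial kernel for speed and uses block-based causal masking; the degree and normalization details below are illustrative assumptions):

```python
import numpy as np

def polynomial_attention(Q, K, V, degree=4):
    """Attention with a degree-p polynomial in place of exp().

    Softmax weights exp(q.k) are replaced by (q.k)**degree (an even degree
    keeps the weights nonnegative), then each row is normalized. The
    sketching step that makes this fast for long contexts is omitted here.
    """
    scores = Q @ K.T
    weights = scores ** degree
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 8)) for _ in range(3))
out = polynomial_attention(Q, K, V)
```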
arXiv Detail & Related papers (2023-10-02T21:39:04Z) - On the Trade-off Between Efficiency and Precision of Neural Abstraction [62.046646433536104]
Neural abstractions have been recently introduced as formal approximations of complex, nonlinear dynamical models.
We employ formal inductive synthesis procedures to generate neural abstractions that result in dynamical models with these semantics.
arXiv Detail & Related papers (2023-07-28T13:22:32Z) - Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood
Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
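The core "Monte Carlo sampling + iterative linear solver" combination can be sketched in a minimal dense-matrix form (the paper applies it inside gradient estimation for latent Gaussian models; the trace estimator below is one standard instance of the inverse-free idea):

```python
import numpy as np

rng = np.random.default_rng(0)

def conjugate_gradient(A, b, tol=1e-12, max_iter=500):
    """Solve A x = b for symmetric positive definite A using only
    matrix-vector products (no explicit inverse)."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def hutchinson_trace_inv(Sigma, A, n_probes=100):
    """Estimate trace(Sigma^{-1} A) without forming Sigma^{-1}.

    Rademacher probes satisfy E[z^T Sigma^{-1} A z] = trace(Sigma^{-1} A),
    and each solve Sigma^{-1} (A z) is done with conjugate gradients --
    Monte Carlo sampling plus an iterative solver in place of inversion.
    """
    n = Sigma.shape[0]
    total = 0.0
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=n)
        total += z @ conjugate_gradient(Sigma, A @ z)
    return total / n_probes

est = hutchinson_trace_inv(np.diag([2.0, 3.0, 4.0, 5.0]), np.eye(4))
```

Trace terms of this form appear in the gradient of a Gaussian log-likelihood, which is why replacing the inversion with solver iterations can accelerate maximum likelihood estimation.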
arXiv Detail & Related papers (2023-06-05T21:08:34Z) - A predictive physics-aware hybrid reduced order model for reacting flows [65.73506571113623]
A new hybrid predictive Reduced Order Model (ROM) is proposed to solve reacting flow problems.
The number of degrees of freedom is reduced from thousands of temporal points to a few POD modes with their corresponding temporal coefficients.
Two different deep learning architectures have been tested to predict the temporal coefficients.
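The POD reduction step can be sketched with a plain SVD (the rank-2 toy field below is an assumption; in the paper the snapshots come from reacting-flow simulations, and the resulting temporal coefficients are what the deep learning architectures are trained to predict):

```python
import numpy as np

# Snapshot matrix: each column is the (toy) flow state at one time point.
n_space, n_time = 200, 1000
t = np.linspace(0, 1, n_time)
x = np.linspace(0, 1, n_space)[:, None]
snapshots = (np.sin(2 * np.pi * x) * np.cos(2 * np.pi * t)
             + 0.3 * np.sin(4 * np.pi * x) * np.sin(6 * np.pi * t))

# POD: left singular vectors are the spatial modes; projecting the snapshots
# onto a few of them reduces thousands of temporal points to a handful of
# temporal coefficients.
U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)
r = 2                                # two modes capture this toy field exactly
modes = U[:, :r]                     # (n_space, r) spatial POD modes
coeffs = modes.T @ snapshots         # (r, n_time) temporal coefficients

reconstruction = modes @ coeffs
```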
arXiv Detail & Related papers (2023-01-24T08:39:20Z) - PAC Reinforcement Learning for Predictive State Representations [60.00237613646686]
We study online Reinforcement Learning (RL) in partially observable dynamical systems.
We focus on the Predictive State Representations (PSRs) model, which is an expressive model that captures other well-known models.
We develop a novel model-based algorithm for PSRs that can learn a near-optimal policy with sample complexity scaling polynomially in the relevant problem parameters.
arXiv Detail & Related papers (2022-07-12T17:57:17Z) - Realization Theory Of Recurrent Neural ODEs Using Polynomial System
Embeddings [0.802904964931021]
We show that neural ODE analogs of recurrent (ODE-RNN) and Long Short-Term Memory (ODE-LSTM) networks can be algorithmically embedded into the class of polynomial systems.
This embedding preserves input-output behavior and can be extended to other ODE-LSTM architectures.
We then use realization theory of polynomial systems to provide necessary conditions for an input-output map to be realizable by an ODE-LSTM and sufficient conditions for minimality of such systems.
arXiv Detail & Related papers (2022-05-24T11:36:18Z) - Reinforcement Learning for Adaptive Mesh Refinement [63.7867809197671]
We propose a novel formulation of AMR as a Markov decision process and apply deep reinforcement learning to train refinement policies directly from simulation.
The model sizes of these policy architectures are independent of the mesh size and hence scale to arbitrarily large and complex simulations.
arXiv Detail & Related papers (2021-03-01T22:55:48Z) - Machine learning for rapid discovery of laminar flow channel wall
modifications that enhance heat transfer [56.34005280792013]
We present a combination of accurate numerical simulations of arbitrary, flat, and non-flat channels and machine learning models predicting drag coefficient and Stanton number.
We show that convolutional neural networks (CNN) can accurately predict the target properties at a fraction of the time of numerical simulations.
arXiv Detail & Related papers (2021-01-19T16:14:02Z) - Tensor Networks for Probabilistic Sequence Modeling [7.846449972735859]
We use a uniform matrix product state (u-MPS) model for probabilistic modeling of sequence data.
We then introduce a novel generative algorithm giving trained u-MPS the ability to efficiently sample from a wide variety of conditional distributions.
Experiments on sequence modeling with synthetic and real text data show u-MPS outperforming a variety of baselines.
arXiv Detail & Related papers (2020-03-02T17:16:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.