Physically interpretable machine learning algorithm on multidimensional
non-linear fields
- URL: http://arxiv.org/abs/2005.13912v2
- Date: Thu, 10 Dec 2020 19:54:49 GMT
- Title: Physically interpretable machine learning algorithm on multidimensional
non-linear fields
- Authors: Rem-Sophia Mouradi, Cédric Goeury, Olivier Thual, Fabrice Zaoui and
Pablo Tassi
- Abstract summary: Polynomial Chaos Expansion (PCE) has long been employed as a robust representation for probabilistic input-to-output mapping.
Dimensionality Reduction (DR) techniques are increasingly used for pattern recognition and data compression.
The goal of the present paper is to combine POD and PCE for field-measurement-based forecasting.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Amid ever-increasing interest in Machine Learning (ML) and a favorable
context of data availability, we propose here an original methodology for data-based
prediction of two-dimensional physical fields. Polynomial Chaos Expansion
(PCE), widely used in the Uncertainty Quantification (UQ) community, has long
been employed as a robust representation for probabilistic input-to-output
mapping. It has been recently tested in a pure ML context, and shown to be as
powerful as classical ML techniques for point-wise prediction. Some advantages
are inherent to the method, such as its explicitness and adaptability to small
training sets, in addition to the associated probabilistic framework.
Simultaneously, Dimensionality Reduction (DR) techniques are increasingly used
for pattern recognition and data compression and have gained interest due to
improved data quality. In this study, the usefulness of Proper Orthogonal
Decomposition (POD) for the construction of a statistical predictive model is
demonstrated. Both POD and PCE have amply proved their worth in their
respective frameworks. The goal of the present paper is to combine them for
field-measurement-based forecasting. The described steps are also useful to
analyze the data. Some challenging issues encountered when using
multidimensional field measurements are addressed, for example when dealing
with few data. The POD-PCE coupling methodology is presented, with particular
focus on input data characteristics and training-set choice. A simple
methodology for evaluating the importance of each physical parameter is
proposed for the PCE model and extended to the POD-PCE coupling.
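To make the POD-PCE coupling concrete, the following is a minimal sketch assuming the field snapshots are flattened into rows of a matrix and the scalar input parameters are rescaled to [-1, 1]; the Legendre basis, the least-squares fit and all function names are illustrative choices, not the authors' implementation or its probabilistic UQ machinery.

```python
import numpy as np
from itertools import product
from numpy.polynomial import legendre


def pod(snapshots, n_modes):
    """POD of centered snapshots (n_samples, n_points) via thin SVD."""
    mean = snapshots.mean(axis=0)
    _, _, vt = np.linalg.svd(snapshots - mean, full_matrices=False)
    modes = vt[:n_modes]                     # spatial modes, shape (n_modes, n_points)
    coeffs = (snapshots - mean) @ modes.T    # POD coefficients, shape (n_samples, n_modes)
    return mean, modes, coeffs


def legendre_design(x, degree):
    """Total-degree multivariate Legendre basis for inputs x in [-1, 1]^d.
    The total-degree truncation keeps the basis small for a handful of parameters."""
    n, d = x.shape
    # p[j, k] holds P_k evaluated on column j of x, shape (d, degree + 1, n)
    p = np.array([[legendre.legval(x[:, j], np.eye(degree + 1)[k])
                   for k in range(degree + 1)] for j in range(d)])
    cols = [np.prod([p[j, a[j]] for j in range(d)], axis=0)
            for a in product(range(degree + 1), repeat=d) if sum(a) <= degree]
    return np.column_stack(cols)             # shape (n, n_basis_terms)


def fit_pod_pce(snapshots, x, n_modes=5, degree=3):
    """Fit one least-squares PCE per retained POD coefficient."""
    mean, modes, coeffs = pod(snapshots, n_modes)
    phi = legendre_design(x, degree)
    pce, *_ = np.linalg.lstsq(phi, coeffs, rcond=None)
    return mean, modes, pce


def predict_field(x_new, mean, modes, pce, degree=3):
    """Reconstruct the predicted field for new (rescaled) inputs."""
    return mean + (legendre_design(x_new, degree) @ pce) @ modes


# usage sketch:
# mean, modes, pce = fit_pod_pce(snapshots, x_train, n_modes=5, degree=3)
# field_hat = predict_field(x_test, mean, modes, pce, degree=3)
```

In this sketch the PCE acts as an explicit polynomial map from the input parameters to each retained POD coefficient, which is what makes the resulting model interpretable mode by mode.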
Related papers
- You are out of context! [0.0]
New data can act as forces stretching, compressing, or twisting the geometric relationships learned by a model.
We propose a novel drift detection methodology for machine learning (ML) models based on the concept of "deformation" in the vector space representation of data.
arXiv Detail & Related papers (2024-11-04T10:17:43Z) - Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data.
We propose a method called Stratified Prediction-Powered Inference (StratPPI).
We show that the basic PPI estimates can be considerably improved by employing simple data stratification strategies.
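For orientation, a minimal sketch of the classical PPI mean estimator and a naive stratified variant is given below; the per-stratum weighting and the function names are assumptions for illustration, not the StratPPI estimator of the paper.

```python
import numpy as np


def ppi_mean(y_labeled, preds_labeled, preds_unlabeled):
    """Classical prediction-powered estimate of a mean: the model's mean
    prediction on unlabeled data plus a bias correction ("rectifier")
    estimated on the small human-labeled subset."""
    rectifier = np.mean(y_labeled - preds_labeled)
    return np.mean(preds_unlabeled) + rectifier


def stratified_ppi_mean(y_labeled, preds_labeled, strata_labeled,
                        preds_unlabeled, strata_unlabeled):
    """Naive stratified variant: apply the PPI mean within each stratum and
    recombine with the strata proportions of the unlabeled pool. Assumes
    every stratum appears in both the labeled and unlabeled sets."""
    strata = np.unique(strata_unlabeled)
    weights = np.array([(strata_unlabeled == s).mean() for s in strata])
    per_stratum = np.array([
        ppi_mean(y_labeled[strata_labeled == s],
                 preds_labeled[strata_labeled == s],
                 preds_unlabeled[strata_unlabeled == s])
        for s in strata
    ])
    return float(np.sum(weights * per_stratum))
```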
arXiv Detail & Related papers (2024-06-06T17:37:39Z) - Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z) - Minimally Supervised Learning using Topological Projections in
Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our proposed method first trains SOMs on unlabeled data and then assigns a minimal number of available labeled data points to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
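A rough illustration of the label-from-BMU idea is sketched below using the third-party minisom package; the nearest-labeled-unit rule is an assumption for illustration, not the authors' exact topological projection scheme.

```python
import numpy as np
from minisom import MiniSom  # third-party package, assumed available


def som_label_propagation(x_unlabeled, x_labeled, y_labeled,
                          grid=(10, 10), n_iter=5000):
    """Train a SOM on unlabeled data, attach the few labels to their BMUs,
    then label any new sample by the nearest labeled unit on the map grid."""
    som = MiniSom(grid[0], grid[1], x_unlabeled.shape[1],
                  sigma=1.0, learning_rate=0.5, random_seed=0)
    som.train_random(x_unlabeled, n_iter)

    # best matching unit (grid coordinates) for each labeled point
    labeled_bmus = np.array([som.winner(x) for x in x_labeled])

    def predict(x):
        i, j = som.winner(x)
        d = np.linalg.norm(labeled_bmus - np.array([i, j]), axis=1)
        return y_labeled[np.argmin(d)]

    return predict
```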
arXiv Detail & Related papers (2024-01-12T22:51:48Z) - Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on the fly.
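As background, the sketch below implements a plain bootstrap particle filter for a 1-D Gaussian state-space model, i.e. the kind of SMC baseline whose proposal VSMC-style methods learn and adapt; it is not the online VSMC algorithm itself, and the model parameters are illustrative.

```python
import numpy as np


def bootstrap_particle_filter(y, n_particles=500, phi=0.9,
                              sigma_x=1.0, sigma_y=1.0, seed=None):
    """Bootstrap particle filter for x_t = phi*x_{t-1} + N(0, sigma_x^2),
    y_t = x_t + N(0, sigma_y^2). Returns filtered means and a log-likelihood
    estimate; the proposal is simply the transition kernel."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, sigma_x, n_particles)  # crude prior at t = 0
    means, loglik = [], 0.0
    for obs in y:
        # propagate with the (bootstrap) proposal = transition density
        particles = phi * particles + rng.normal(0.0, sigma_x, n_particles)
        logw = -0.5 * ((obs - particles) / sigma_y) ** 2   # Gaussian log-weights (up to const.)
        w = np.exp(logw - logw.max())
        loglik += np.log(w.mean()) + logw.max() - 0.5 * np.log(2 * np.pi * sigma_y ** 2)
        w /= w.sum()
        means.append(np.sum(w * particles))
        # multinomial resampling
        particles = particles[rng.choice(n_particles, n_particles, p=w)]
    return np.array(means), loglik
```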
arXiv Detail & Related papers (2023-12-19T21:45:38Z) - Proximal Symmetric Non-negative Latent Factor Analysis: A Novel Approach
to Highly-Accurate Representation of Undirected Weighted Networks [2.1797442801107056]
Undirected Weighted Networks (UWNs) are commonly found in big-data-related applications.
Existing models fail to capture either their intrinsic symmetry or their low data density.
A Proximal Symmetric Non-negative Latent-factor-analysis model is therefore proposed.
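For context, a generic symmetric non-negative factorization of an undirected weighted adjacency matrix via projected gradient descent is sketched below; the proposed proximal model is more elaborate, so this is only the baseline idea of representing a UWN as H H^T with non-negative latent factors.

```python
import numpy as np


def symmetric_nmf(adj, rank=8, lr=1e-3, n_iter=2000, seed=0):
    """Fit adj ≈ H @ H.T with H >= 0 by projected gradient descent on the
    squared Frobenius loss; adj is a symmetric non-negative matrix.
    The step size lr may need tuning to the scale of adj."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    h = rng.random((n, rank)) * 0.1
    for _ in range(n_iter):
        grad = 4.0 * (h @ (h.T @ h) - adj @ h)  # gradient of ||adj - H H^T||_F^2
        h = np.maximum(0.0, h - lr * grad)      # projection onto H >= 0
    return h
```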
arXiv Detail & Related papers (2023-06-06T13:03:24Z) - PCENet: High Dimensional Surrogate Modeling for Learning Uncertainty [15.781915567005251]
We present a novel surrogate model for representation learning and uncertainty quantification.
The proposed model combines a neural network approach for dimensionality reduction of the (potentially high-dimensional) data, with a surrogate model method for learning the data distribution.
Our model enables us to (a) learn a representation of the data, (b) estimate uncertainty in the high-dimensional data system, and (c) match high order moments of the output distribution.
arXiv Detail & Related papers (2022-02-10T14:42:51Z) - A survey of unsupervised learning methods for high-dimensional
uncertainty quantification in black-box-type problems [0.0]
We construct surrogate models for uncertainty quantification (UQ) on complex partial differential equations (PDEs).
The curse of dimensionality can be alleviated by projecting the data onto a lower-dimensional subspace with suitable unsupervised learning techniques.
We demonstrate both the advantages and limitations of a suitable m-PCE model and conclude that it provides a cost-effective approach to high-dimensional UQ.
arXiv Detail & Related papers (2022-02-09T16:33:40Z) - TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z) - Supervised Linear Dimension-Reduction Methods: Review, Extensions, and
Comparisons [6.71092092685492]
Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling.
This paper reviews selected techniques, extends some of them, and compares their performance through simulations.
Two of these techniques, partial least squares (PLS) and least-squares PCA (LSPCA), consistently outperform the others in this study.
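As a quick illustration of the supervised-versus-unsupervised contrast, the snippet below compares scikit-learn's PLSRegression with a PCA-then-regression pipeline on synthetic data; it mirrors the general setup of such comparisons, not the paper's simulation study.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
# response depends on a single direction of X, plus noise
y = X @ rng.normal(size=20) + 0.5 * rng.normal(size=300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# supervised reduction: components chosen to covary with y
pls = PLSRegression(n_components=2).fit(X_tr, y_tr)

# unsupervised reduction: components chosen by variance of X alone
pcr = make_pipeline(PCA(n_components=2), LinearRegression()).fit(X_tr, y_tr)

print("PLS R^2:", pls.score(X_te, y_te))
print("PCR R^2:", pcr.score(X_te, y_te))
```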
arXiv Detail & Related papers (2021-09-09T17:57:25Z) - Repulsive Mixture Models of Exponential Family PCA for Clustering [127.90219303669006]
The mixture extension of exponential family principal component analysis (EPCA) was designed to encode much more structural information about the data distribution than the traditional EPCA.
The traditional mixture of local EPCAs has the problem of model redundancy, i.e., overlaps among mixing components, which may cause ambiguity for data clustering.
In this paper, a repulsiveness-encouraging prior is introduced among mixing components and a diversified EPCA mixture (DEPCAM) model is developed in the Bayesian framework.
arXiv Detail & Related papers (2020-04-07T04:07:29Z)