Quantile-constrained Wasserstein projections for robust interpretability
of numerical and machine learning models
- URL: http://arxiv.org/abs/2209.11539v1
- Date: Fri, 23 Sep 2022 11:58:03 GMT
- Title: Quantile-constrained Wasserstein projections for robust interpretability
of numerical and machine learning models
- Authors: Marouane Il Idrissi (EDF R&D PRISME, SINCLAIR AI Lab, IMT), Nicolas
Bousquet (EDF R&D PRISME, SINCLAIR AI Lab, LPSM), Fabrice Gamboa (IMT),
Bertrand Iooss (EDF R&D PRISME, SINCLAIR AI Lab, IMT, GdR MASCOT-NUM),
Jean-Michel Loubes (IMT)
- Abstract summary: The study of black-box models is often based on sensitivity analysis involving a probabilistic structure imposed on the inputs.
Our work aims at unifying the UQ and ML interpretability approaches by providing relevant and easy-to-use tools for both paradigms.
- Score: 18.771531343438227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robustness studies of black-box models are recognized as a necessary task for
numerical models based on structural equations and predictive models learned
from data. These studies must assess the model's robustness to possible
misspecification regarding its inputs (e.g., covariate shift). The study of
black-box models, through the prism of uncertainty quantification (UQ), is
often based on sensitivity analysis involving a probabilistic structure imposed
on the inputs, while ML models are solely constructed from observed data. Our
work aims at unifying the UQ and ML interpretability approaches by providing
relevant and easy-to-use tools for both paradigms. To provide a generic and
understandable framework for robustness studies, we define perturbations of
input information relying on quantile constraints and projections with respect
to the Wasserstein distance between probability measures, while preserving
their dependence structure. We show that this perturbation problem can be
analytically solved. Ensuring regularity constraints by means of isotonic
polynomial approximations leads to smoother perturbations, which can be more
suitable in practice. Numerical experiments on real case studies, from the UQ
and ML fields, highlight the computational feasibility of such studies and
provide local and global insights on the robustness of black-box models to
input perturbations.
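In one dimension, the Wasserstein-2 distance between two measures equals the L2 distance between their quantile functions, so the quantile-constrained projection described in the abstract admits a simple closed form: the empirical quantile function is clipped around the constrained level. The sketch below illustrates this idea on a sorted sample; the function name and the sample-based discretization are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def quantile_constrained_projection(samples, alpha, target_q):
    """Wasserstein-2 projection of an empirical 1-D distribution onto the
    set of distributions whose alpha-quantile equals target_q.

    In 1-D, W2 is the L2 distance between quantile functions, so the
    projection clips the empirical quantile function (the sorted sample)
    on one side of the constrained level alpha.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    levels = (np.arange(n) + 0.5) / n  # quantile levels of the sorted sample
    y = x.copy()
    if target_q >= np.quantile(x, alpha):
        # raise the upper tail so the alpha-quantile reaches target_q;
        # mass below level alpha is left untouched
        mask = levels >= alpha
        y[mask] = np.maximum(y[mask], target_q)
    else:
        # symmetric case: lower the lower tail
        mask = levels <= alpha
        y[mask] = np.minimum(y[mask], target_q)
    return y  # perturbed sample, still sorted (a valid quantile function)
```

The clipped quantile function is the pointwise-closest nondecreasing function satisfying the constraint, which is why the projection is analytic here; the paper's isotonic polynomial approximations can then be seen as a smooth surrogate for this piecewise clipping.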
Related papers
- Physics-constrained polynomial chaos expansion for scientific machine learning and uncertainty quantification [6.739642016124097]
We present a novel physics-constrained chaos expansion as a surrogate modeling method capable of performing both scientific machine learning (SciML) and uncertainty quantification (UQ) tasks.
The proposed method seamlessly integrates SciML into UQ and vice versa, which allows it to quantify the uncertainties in SciML tasks effectively and leverage SciML for improved uncertainty assessment during UQ-related tasks.
arXiv Detail & Related papers (2024-02-23T06:04:15Z)
- Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z)
- Robust Neural Posterior Estimation and Statistical Model Criticism [1.5749416770494706]
We argue that modellers must treat simulators as idealistic representations of the true data generating process.
In this work we revisit neural posterior estimation (NPE), a class of algorithms that enable black-box parameter inference in simulation models.
We find that the presence of misspecification, in contrast, leads to unreliable inference when NPE is used naively.
arXiv Detail & Related papers (2022-10-12T20:06:55Z)
- Robustness of Machine Learning Models Beyond Adversarial Attacks [0.0]
We show that the widely used concept of adversarial robustness and closely related metrics are not necessarily valid metrics for determining the robustness of ML models.
We propose a flexible approach that models possible perturbations in input data individually for each application.
This is then combined with a probabilistic approach that computes the likelihood that a real-world perturbation will change a prediction.
arXiv Detail & Related papers (2022-04-21T12:09:49Z)
- Hessian-based toolbox for reliable and interpretable machine learning in physics [58.720142291102135]
We present a toolbox for interpretability and reliability that is agnostic of the model architecture.
It provides a notion of the influence of the input data on the prediction at a given test point, an estimation of the uncertainty of the model predictions, and an agnostic score for the model predictions.
Our work opens the road to the systematic use of interpretability and reliability methods in ML applied to physics and, more generally, science.
arXiv Detail & Related papers (2021-08-04T16:32:59Z)
- A data-driven peridynamic continuum model for upscaling molecular dynamics [3.1196544696082613]
We propose a learning framework to extract, from molecular dynamics data, an optimal Linear Peridynamic Solid model.
We provide sufficient well-posedness conditions for discretized LPS models with sign-changing influence functions.
This framework guarantees that the resulting model is mathematically well-posed, physically consistent, and that it generalizes well to settings that are different from the ones used during training.
arXiv Detail & Related papers (2021-08-04T07:07:47Z)
- Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is often more interesting to understand the properties of a model and which of its parts could be replaced to obtain better results.
We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z)
- Calibrating Over-Parametrized Simulation Models: A Framework via Eligibility Set [3.862247454265944]
We develop a framework for constructing calibration schemes that satisfy rigorous frequentist statistical guarantees.
We demonstrate our methodology on several numerical examples, including an application to calibration of a limit order book market simulator.
arXiv Detail & Related papers (2021-05-27T00:59:29Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that characterizes statistical accuracy through the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.