Constructing Effective Machine Learning Models for the Sciences: A
Multidisciplinary Perspective
- URL: http://arxiv.org/abs/2211.11680v1
- Date: Mon, 21 Nov 2022 17:48:44 GMT
- Title: Constructing Effective Machine Learning Models for the Sciences: A
Multidisciplinary Perspective
- Authors: Alice E. A. Allen, Alexandre Tkatchenko
- Abstract summary: We show how flexible non-linear solutions will not always improve upon manually adding transforms and interactions between variables to linear regression models.
We discuss how to recognize this before constructing a data-driven model and how such analysis can help us move to intrinsically interpretable regression models.
- Score: 77.53142165205281
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning from data has led to substantial advances in a multitude of
disciplines, including text and multimedia search, speech recognition, and
autonomous-vehicle navigation. Can machine learning enable similar leaps in the
natural and social sciences? This is certainly the expectation in many
scientific fields and recent years have seen a plethora of applications of
non-linear models to a wide range of datasets. However, flexible non-linear
solutions will not always improve upon manually adding transforms and
interactions between variables to linear regression models. We discuss how to
recognize this before constructing a data-driven model and how such analysis
can help us move to intrinsically interpretable regression models. Furthermore,
for a variety of applications in the natural and social sciences we demonstrate
why improvements may be seen with more complex regression models and why they
may not.
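To make the abstract's point concrete, below is a minimal sketch (my own synthetic example, not taken from the paper) comparing a plain linear regression, a linear regression with a manually added interaction term, and a flexible non-linear model. When the true signal is a simple interaction, the hand-crafted linear model can match or beat the black box.

```python
# Minimal sketch (not from the paper): plain linear model vs. linear model with a
# manually added interaction term vs. a flexible non-linear model, on synthetic data
# whose true signal contains an x1*x2 interaction.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = 1.5 * X[:, 0] + 0.8 * X[:, 0] * X[:, 1] + rng.normal(scale=0.1, size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def with_interaction(X):
    # Manually added interaction term x1*x2, chosen from domain knowledge.
    return np.column_stack([X, X[:, 0] * X[:, 1]])

models = {
    "linear": (LinearRegression(), X_tr, X_te),
    "linear + interaction": (LinearRegression(), with_interaction(X_tr), with_interaction(X_te)),
    "random forest": (RandomForestRegressor(random_state=0), X_tr, X_te),
}
for name, (model, A_tr, A_te) in models.items():
    model.fit(A_tr, y_tr)
    print(f"{name:22s} test MSE: {mean_squared_error(y_te, model.predict(A_te)):.4f}")
```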
Related papers
- Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations [52.11801730860999]
In recent years, the robot learning community has shown increasing interest in using deep generative models to capture the complexity of large datasets.
We present the different types of models that the community has explored, such as energy-based models, diffusion models, action value maps, and generative adversarial networks.
We also present the different types of applications in which deep generative models have been used, from grasp generation to trajectory generation or cost learning.
arXiv Detail & Related papers (2024-08-08T11:34:31Z) - Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning
Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence demonstrating that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
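As a rough illustration only (the details below are my assumptions, not the authors' exact PGU algorithm), a projected-gradient unlearning step can be sketched as: ascend the loss on the forget data, but first project that gradient onto the directions orthogonal to the retained-data gradient, so the update interferes as little as possible with retained knowledge.

```python
# Toy illustration (assumptions, not the authors' exact PGU algorithm):
# project the unlearning gradient away from the retained-data gradient so the
# update minimally disturbs knowledge about the remaining dataset.
import numpy as np

def project_out(g_forget, g_retain, eps=1e-12):
    """Remove from g_forget its component along g_retain."""
    g_retain = g_retain.ravel()
    coef = g_forget.ravel() @ g_retain / (g_retain @ g_retain + eps)
    return g_forget - coef * g_retain.reshape(g_forget.shape)

def grad(w, X, y):
    # Gradient of mean squared error for a linear model.
    return 2 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
w = rng.normal(size=3)
X_retain, y_retain = rng.normal(size=(50, 3)), rng.normal(size=50)
X_forget, y_forget = rng.normal(size=(10, 3)), rng.normal(size=10)

for _ in range(100):
    g_f = grad(w, X_forget, y_forget)    # loss we want to *increase* (unlearn)
    g_r = grad(w, X_retain, y_retain)    # knowledge we want to keep
    w += 0.01 * project_out(g_f, g_r)    # ascend the forget loss, projected
```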
arXiv Detail & Related papers (2023-12-07T07:17:24Z) - A spectrum of physics-informed Gaussian processes for regression in
engineering [0.0]
Despite the growing availability of sensing and data in general, we remain unable to fully characterise many in-service engineering systems and structures using a purely data-driven approach.
This paper pursues the combination of machine learning technology and physics-based reasoning to enhance our ability to make predictive models with limited data.
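One common way to combine physics-based reasoning with a Gaussian process, sketched below under my own assumptions (this is not the paper's specific formulation), is to use a physics model as the GP prior mean, so that only the discrepancy has to be learned from limited data.

```python
# Hedged sketch: a Gaussian process whose prior mean is a simple physics-based
# model, so the GP only learns the data-driven discrepancy from a few observations.
import numpy as np

def rbf(a, b, length=1.0, var=1.0):
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def physics_mean(x):
    # Placeholder physics model, e.g. an idealised linear stiffness law.
    return 2.0 * x

# A few noisy observations of the true (physics + discrepancy) response.
rng = np.random.default_rng(0)
x_obs = np.array([0.2, 0.9, 1.7, 2.5])
y_obs = 2.0 * x_obs + 0.3 * np.sin(3 * x_obs) + rng.normal(scale=0.05, size=4)

noise = 0.05 ** 2
K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
x_new = np.linspace(0, 3, 7)
K_star = rbf(x_new, x_obs)

# GP posterior mean: physics prior mean plus learned correction.
residual = y_obs - physics_mean(x_obs)
y_pred = physics_mean(x_new) + K_star @ np.linalg.solve(K, residual)
print(np.round(y_pred, 3))
```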
arXiv Detail & Related papers (2023-09-19T14:39:03Z) - Interpreting and generalizing deep learning in physics-based problems with functional linear models [1.1440052544554358]
Interpretability is crucial and often desired in modeling physical systems.
We present test cases in solid mechanics, fluid mechanics, and transport.
Our study underscores the significance of interpretable representation in scientific machine learning.
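A hedged sketch of the general idea of a functional linear surrogate (my own toy construction, not the paper's code): project input and output fields onto low-rank SVD/POD bases and fit a small linear map between the modal coefficients, which stays easy to inspect.

```python
# Toy functional linear surrogate: map input fields to output fields through a
# low-rank (POD/SVD) basis, keeping the learned map linear and inspectable.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_grid = 200, 64
U_in = rng.normal(size=(n_samples, n_grid))            # input fields
A_true = rng.normal(size=(n_grid, n_grid)) / n_grid
U_out = U_in @ A_true + 0.01 * rng.normal(size=(n_samples, n_grid))

# Low-rank bases for inputs and outputs via SVD (POD modes).
r = 8
_, _, Vin = np.linalg.svd(U_in, full_matrices=False)
_, _, Vout = np.linalg.svd(U_out, full_matrices=False)
Phi_in, Phi_out = Vin[:r].T, Vout[:r].T                # (n_grid, r)

# Linear map between modal coefficients, fit by least squares.
C_in, C_out = U_in @ Phi_in, U_out @ Phi_out
W, *_ = np.linalg.lstsq(C_in, C_out, rcond=None)       # (r, r), easy to inspect

pred = (U_in @ Phi_in @ W) @ Phi_out.T
print("relative error:", np.linalg.norm(pred - U_out) / np.linalg.norm(U_out))
```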
arXiv Detail & Related papers (2023-07-10T14:01:29Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - Bayesian Active Learning for Discrete Latent Variable Models [19.852463786440122]
Active learning seeks to reduce the amount of data required to fit the parameters of a model.
Latent variable models play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines.
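A toy illustration of Bayesian active learning with a discrete latent variable (my own construction, not the paper's method): the latent state indexes one of three candidate response curves, and each query is chosen to maximise the expected information gain about that state.

```python
# Toy Bayesian active learning: a discrete latent variable z selects one of three
# Bernoulli response curves; queries maximise mutual information with z.
import numpy as np

def response_prob(z, x):
    # Three hypothetical response curves (the discrete latent states).
    slopes = [1.0, 3.0, 0.3]
    return 1.0 / (1.0 + np.exp(-slopes[z] * x))

def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum()

rng = np.random.default_rng(0)
true_z = 1
candidates = np.linspace(-3, 3, 25)
posterior = np.ones(3) / 3

for step in range(15):
    # Expected information gain for each candidate query.
    gains = []
    for x in candidates:
        p1 = np.array([response_prob(z, x) for z in range(3)])
        p_marg = posterior @ p1
        post1 = posterior * p1 / p_marg
        post0 = posterior * (1 - p1) / (1 - p_marg)
        h_post = p_marg * entropy(post1) + (1 - p_marg) * entropy(post0)
        gains.append(entropy(posterior) - h_post)
    x_query = candidates[int(np.argmax(gains))]

    # Simulate the response and update the posterior over the latent state.
    y = rng.random() < response_prob(true_z, x_query)
    like = np.array([response_prob(z, x_query) if y else 1 - response_prob(z, x_query)
                     for z in range(3)])
    posterior = posterior * like
    posterior /= posterior.sum()

print("posterior over latent states:", np.round(posterior, 3))
```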
arXiv Detail & Related papers (2022-02-27T19:07:12Z) - Learning continuous models for continuous physics [94.42705784823997]
We develop a test based on numerical analysis theory to validate machine learning models for science and engineering applications.
Our results illustrate how principled numerical analysis methods can be coupled with existing ML training/testing methodologies to validate models for science and engineering applications.
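A hedged sketch of a numerical-analysis style validation check (my illustration, not the paper's specific test): if a learned model really represents continuous dynamics dx/dt = f(x), then forward-Euler rollouts should converge to a reference solution at roughly first order as the step size shrinks.

```python
# Convergence check: integrate a (stand-in) learned vector field with
# progressively smaller steps and verify the expected first-order behaviour.
import numpy as np

def f_learned(x):
    # Stand-in for a learned vector field; here simple exponential decay.
    return -x

def rollout(x0, dt, T):
    x, n = x0, int(round(T / dt))
    for _ in range(n):
        x = x + dt * f_learned(x)   # forward Euler
    return x

x0, T = 1.0, 1.0
reference = x0 * np.exp(-T)         # exact solution for this toy dynamics

prev_err = None
for dt in [0.1, 0.05, 0.025, 0.0125]:
    err = abs(rollout(x0, dt, T) - reference)
    rate = "" if prev_err is None else f"  observed order ~ {np.log2(prev_err / err):.2f}"
    print(f"dt={dt:<7} error={err:.2e}{rate}")
    prev_err = err
```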
arXiv Detail & Related papers (2022-02-17T07:56:46Z) - LocalGLMnet: interpretable deep learning for tabular data [0.0]
We propose a new network architecture that shares key features with generalized linear models.
Our approach provides an additive decomposition in the spirit of Shapley values and integrated gradients.
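A compact sketch of the idea as I read the abstract (the architectural details below are my assumptions, not the authors' exact design): a network outputs feature-wise coefficients beta_j(x), so the prediction decomposes additively as beta_0 + sum_j beta_j(x) * x_j, directly exposing each feature's contribution.

```python
# Sketch of a GLM-like network: a small network predicts local coefficients
# beta_j(x), giving an additive per-feature decomposition of the prediction.
import torch
import torch.nn as nn

class LocalGLMSketch(nn.Module):
    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        self.beta = nn.Sequential(
            nn.Linear(n_features, hidden), nn.Tanh(),
            nn.Linear(hidden, n_features),
        )
        self.beta0 = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b = self.beta(x)                       # beta_j(x): local regression coefficients
        contributions = b * x                  # per-feature additive contributions
        return self.beta0 + contributions.sum(dim=1), contributions

model = LocalGLMSketch(n_features=4)
x = torch.randn(8, 4)
y_hat, contrib = model(x)
print(y_hat.shape, contrib.shape)              # torch.Size([8]) torch.Size([8, 4])
```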
arXiv Detail & Related papers (2021-07-23T07:38:33Z) - Model-agnostic multi-objective approach for the evolutionary discovery
of mathematical models [55.41644538483948]
In modern data science, it is often more important to understand the properties of a model and which of its parts could be replaced to obtain better results.
We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
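A hedged toy sketch of the general recipe (not the authors' algorithm): evolve candidate feature subsets for a linear model while retaining the Pareto front that trades prediction error against model complexity, without ever looking inside the fitted model.

```python
# Toy model-agnostic multi-objective evolution: mutate feature subsets and keep
# the Pareto front of (prediction error, model complexity).
import numpy as np

rng = np.random.default_rng(0)
n, d = 300, 8
X = rng.normal(size=(n, d))
y = 2 * X[:, 0] - X[:, 3] + 0.1 * rng.normal(size=n)    # only two useful features

def evaluate(mask):
    if mask.sum() == 0:
        return np.var(y), 0
    Xs = X[:, mask]
    w, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    return np.mean((Xs @ w - y) ** 2), int(mask.sum())   # objectives: error, complexity

def dominated(a, b):
    # True if b is at least as good as a in all objectives and strictly better in one.
    return all(bi <= ai for ai, bi in zip(a, b)) and any(bi < ai for ai, bi in zip(a, b))

population = [rng.random(d) < 0.5 for _ in range(20)]
for _ in range(30):
    # Mutate each individual by flipping one random feature in or out.
    children = []
    for mask in population:
        child = mask.copy()
        child[rng.integers(d)] ^= True
        children.append(child)
    population += children
    scores = [evaluate(m) for m in population]
    # Keep only non-dominated individuals (the current Pareto front).
    keep = [i for i, s in enumerate(scores)
            if not any(dominated(s, t) for j, t in enumerate(scores) if j != i)]
    population = [population[i] for i in keep][:20]

for m in population:
    print("features:", np.flatnonzero(m), "objectives:", np.round(evaluate(m), 4))
```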
arXiv Detail & Related papers (2021-07-07T11:17:09Z) - Flexible Bayesian Nonlinear Model Configuration [10.865434331546126]
Linear, or simple parametric, models are often not sufficient to describe complex relationships between input variables and a response.
We introduce a flexible approach for the construction and selection of highly flexible nonlinear parametric regression models.
A genetically modified mode jumping Markov chain Monte Carlo algorithm is adopted to perform Bayesian inference.
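A hedged sketch of the general idea only (the paper's genetically modified mode jumping MCMC sampler is far more sophisticated): generate candidate nonlinear feature transforms and compare the resulting regression models, here using BIC as a crude stand-in for Bayesian model selection.

```python
# Toy nonlinear model configuration: enumerate small sets of nonlinear
# transforms and select among the resulting linear-in-parameters models by BIC.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 400
x = rng.uniform(-2, 2, size=n)
y = np.sin(2 * x) + 0.5 * x ** 2 + 0.1 * rng.normal(size=n)

transforms = {
    "x": x, "x^2": x ** 2, "x^3": x ** 3,
    "sin(2x)": np.sin(2 * x), "exp(x)": np.exp(x),
}

def bic(features):
    A = np.column_stack([np.ones(n)] + [transforms[f] for f in features])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((A @ w - y) ** 2)
    k = A.shape[1]
    return n * np.log(rss / n) + k * np.log(n)

candidates = [c for r in range(1, 4) for c in itertools.combinations(transforms, r)]
best = min(candidates, key=bic)
print("selected features:", best, " BIC:", round(bic(best), 1))
```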
arXiv Detail & Related papers (2020-03-05T21:20:55Z)