Interpretable Scientific Discovery with Symbolic Regression: A Review
- URL: http://arxiv.org/abs/2211.10873v2
- Date: Tue, 2 May 2023 12:09:58 GMT
- Title: Interpretable Scientific Discovery with Symbolic Regression: A Review
- Authors: Nour Makke and Sanjay Chawla
- Abstract summary: Symbolic regression is emerging as a promising machine learning method for learning mathematical expressions directly from data.
This survey presents a structured and comprehensive overview of symbolic regression methods and discusses their strengths and limitations.
- Score: 8.414043731621419
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Symbolic regression is emerging as a promising machine learning method for
learning succinct underlying interpretable mathematical expressions directly
from data. Whereas it has been traditionally tackled with genetic programming,
it has recently gained a growing interest in deep learning as a data-driven
model discovery method, achieving significant advances in various application
domains ranging from fundamental to applied sciences. This survey presents a
structured and comprehensive overview of symbolic regression methods and
discusses their strengths and limitations.
Related papers
- Discovering symbolic expressions with parallelized tree search [59.92040079807524]
Symbolic regression plays a crucial role in scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data.
For over a decade, existing algorithms have faced a critical bottleneck in accuracy and efficiency when handling complex problems.
We introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data.
arXiv Detail & Related papers (2024-07-05T10:41:15Z)
- Ontology Embedding: A Survey of Methods, Applications and Resources [54.3453925775069]
Ontologies are widely used for representing domain knowledge and meta data.
One straightforward solution is to integrate statistical analysis and machine learning.
Numerous papers have been published on embedding, but a lack of systematic reviews hinders researchers from gaining a comprehensive understanding of this field.
arXiv Detail & Related papers (2024-06-16T14:49:19Z)
- A Comparison of Recent Algorithms for Symbolic Regression to Genetic Programming [0.0]
Symbolic regression aims to model and map data in a way that can be understood by scientists.
Recent advancements have attempted to bridge the gap between these two fields.
arXiv Detail & Related papers (2024-06-05T19:01:43Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- Learned reconstruction methods for inverse problems: sample error estimates [0.8702432681310401]
This dissertation addresses the generalization properties of learned reconstruction methods and, specifically, performs their sample error analysis.
A rather general strategy is proposed, whose assumptions are met for a large class of inverse problems and learned methods.
arXiv Detail & Related papers (2023-12-21T17:56:19Z)
- Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z)
- Constructing Effective Machine Learning Models for the Sciences: A Multidisciplinary Perspective [77.53142165205281]
We show how flexible non-linear solutions will not always improve upon manually adding transforms and interactions between variables to linear regression models.
We discuss how to recognize this before constructing a data-driven model and how such analysis can help us move to intrinsically interpretable regression models.
arXiv Detail & Related papers (2022-11-21T17:48:44Z)
- Symbolic Regression for Space Applications: Differentiable Cartesian Genetic Programming Powered by Multi-objective Memetic Algorithms [10.191757341020216]
We propose a new multi-objective memetic algorithm that exploits a differentiable Cartesian Genetic Programming encoding to learn constants during evolutionary loops.
We show that this approach is competitive with or outperforms machine-learned black-box regression models and hand-engineered fits for two applications from space: Mars Express thermal power estimation and the determination of the age of stars by gyrochronology.
arXiv Detail & Related papers (2022-06-13T14:44:15Z)
- A Reinforcement Learning Approach to Domain-Knowledge Inclusion Using Grammar Guided Symbolic Regression [0.0]
We propose a Reinforcement-Based Grammar-Guided Symbolic Regression (RBG2-SR) method.
RBG2-SR constrains the representational space with domain knowledge, using a context-free grammar as the reinforcement action space.
We show that our method is competitive against other state-of-the-art methods on the benchmarks and offers the best error-complexity trade-off.
arXiv Detail & Related papers (2022-02-09T10:13:14Z)
- SymbolicGPT: A Generative Transformer Model for Symbolic Regression [3.685455441300801]
We present SymbolicGPT, a novel transformer-based language model for symbolic regression.
We show that our model performs strongly compared to competing models with respect to accuracy, running time, and data efficiency.
arXiv Detail & Related papers (2021-06-27T03:26:35Z)
- Nonparametric Estimation of Heterogeneous Treatment Effects: From Theory to Learning Algorithms [91.3755431537592]
We analyze four broad meta-learning strategies which rely on plug-in estimation and pseudo-outcome regression.
We highlight how this theoretical reasoning can be used to guide principled algorithm design and translate our analyses into practice.
arXiv Detail & Related papers (2021-01-26T17:11:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.