Discovering symbolic expressions with parallelized tree search
- URL: http://arxiv.org/abs/2407.04405v1
- Date: Fri, 5 Jul 2024 10:41:15 GMT
- Title: Discovering symbolic expressions with parallelized tree search
- Authors: Kai Ruan, Ze-Feng Gao, Yike Guo, Hao Sun, Ji-Rong Wen, Yang Liu
- Abstract summary: Symbolic regression plays a crucial role in scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data.
Existing algorithms have faced a critical accuracy and efficiency bottleneck for over a decade when handling complex problems.
We introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data.
- Score: 59.92040079807524
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Symbolic regression plays a crucial role in modern scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data. A grand challenge lies in the arduous search for parsimonious and generalizable mathematical formulas, in an infinite search space, while intending to fit the training data. Existing algorithms have faced a critical bottleneck of accuracy and efficiency for over a decade when handling complex problems, which essentially hinders the pace of applying symbolic regression for scientific exploration across interdisciplinary domains. To this end, we introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data. Through a series of extensive experiments, we demonstrate the superior accuracy and efficiency of PTS for equation discovery, which greatly outperforms the state-of-the-art baseline models on over 80 synthetic and experimental datasets (e.g., lifting its performance by up to a 99% accuracy improvement and an order-of-magnitude speed-up). PTS represents a key advance in accurate and efficient data-driven discovery of symbolic, interpretable models (e.g., underlying physical laws) and marks a pivotal transition towards scalable symbolic learning.
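A minimal, illustrative sketch of the general idea behind this line of work — generating candidate expression trees and scoring them against data in parallel — is given below. The abstract does not specify PTS's actual search procedure, so the operator set, tree generator, scoring, and parallelization shown here are assumptions for illustration only, not the paper's algorithm.

```python
# Hypothetical sketch: parallel evaluation of random expression-tree candidates
# for symbolic regression. NOT the paper's PTS algorithm; operators, tree depth,
# and scoring are illustrative assumptions.
import random
from concurrent.futures import ProcessPoolExecutor

import numpy as np

UNARY = {"sin": np.sin, "cos": np.cos, "exp": np.exp}
BINARY = {"+": np.add, "-": np.subtract, "*": np.multiply}


def random_tree(depth=3):
    """Build a random expression tree (nested tuples) over a single variable x."""
    if depth == 0 or random.random() < 0.3:
        return ("x",) if random.random() < 0.7 else ("const", random.uniform(-2, 2))
    if random.random() < 0.5:
        return (random.choice(list(UNARY)), random_tree(depth - 1))
    return (random.choice(list(BINARY)), random_tree(depth - 1), random_tree(depth - 1))


def evaluate(tree, x):
    """Evaluate an expression tree on a NumPy array x."""
    if tree[0] == "x":
        return x
    if tree[0] == "const":
        return np.full_like(x, tree[1])
    if tree[0] in UNARY:
        return UNARY[tree[0]](evaluate(tree[1], x))
    return BINARY[tree[0]](evaluate(tree[1], x), evaluate(tree[2], x))


def score(args):
    """Mean-squared error on the data for one randomly generated candidate tree."""
    seed, x, y = args
    random.seed(seed)                        # per-task seed so workers differ reproducibly
    tree = random_tree()
    with np.errstate(all="ignore"):          # exp overflow etc. may produce inf/nan
        err = float(np.mean((evaluate(tree, x) - y) ** 2))
    return (err if np.isfinite(err) else float("inf"), tree)


if __name__ == "__main__":
    x = np.linspace(-3, 3, 200)
    y = np.sin(x) + 0.5 * x                          # "hidden" law the search should recover
    jobs = [(seed, x, y) for seed in range(2000)]    # independent candidate trees
    with ProcessPoolExecutor() as pool:              # candidates scored in parallel workers
        best_err, best_tree = min(pool.map(score, jobs, chunksize=100), key=lambda r: r[0])
    print(f"best MSE {best_err:.4g}, tree: {best_tree}")
```

In practice, tree-search methods refine promising subtrees rather than sampling candidates independently; the sketch only illustrates why evaluating many candidates in parallel improves search throughput.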
Related papers
- An Efficient and Generalizable Symbolic Regression Method for Time Series Analysis [13.530431636528519]
We propose Neural-Enhanced Monte-Carlo Tree Search (NEMoTS) for time series.
NEMoTS significantly reduces the search space in symbolic regression and improves expression quality.
Experiments with three real-world datasets demonstrate NEMoTS's significant superiority in performance, efficiency, reliability, and interpretability. A minimal, illustrative sketch of neural-guided tree search appears after this list.
arXiv Detail & Related papers (2024-09-06T02:20:13Z) - Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data.
Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables.
We propose an instantiation of our framework, Deep Generative Symbolic Regression.
arXiv Detail & Related papers (2023-12-30T17:05:31Z) - Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z) - Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z) - Accelerating Understanding of Scientific Experiments with End to End Symbolic Regression [12.008215939224382]
We develop a deep neural network to address the problem of learning free-form symbolic expressions from raw data.
We train our neural network on a synthetic dataset consisting of data tables of varying length and varying levels of noise.
We validate our technique by running on a public dataset from behavioral science.
arXiv Detail & Related papers (2021-12-07T22:28:53Z) - Modeling Item Response Theory with Stochastic Variational Inference [8.369065078321215]
We introduce a variational Bayesian inference algorithm for Item Response Theory (IRT).
Applying this method to five large-scale item response datasets yields higher log likelihoods and higher accuracy in imputing missing data.
The algorithm implementation is open-source and easy to use.
arXiv Detail & Related papers (2021-08-26T05:00:27Z) - Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z) - SymbolicGPT: A Generative Transformer Model for Symbolic Regression [3.685455441300801]
We present SymbolicGPT, a novel transformer-based language model for symbolic regression.
We show that our model performs strongly compared to competing models in terms of accuracy, running time, and data efficiency.
arXiv Detail & Related papers (2021-06-27T03:26:35Z)
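As referenced in the NEMoTS entry above, the sketch below illustrates one common form of neural-guided Monte-Carlo tree search: a PUCT-style selection rule in which a policy prior over grammar productions biases which branch of the expression-search tree is expanded next. The node fields, productions, prior values, and rewards are hypothetical placeholders, not NEMoTS's actual model or search procedure.

```python
# Hypothetical sketch: PUCT-style child selection for neural-guided MCTS over
# expression-grammar productions. Priors and rewards are hard-coded stand-ins
# for a policy network and a data-fitting reward.
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    production: str                 # grammar rule that led to this node, e.g. "E -> sin(E)"
    prior: float                    # probability assigned by a (here: fake) policy network
    visits: int = 0
    value_sum: float = 0.0
    children: list = field(default_factory=list)

    def q(self) -> float:
        """Average reward observed below this node."""
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(parent: Node, c_puct: float = 1.5) -> Node:
    """Pick the child maximizing Q + c_puct * prior * sqrt(N_parent) / (1 + N_child)."""
    total = sum(child.visits for child in parent.children)
    return max(
        parent.children,
        key=lambda ch: ch.q() + c_puct * ch.prior * math.sqrt(total + 1) / (1 + ch.visits),
    )


if __name__ == "__main__":
    root = Node("E", prior=1.0)
    # Fake policy-network output over three candidate productions.
    root.children = [
        Node("E -> sin(E)", prior=0.6),
        Node("E -> E + E", prior=0.3),
        Node("E -> x", prior=0.1),
    ]
    # A few simulated visits; rewards would normally come from fitting the
    # completed expression to data, and are hard-coded here for illustration.
    for node, reward in [(root.children[0], 0.9), (root.children[1], 0.4)]:
        node.visits += 1
        node.value_sum += reward
    print(select_child(root).production)
```

The c_puct constant trades off exploiting branches with high observed reward against exploring branches the prior favors but that have received few visits.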