Discovering symbolic expressions with parallelized tree search
- URL: http://arxiv.org/abs/2407.04405v1
- Date: Fri, 5 Jul 2024 10:41:15 GMT
- Title: Discovering symbolic expressions with parallelized tree search
- Authors: Kai Ruan, Ze-Feng Gao, Yike Guo, Hao Sun, Ji-Rong Wen, Yang Liu
- Abstract summary: Symbolic regression plays a crucial role in scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data.
Existing algorithms have faced a critical accuracy and efficiency bottleneck for over a decade when handling complex problems.
We introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data.
- Score: 59.92040079807524
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Symbolic regression plays a crucial role in modern scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data. A grand challenge lies in the arduous search for parsimonious and generalizable mathematical formulas, in an infinite search space, while intending to fit the training data. Existing algorithms have faced a critical bottleneck of accuracy and efficiency for over a decade when handling complex problems, which essentially hinders the pace of applying symbolic regression for scientific exploration across interdisciplinary domains. To this end, we introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data. Through a series of extensive experiments, we demonstrate the superior accuracy and efficiency of PTS for equation discovery, which greatly outperforms the state-of-the-art baseline models on over 80 synthetic and experimental datasets (e.g., lifting its performance by up to a 99% accuracy improvement and an order-of-magnitude speed-up). PTS represents a key advance in accurate and efficient data-driven discovery of symbolic, interpretable models (e.g., underlying physical laws) and marks a pivotal transition towards scalable symbolic learning.
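A minimal, illustrative sketch of the general idea behind this line of work — generating candidate expression trees and scoring them against data in parallel — is given below. The abstract does not specify PTS's actual search procedure, so the operator set, tree generator, scoring, and parallelization shown here are assumptions for illustration only, not the paper's algorithm.

```python
# Hypothetical sketch: parallel evaluation of random expression-tree candidates
# for symbolic regression. NOT the paper's PTS algorithm; operators, tree depth,
# and scoring are illustrative assumptions.
import random
from concurrent.futures import ProcessPoolExecutor

import numpy as np

UNARY = {"sin": np.sin, "cos": np.cos, "exp": np.exp}
BINARY = {"+": np.add, "-": np.subtract, "*": np.multiply}


def random_tree(depth=3):
    """Build a random expression tree (nested tuples) over a single variable x."""
    if depth == 0 or random.random() < 0.3:
        return ("x",) if random.random() < 0.7 else ("const", random.uniform(-2, 2))
    if random.random() < 0.5:
        return (random.choice(list(UNARY)), random_tree(depth - 1))
    return (random.choice(list(BINARY)), random_tree(depth - 1), random_tree(depth - 1))


def evaluate(tree, x):
    """Evaluate an expression tree on a NumPy array x."""
    if tree[0] == "x":
        return x
    if tree[0] == "const":
        return np.full_like(x, tree[1])
    if tree[0] in UNARY:
        return UNARY[tree[0]](evaluate(tree[1], x))
    return BINARY[tree[0]](evaluate(tree[1], x), evaluate(tree[2], x))


def score(args):
    """Mean-squared error on the data for one randomly generated candidate tree."""
    seed, x, y = args
    random.seed(seed)                        # per-task seed so workers differ reproducibly
    tree = random_tree()
    with np.errstate(all="ignore"):          # exp overflow etc. may produce inf/nan
        err = float(np.mean((evaluate(tree, x) - y) ** 2))
    return (err if np.isfinite(err) else float("inf"), tree)


if __name__ == "__main__":
    x = np.linspace(-3, 3, 200)
    y = np.sin(x) + 0.5 * x                          # "hidden" law the search should recover
    jobs = [(seed, x, y) for seed in range(2000)]    # independent candidate trees
    with ProcessPoolExecutor() as pool:              # candidates scored in parallel workers
        best_err, best_tree = min(pool.map(score, jobs, chunksize=100), key=lambda r: r[0])
    print(f"best MSE {best_err:.4g}, tree: {best_tree}")
```

In practice, tree-search methods refine promising subtrees rather than sampling candidates independently; the sketch only illustrates why evaluating many candidates in parallel improves search throughput.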
Related papers
- An Efficient and Generalizable Symbolic Regression Method for Time Series Analysis [13.530431636528519]
We propose Neural-Enhanced Monte-Carlo Tree Search (NEMoTS) for time series.
NEMoTS significantly reduces the search space in symbolic regression and improves expression quality.
Experiments with three real-world datasets demonstrate NEMoTS's significant superiority in performance, efficiency, reliability, and interpretability. A minimal, illustrative sketch of neural-guided tree search appears after this list.
arXiv Detail & Related papers (2024-09-06T02:20:13Z) - Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data.
Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables.
We propose an instantiation of our framework, Deep Generative Symbolic Regression.
arXiv Detail & Related papers (2023-12-30T17:05:31Z) - Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z) - Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z) - Accelerating Understanding of Scientific Experiments with End to End Symbolic Regression [12.008215939224382]
We develop a deep neural network to address the problem of learning free-form symbolic expressions from raw data.
We train our neural network on a synthetic dataset consisting of data tables of varying length and varying levels of noise.
We validate our technique by running on a public dataset from behavioral science.
arXiv Detail & Related papers (2021-12-07T22:28:53Z) - Modeling Item Response Theory with Stochastic Variational Inference [8.369065078321215]
We introduce a variational Bayesian inference algorithm for Item Response Theory (IRT).
Applying this method to five large-scale item response datasets yields higher log likelihoods and higher accuracy in imputing missing data.
The algorithm implementation is open-source and easy to use.
arXiv Detail & Related papers (2021-08-26T05:00:27Z) - Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z) - SymbolicGPT: A Generative Transformer Model for Symbolic Regression [3.685455441300801]
We present SymbolicGPT, a novel transformer-based language model for symbolic regression.
We show that our model performs strongly compared to competing models in terms of accuracy, running time, and data efficiency.
arXiv Detail & Related papers (2021-06-27T03:26:35Z)
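As referenced in the NEMoTS entry above, the sketch below illustrates one common form of neural-guided Monte-Carlo tree search: a PUCT-style selection rule in which a policy prior over grammar productions biases which branch of the expression-search tree is expanded next. The node fields, productions, prior values, and rewards are hypothetical placeholders, not NEMoTS's actual model or search procedure.

```python
# Hypothetical sketch: PUCT-style child selection for neural-guided MCTS over
# expression-grammar productions. Priors and rewards are hard-coded stand-ins
# for a policy network and a data-fitting reward.
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    production: str                 # grammar rule that led to this node, e.g. "E -> sin(E)"
    prior: float                    # probability assigned by a (here: fake) policy network
    visits: int = 0
    value_sum: float = 0.0
    children: list = field(default_factory=list)

    def q(self) -> float:
        """Average reward observed below this node."""
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(parent: Node, c_puct: float = 1.5) -> Node:
    """Pick the child maximizing Q + c_puct * prior * sqrt(N_parent) / (1 + N_child)."""
    total = sum(child.visits for child in parent.children)
    return max(
        parent.children,
        key=lambda ch: ch.q() + c_puct * ch.prior * math.sqrt(total + 1) / (1 + ch.visits),
    )


if __name__ == "__main__":
    root = Node("E", prior=1.0)
    # Fake policy-network output over three candidate productions.
    root.children = [
        Node("E -> sin(E)", prior=0.6),
        Node("E -> E + E", prior=0.3),
        Node("E -> x", prior=0.1),
    ]
    # A few simulated visits; rewards would normally come from fitting the
    # completed expression to data, and are hard-coded here for illustration.
    for node, reward in [(root.children[0], 0.9), (root.children[1], 0.4)]:
        node.visits += 1
        node.value_sum += reward
    print(select_child(root).production)
```

The c_puct constant trades off exploiting branches with high observed reward against exploring branches the prior favors but that have received few visits.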