Probabilistic Regular Tree Priors for Scientific Symbolic Reasoning
- URL: http://arxiv.org/abs/2306.08506v2
- Date: Mon, 10 Jun 2024 04:39:52 GMT
- Title: Probabilistic Regular Tree Priors for Scientific Symbolic Reasoning
- Authors: Tim Schneider, Amin Totounferoush, Wolfgang Nowak, Steffen Staab
- Abstract summary: Symbolic Regression allows for the discovery of scientific equations from data.
There is a mismatch between the context-free grammars required to express the set of syntactically correct equations and the tree structure of those equations.
Our contributions are to (i) compactly express experts' prior beliefs about which equations are more likely to be expected by probabilistic Regular Tree Expressions (pRTE) and (ii) adapt Bayesian inference to make such priors efficiently available for symbolic regression encoded as finite state machines.
- Score: 12.369006238950092
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Symbolic Regression (SR) allows for the discovery of scientific equations from data. To limit the large search space of possible equations, prior knowledge has been expressed in terms of formal grammars that characterize subsets of arbitrary strings. However, there is a mismatch between context-free grammars required to express the set of syntactically correct equations, missing closure properties of the former, and a tree structure of the latter. Our contributions are to (i) compactly express experts' prior beliefs about which equations are more likely to be expected by probabilistic Regular Tree Expressions (pRTE), and (ii) adapt Bayesian inference to make such priors efficiently available for symbolic regression encoded as finite state machines. Our scientific case studies show its effectiveness in soil science to find sorption isotherms and for modeling hyper-elastic materials.
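As a rough illustration of what a probabilistic prior over equation trees can look like, the sketch below scores an expression tree under a small probabilistic regular tree grammar. This is a related formalism that also describes regular tree languages, not the pRTE notation or the inference procedure of the paper. The grammar, its probabilities, and the names `GRAMMAR` and `log_prior` are hypothetical and chosen only for this example.

```python
import math

# Hypothetical toy grammar (an assumption, not taken from the paper):
# nonterminal -> list of (probability, symbol, child nonterminals).
# The probabilities of each nonterminal sum to 1.
GRAMMAR = {
    "E": [
        (0.25, "+", ("E", "E")),
        (0.15, "*", ("E", "E")),
        (0.40, "x", ()),
        (0.20, "c", ()),
    ],
}


def log_prior(tree, nonterminal="E"):
    """Return the log prior of a tree written as (symbol, child, ...) tuples.

    Assumes at most one production per (symbol, arity) for each nonterminal,
    so every tree has at most one derivation.
    """
    symbol, *children = tree
    for prob, sym, child_nts in GRAMMAR[nonterminal]:
        if sym == symbol and len(child_nts) == len(children):
            # Multiply (add in log space) this production's probability
            # with the priors of all subtrees.
            return math.log(prob) + sum(
                log_prior(child, nt) for child, nt in zip(children, child_nts)
            )
    return float("-inf")  # the tree is not generated by the grammar


# The equation x + c * x as a tree.
tree = ("+", ("x",), ("*", ("c",), ("x",)))
print(math.exp(log_prior(tree)))
```

In a Bayesian symbolic regression setting, a score of this kind would be combined with a data likelihood; how the paper does this with finite state machines is not shown here.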
Related papers
- Discovering symbolic expressions with parallelized tree search [59.92040079807524]
Symbolic regression plays a crucial role in scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data.
For over a decade, existing algorithms have faced a critical bottleneck in accuracy and efficiency when handling complex problems.
We introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data.
Date: 2024-07-05T10:41:15Z
- Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data.
Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables.
We propose an instantiation of our framework, Deep Generative Symbolic Regression.
Date: 2023-12-30T17:05:31Z
- Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
Date: 2023-10-10T13:23:05Z
- A Hybrid System for Systematic Generalization in Simple Arithmetic Problems [70.91780996370326]
We propose a hybrid system capable of solving arithmetic problems that require compositional and systematic reasoning over sequences of symbols.
We show that the proposed system can accurately solve nested arithmetical expressions even when trained only on a subset including the simplest cases.
Date: 2023-06-29T18:35:41Z
- Incorporating Background Knowledge in Symbolic Regression using a Computer Algebra System [0.0]
Symbolic Regression (SR) can generate interpretable, concise expressions that fit a given dataset.
We specifically examine the addition of constraints to traditional genetic algorithm (GA) based SR (PySR) as well as a Markov-chain Monte Carlo (MCMC) based Bayesian SR architecture.
Date: 2023-01-27T18:59:25Z
- DISCOVER: Deep identification of symbolic open-form PDEs via enhanced reinforcement-learning [0.5156484100374059]
The working mechanisms of complex natural systems tend to abide by concise and profound partial differential equations (PDEs).
In this paper, an enhanced deep reinforcement-learning framework is proposed to uncover symbolic open-form PDEs with little prior knowledge.
Date: 2022-10-04T15:46:53Z
- Complex Event Forecasting with Prediction Suffix Trees: Extended Technical Report [70.7321040534471]
Complex Event Recognition (CER) systems have become popular in the past two decades due to their ability to "instantly" detect patterns on real-time streams of events.
There is a lack of methods for forecasting when a pattern might occur before such an occurrence is actually detected by a CER engine.
We present a formal framework that attempts to address the issue of Complex Event Forecasting.
Date: 2021-09-01T09:52:31Z
- A Study of Continuous Vector Representations for Theorem Proving [2.0518509649405106]
We develop an encoding that allows for logical properties to be preserved and is additionally reversible.
This means that the tree shape of a formula including all symbols can be reconstructed from the dense vector representation.
We propose datasets that can be used to train these syntactic and semantic properties.
Date: 2021-01-22T15:04:54Z
- Probabilistic Grammars for Equation Discovery [0.0]
We propose the use of probabilistic context-free grammars in equation discovery.
Probability grammars can be used to elegantly and flexibly formulate the parsimony principle.
Date: 2020-12-01T11:59:19Z
- Invariant Rationalization [84.1861516092232]
A typical rationalization criterion, i.e. maximum mutual information (MMI), finds the rationale that maximizes the prediction performance based only on the rationale.
We introduce a game-theoretic invariant rationalization criterion where the rationales are constrained to enable the same predictor to be optimal across different environments.
We show both theoretically and empirically that the proposed rationales can rule out spurious correlations, generalize better to different test scenarios, and align better with human judgments.
Date: 2020-03-22T00:50:27Z