RSRM: Reinforcement Symbolic Regression Machine
- URL: http://arxiv.org/abs/2305.14656v1
- Date: Wed, 24 May 2023 02:51:45 GMT
- Title: RSRM: Reinforcement Symbolic Regression Machine
- Authors: Yilong Xu, Yang Liu, Hao Sun
- Abstract summary: We propose a novel Reinforcement Symbolic Regression Machine (RSRM) that masters the capability of uncovering complex math equations from only scarce data.
The RSRM model is composed of three key modules: (1) a Monte Carlo tree search (MCTS) agent that explores optimal math expression trees consisting of pre-defined math operators and variables, (2) a Double Q-learning block that helps reduce the feasible search space of MCTS via properly understanding the distribution of reward, and (3) a modulated sub-tree discovery block that distills new math operators to improve representation ability of math expression trees.
- Score: 13.084113582897965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In nature, the behaviors of many complex systems can be described by
parsimonious math equations. Automatically distilling these equations from
limited data is cast as a symbolic regression process which hitherto remains a
grand challenge. Keen efforts in recent years have been placed on tackling this
issue and demonstrated success in symbolic regression. However, there still
exist bottlenecks that current methods struggle to break when the discrete
search space tends toward infinity and especially when the underlying math
formula is intricate. To this end, we propose a novel Reinforcement Symbolic
Regression Machine (RSRM) that masters the capability of uncovering complex
math equations from only scarce data. The RSRM model is composed of three key
modules: (1) a Monte Carlo tree search (MCTS) agent that explores optimal math
expression trees consisting of pre-defined math operators and variables, (2) a
Double Q-learning block that helps reduce the feasible search space of MCTS via
properly understanding the distribution of reward, and (3) a modulated sub-tree
discovery block that heuristically learns and defines new math operators to
improve the representation ability of math expression trees. Binding these
modules together yields the state-of-the-art performance of RSRM in symbolic regression
as demonstrated by multiple sets of benchmark examples. The RSRM model shows
clear superiority over several representative baseline models.
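The abstract describes a search over math expression trees scored by how well they fit the data. The following minimal Python sketch is not the authors' code: a plain best-of-N random search stands in for the MCTS agent (the Double Q-learning and sub-tree discovery modules are omitted), and the reward r = 1 / (1 + RMSE) is an assumed convention common in symbolic regression, not taken from the paper.

```python
import math
import random

# Unary and binary operators available to the expression trees
# (a small illustrative subset of the "pre-defined math operators").
UNARY = {"sin": math.sin, "cos": math.cos}
BINARY = {"+": lambda a, b: a + b,
          "-": lambda a, b: a - b,
          "*": lambda a, b: a * b}

def random_tree(depth=3):
    """Sample a random expression tree over {x, constants, operators}."""
    if depth == 0 or random.random() < 0.3:
        return "x" if random.random() < 0.7 else round(random.uniform(-2, 2), 2)
    if random.random() < 0.3:
        return (random.choice(list(UNARY)), random_tree(depth - 1))
    op = random.choice(list(BINARY))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    """Recursively evaluate an expression tree at a scalar input x."""
    if tree == "x":
        return x
    if isinstance(tree, (int, float)):
        return tree
    if len(tree) == 2:  # unary node
        return UNARY[tree[0]](evaluate(tree[1], x))
    return BINARY[tree[0]](evaluate(tree[1], x), evaluate(tree[2], x))

def reward(tree, xs, ys):
    """Data-fit reward: 1 / (1 + RMSE); 0 if evaluation fails."""
    try:
        preds = [evaluate(tree, x) for x in xs]
        rmse = math.sqrt(sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys))
    except (OverflowError, ValueError):
        return 0.0
    return 1.0 / (1.0 + rmse)

def search(xs, ys, n_iter=2000, seed=0):
    """Best-of-N random search, a stand-in for the MCTS exploration loop."""
    random.seed(seed)
    best, best_r = None, -1.0
    for _ in range(n_iter):
        t = random_tree()
        r = reward(t, xs, ys)
        if r > best_r:
            best, best_r = t, r
    return best, best_r

if __name__ == "__main__":
    xs = [i / 10 for i in range(-20, 21)]
    ys = [x * x + x for x in xs]  # target expression: x^2 + x
    tree, r = search(xs, ys)
    print("best tree:", tree, "reward:", round(r, 3))
```

In RSRM itself, the uniform sampling here is replaced by MCTS guided by Double Q-learning, and frequently useful sub-trees are distilled into new operators that enter the operator pool.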
Related papers
- Discovering symbolic expressions with parallelized tree search [59.92040079807524]
Symbolic regression plays a crucial role in scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data.
Existing algorithms have faced a critical bottleneck in accuracy and efficiency for over a decade when handling problems of high complexity.
We introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data.
arXiv Detail & Related papers (2024-07-05T10:41:15Z) - MMSR: Symbolic Regression is a Multimodal Task [12.660401635672967]
Symbolic regression was originally formulated as an optimization problem, and genetic programming (GP) and reinforcement learning algorithms were used to solve it.
To solve this problem, researchers treat the mapping from data to expressions as a translation problem.
In this paper, we propose MMSR, which achieves the most advanced results on multiple mainstream datasets.
arXiv Detail & Related papers (2024-02-28T08:29:42Z) - Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data.
Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables.
We propose an instantiation of our framework, Deep Generative Symbolic Regression.
arXiv Detail & Related papers (2023-12-30T17:05:31Z) - A Transformer Model for Symbolic Regression towards Scientific Discovery [11.827358526480323]
Symbolic Regression (SR) searches for mathematical expressions which best describe numerical datasets.
We propose a new Transformer model aiming at Symbolic Regression particularly focused on its application for Scientific Discovery.
We apply our best model to the SRSD datasets which yields state-of-the-art results using the normalized tree-based edit distance.
arXiv Detail & Related papers (2023-12-07T06:27:48Z) - Continual Referring Expression Comprehension via Dual Modular
Memorization [133.46886428655426]
Referring Expression Comprehension (REC) aims to localize the image region of a given object described by a natural-language expression.
Existing REC algorithms make a strong assumption that training data feeding into a model are given upfront, which degrades its practicality for real-world scenarios.
In this paper, we propose Continual Referring Expression (CREC), a new setting for REC, where a model is learning on a stream of incoming tasks.
In order to continuously improve the model on sequential tasks without forgetting prior learned knowledge and without repeatedly re-training from scratch, we propose an effective baseline method named Dual Modular Memorization.
arXiv Detail & Related papers (2023-11-25T02:58:51Z) - TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative
Language Models [68.65075559137608]
We propose TRIGO, an ATP benchmark that not only requires a model to reduce a trigonometric expression with step-by-step proofs but also evaluates a generative LM's reasoning ability on formulas.
We gather trigonometric expressions and their reduced forms from the web, annotate the simplification process manually, and translate it into the Lean formal language system.
We develop an automatic generator based on Lean-Gym to create dataset splits of varying difficulties and distributions in order to thoroughly analyze the model's generalization ability.
arXiv Detail & Related papers (2023-10-16T08:42:39Z) - Discovering Interpretable Physical Models using Symbolic Regression and
Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z) - Scalable Neural Symbolic Regression using Control Variables [7.725394912527969]
We propose ScaleSR, a scalable symbolic regression model that leverages control variables to enhance both accuracy and scalability.
The proposed method involves a four-step process. First, a data generator is learned from observed data using deep neural networks (DNNs).
Experimental results demonstrate that the proposed ScaleSR significantly outperforms state-of-the-art baselines in discovering mathematical expressions with multiple variables.
arXiv Detail & Related papers (2023-06-07T18:30:25Z) - Incorporating Background Knowledge in Symbolic Regression using a
Computer Algebra System [0.0]
Symbolic Regression (SR) can generate interpretable, concise expressions that fit a given dataset.
We specifically examine the addition of constraints to traditional genetic algorithm (GA) based SR (PySR) as well as a Markov-chain Monte Carlo (MCMC) based Bayesian SR architecture.
arXiv Detail & Related papers (2023-01-27T18:59:25Z) - Recognizing and Verifying Mathematical Equations using Multiplicative
Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order, memory-augmented extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.