Bayesian Symbolic Regression via Posterior Sampling
- URL: http://arxiv.org/abs/2512.10849v1
- Date: Thu, 11 Dec 2025 17:38:20 GMT
- Title: Bayesian Symbolic Regression via Posterior Sampling
- Authors: Geoffrey F. Bomarito, Patrick E. Leser
- Abstract summary: Symbolic regression is a powerful tool for discovering governing equations directly from data, but its sensitivity to noise hinders its broader application. This paper introduces a Sequential Monte Carlo framework for Bayesian symbolic regression that approximates the posterior distribution over symbolic expressions.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Symbolic regression is a powerful tool for discovering governing equations directly from data, but its sensitivity to noise hinders its broader application. This paper introduces a Sequential Monte Carlo (SMC) framework for Bayesian symbolic regression that approximates the posterior distribution over symbolic expressions, enhancing robustness and enabling uncertainty quantification for symbolic regression in the presence of noise. Differing from traditional genetic programming approaches, the SMC-based algorithm combines probabilistic selection, adaptive tempering, and the use of normalized marginal likelihood to efficiently explore the search space of symbolic expressions, yielding parsimonious expressions with improved generalization. When compared to standard genetic programming baselines, the proposed method better deals with challenging, noisy benchmark datasets. The reduced tendency to overfit and enhanced ability to discover accurate and interpretable equations paves the way for more robust symbolic regression in scientific discovery and engineering design applications.
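The abstract mentions adaptive tempering but gives no details of the schedule. As a minimal sketch only, the block below shows one common way SMC samplers adapt the temperature: pick the next exponent by bisection so that the effective sample size (ESS) of the incremental weights stays at or above a target fraction of the particle count. The ESS criterion, the function names `ess` and `next_temperature`, and the `target_frac` parameter are assumptions made for illustration, not the paper's confirmed method; `log_lik` stands in for each candidate expression's log-likelihood.

```python
import math


def ess(log_w):
    """Effective sample size of a list of unnormalized log-weights."""
    m = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(l - m) for l in log_w]
    s = sum(w)
    return s * s / sum(x * x for x in w)


def next_temperature(log_lik, phi, target_frac=0.5, tol=1e-6):
    """Return a new exponent in [phi, 1] whose incremental weights
    keep ESS >= target_frac * len(log_lik) (hypothetical helper)."""
    target = target_frac * len(log_lik)
    # If jumping straight to the full posterior is safe, do so.
    if ess([(1.0 - phi) * l for l in log_lik]) >= target:
        return 1.0
    lo, hi = phi, 1.0  # invariant: the ESS at `lo` meets the target
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ess([(mid - phi) * l for l in log_lik]) >= target:
            lo = mid
        else:
            hi = mid
    return lo


# Toy log-likelihoods for six particles (made-up numbers).
log_lik = [-1.0, -2.0, -50.0, -3.0, -100.0, -0.5]
phi_new = next_temperature(log_lik, phi=0.0)
print(phi_new, ess([phi_new * l for l in log_lik]))
```

In a full sampler this step would be followed by reweighting, resampling, and a move step on the expressions; those parts are omitted here.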
Related papers
- VaSST: Variational Inference for Symbolic Regression using Soft Symbolic Trees [2.6521352889229446]
We introduce VaSST, a scalable probabilistic framework for symbolic regression based on variational inference. VaSST achieves superior performance in both structural recovery and predictive accuracy compared to state-of-the-art symbolic regression methods.
arXiv Detail & Related papers (2026-02-27T00:07:31Z) - SIGMA: Scalable Spectral Insights for LLM Collapse [51.863164847253366]
We introduce SIGMA (Spectral Inequalities for Gram Matrix Analysis), a unified framework for model collapse. By utilizing benchmarks that deriving and deterministic bounds on the matrix's spectrum, SIGMA provides a mathematically grounded metric to track the contraction of the representation space. We demonstrate that SIGMA effectively captures the transition towards states, offering both theoretical insights into the mechanics of collapse.
arXiv Detail & Related papers (2026-01-06T19:47:11Z) - Hierarchical Bayesian Operator-induced Symbolic Regression Trees for Structural Learning of Scientific Expressions [3.8545239266455185]
We develop a hierarchical Bayesian framework for symbolic regression that represents scientific laws as ensembles of tree-structured symbolic expressions with a regularized tree prior. We establish near-minimax rate of Bayesian posterior concentration, providing the first rigorous guarantee in context of symbolic regression. Empirical evaluation demonstrates robust performance of our proposed methodology against state-of-the-art competing modules.
arXiv Detail & Related papers (2025-09-24T02:42:25Z) - Discovering Mathematical Equations with Diffusion Language Model [6.384075523245284]
We introduce DiffuSR, a pre-training framework for symbolic regression built upon a continuous-state diffusion language model. DiffuSR employs a trainable embedding layer within the diffusion process to map discrete mathematical symbols into a continuous latent space. We also design an effective inference strategy to enhance the accuracy of the diffusion-based equation generator.
arXiv Detail & Related papers (2025-09-16T14:53:44Z) - Interactive Symbolic Regression through Offline Reinforcement Learning: A Co-Design Framework [11.804368618793273]
Symbolic Regression holds great potential for uncovering underlying mathematical and physical relationships from observed data. Current state-of-the-art approaches typically do not consider the integration of domain experts' prior knowledge. We propose the Symbolic Q-network (Sym-Q), an advanced interactive framework for large-scale symbolic regression.
arXiv Detail & Related papers (2024-02-07T22:53:54Z) - Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data.
Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables.
We propose an instantiation of our framework, Deep Generative Symbolic Regression.
arXiv Detail & Related papers (2023-12-30T17:05:31Z) - Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
arXiv Detail & Related papers (2023-10-31T16:15:13Z) - Regularized Vector Quantization for Tokenized Image Synthesis [126.96880843754066]
Quantizing images into discrete representations has been a fundamental problem in unified generative modeling.
Deterministic quantization suffers from severe codebook collapse and misalignment with the inference stage, while stochastic quantization suffers from low codebook utilization and a perturbed reconstruction objective.
This paper presents a regularized vector quantization framework that mitigates the above issues effectively by applying regularization from two perspectives.
arXiv Detail & Related papers (2023-03-11T15:20:54Z) - Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem [65.25725367771075]
This study demonstrates, for the first time, that the synthesis-based approach can also perform well on this problem.
Specifically, we propose a novel speech separation/enhancement model based on the recognition of discrete symbols.
By utilizing the synthesis model with the input of discrete symbols, after the prediction of discrete symbol sequence, each target speech could be re-synthesized.
arXiv Detail & Related papers (2021-12-17T08:35:40Z) - SymbolicGPT: A Generative Transformer Model for Symbolic Regression [3.685455441300801]
We present SymbolicGPT, a novel transformer-based language model for symbolic regression.
We show that our model performs strongly compared to competing models with respect to the accuracy, running time, and data efficiency.
arXiv Detail & Related papers (2021-06-27T03:26:35Z) - Neural Symbolic Regression that Scales [58.45115548924735]
We introduce the first symbolic regression method that leverages large scale pre-training.
We procedurally generate an unbounded set of equations, and simultaneously pre-train a Transformer to predict the symbolic equation from a corresponding set of input-output-pairs.
arXiv Detail & Related papers (2021-06-11T14:35:22Z) - AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity [8.594811303203581]
We present an improved method for symbolic regression that seeks to fit data to formulas that are Pareto-optimal.
It improves on the previous state-of-the-art by typically being orders of magnitude more robust toward noise and bad data.
We develop a method for discovering generalized symmetries from gradient properties of a neural network fit.
arXiv Detail & Related papers (2020-06-18T18:01:19Z) - Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning [134.77207192945053]
Prior methods learn the neural-symbolic models using reinforcement learning approaches.
We introduce the grammar model as a symbolic prior to bridge neural perception and symbolic reasoning.
We propose a novel back-search algorithm which mimics the top-down human-like learning procedure to propagate the error.
arXiv Detail & Related papers (2020-06-11T17:42:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.