Transformer Semantic Genetic Programming for d-dimensional Symbolic Regression Problems
- URL: http://arxiv.org/abs/2511.09416v1
- Date: Thu, 13 Nov 2025 01:53:10 GMT
- Title: Transformer Semantic Genetic Programming for d-dimensional Symbolic Regression Problems
- Authors: Philipp Anthes, Dominik Sobania, Franz Rothlauf
- Abstract summary: Transformer Semantic Genetic Programming (TSGP) is a semantic search approach that uses a pre-trained transformer model as a variation operator. We find that a single transformer model trained on millions of programs is able to generalize across symbolic regression problems of varying dimension. Evaluated on 24 real-world and synthetic datasets, TSGP significantly outperforms standard GP, SLIM_GSGP, Deep Symbolic Regression, and Denoising Autoencoder GP.
- Score: 1.7205106391379026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer Semantic Genetic Programming (TSGP) is a semantic search approach that uses a pre-trained transformer model as a variation operator to generate offspring programs with controlled semantic similarity to a given parent. Unlike other semantic GP approaches that rely on fixed syntactic transformations, TSGP aims to learn diverse structural variations that lead to solutions with similar semantics. We find that a single transformer model trained on millions of programs is able to generalize across symbolic regression problems of varying dimension. Evaluated on 24 real-world and synthetic datasets, TSGP significantly outperforms standard GP, SLIM_GSGP, Deep Symbolic Regression, and Denoising Autoencoder GP, achieving an average rank of 1.58 across all benchmarks. Moreover, TSGP produces more compact solutions than SLIM_GSGP while achieving higher accuracy. In addition, the target semantic distance $\mathrm{SD}_t$ controls the step size in the semantic space: small values of $\mathrm{SD}_t$ enable consistent improvement in fitness but often lead to larger programs, while larger values promote faster convergence and compactness. Thus, $\mathrm{SD}_t$ provides an effective mechanism for balancing exploration and exploitation.
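The core mechanism in the abstract, a variation operator whose offspring land at a target semantic distance $\mathrm{SD}_t$ from the parent, can be illustrated with a minimal, hypothetical sketch. Here the paper's pre-trained transformer is replaced by Gaussian perturbations and a "program" is reduced to a single slope parameter; only the $\mathrm{SD}_t$-controlled selection logic follows the abstract's description, everything else is an illustrative assumption.

```python
import numpy as np

# Hypothetical stand-in for TSGP's variation operator: propose candidate
# offspring, then keep the one whose semantic distance to the parent is
# closest to the target SD_t. A "program" here is just the slope a of
# f(x) = a * x; the transformer proposal model is replaced by noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)  # training inputs

def semantics(a):
    # A program's semantics = its output vector on the training inputs.
    return a * x

def semantic_distance(a, b):
    # Euclidean distance between the two programs' output vectors.
    return float(np.linalg.norm(semantics(a) - semantics(b)))

def vary(parent, sd_target, n_candidates=32):
    """Sample candidates and return the one whose semantic distance
    to the parent is closest to the target SD_t."""
    candidates = parent + rng.normal(0.0, 1.0, size=n_candidates)
    return min(candidates,
               key=lambda c: abs(semantic_distance(parent, c) - sd_target))

child = vary(parent=1.0, sd_target=0.5)
```

A small `sd_target` keeps offspring semantically close to the parent (fine-grained exploitation), while a larger value takes bigger steps in semantic space, mirroring the exploration/exploitation trade-off the abstract attributes to $\mathrm{SD}_t$.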
Related papers
- Beyond Quantity: Trajectory Diversity Scaling for Code Agents [51.71414642763219]
Trajectory Diversity Scaling (TDScaling) is a data synthesis framework for code agents that scales performance through diversity rather than raw volume. TDScaling integrates four innovations, including: (1) a Business Cluster mechanism that captures real-service logical dependencies; (2) a blueprint-driven multi-agent paradigm that enforces trajectory coherence; (3) an adaptive evolution mechanism that steers toward long-tail scenarios.
arXiv Detail & Related papers (2026-02-03T07:43:03Z) - Variational Entropic Optimal Transport [67.76725267984578]
We propose Variational Entropic Optimal Transport (VarEOT) for domain translation problems. VarEOT is based on an exact variational reformulation of the log-partition $\log \mathbb{E}[\exp(\cdot)]$ as a tractable minimization over an auxiliary positive normalizer. Experiments on synthetic data and unpaired image-to-image translation demonstrate competitive or improved translation quality.
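The log-partition reformulation over a positive normalizer plausibly takes the following standard form (an assumption; VarEOT's exact formulation may differ). For $u > 0$,

\[
\log u \;=\; \min_{c > 0}\Big( \frac{u}{c} + \log c - 1 \Big),
\]

with minimizer $c^{*} = u$ (set the derivative $-u/c^{2} + 1/c$ to zero and substitute back). Taking $u = \mathbb{E}[\exp(g(X))]$ rewrites the log-partition as a minimization over the auxiliary normalizer $c$, which can then be optimized jointly with the other model components.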
arXiv Detail & Related papers (2026-02-02T15:48:44Z) - Transformer Semantic Genetic Programming for Symbolic Regression [0.8602553195689513]
Transformer Semantic Genetic Programming (TSGP) is a novel and flexible semantic approach that uses a generative transformer model as a search operator. The model is trained on synthetic test problems and learns semantic similarities between solutions. An analysis of the search dynamics reveals that the solutions generated by TSGP are semantically more similar than the solutions generated by the benchmark approaches.
arXiv Detail & Related papers (2025-01-30T16:51:44Z) - Repeat After Me: Transformers are Better than State Space Models at Copying [53.47717661441142]
We show that while generalized state space models are promising in terms of inference-time efficiency, they are limited compared to transformer models on tasks that require copying from the input context.
arXiv Detail & Related papers (2024-02-01T21:44:11Z) - SymbolNet: Neural Symbolic Regression with Adaptive Dynamic Pruning for Compression [1.0356366043809717]
We propose SymbolNet, a neural network approach to symbolic regression specifically designed as a model compression technique. This framework allows dynamic pruning of model weights, input features, and mathematical operators in a single training process.
arXiv Detail & Related papers (2024-01-18T12:51:38Z) - Deep Transformed Gaussian Processes [0.0]
Transformed Gaussian Processes (TGPs) are processes specified by transforming samples from the joint distribution from a prior process (typically a GP) using an invertible transformation.
We propose a generalization of TGPs named Deep Transformed Gaussian Processes (DTGPs), which follows the trend of concatenating layers of processes.
Experiments conducted evaluate the proposed DTGPs in multiple regression datasets, achieving good scalability and performance.
arXiv Detail & Related papers (2023-10-27T16:09:39Z) - Transformers as Support Vector Machines [54.642793677472724]
We establish a formal equivalence between the optimization geometry of self-attention and a hard-margin SVM problem.
We characterize the implicit bias of 1-layer transformers optimized with gradient descent.
We believe these findings inspire the interpretation of transformers as a hierarchy of SVMs that separates and selects optimal tokens.
arXiv Detail & Related papers (2023-08-31T17:57:50Z) - Differentiable Genetic Programming for High-dimensional Symbolic Regression [13.230237932229052]
Symbolic regression (SR) is considered an effective way to achieve interpretable machine learning (ML).
Genetic programming (GP) has been the dominant approach to solving SR problems.
We propose a differentiable approach named DGP to construct GP trees towards high-dimensional SR.
arXiv Detail & Related papers (2023-04-18T11:39:45Z) - NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizer [45.47667026025716]
We propose a novel, robust and accelerated iteration that relies on two key elements.
The convergence and stability of the obtained method, referred to as NAG-GS, are first studied extensively.
We show that NAG-GS is competitive with state-of-the-art methods such as momentum SGD with weight decay and AdamW for the training of machine learning models.
arXiv Detail & Related papers (2022-09-29T16:54:53Z) - Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary.
With each GP expert leveraging a random feature-based approximation to perform scalable online prediction and model updates, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions.
The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
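The random feature-based approximation each GP expert relies on can be sketched as follows; this is a generic illustration, not IE-GP's actual implementation (the ensemble weighting and dynamics modeling are omitted, and all hyperparameters are arbitrary choices). An RBF kernel is approximated with random Fourier features, so online GP prediction reduces to streaming Bayesian linear regression in feature space.

```python
import numpy as np

# Generic random Fourier feature (RFF) approximation of an RBF-kernel GP,
# updated one observation at a time. Hyperparameters are illustrative.
rng = np.random.default_rng(0)
D, lengthscale, noise = 200, 1.0, 0.1

W = rng.normal(0.0, 1.0 / lengthscale, size=(D, 1))  # spectral frequencies
b = rng.uniform(0.0, 2.0 * np.pi, size=D)            # random phases

def phi(X):
    # (n, 1) inputs -> (n, D) random Fourier feature map
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

X = np.linspace(0.0, 2.0 * np.pi, 100).reshape(-1, 1)
y = np.sin(X).ravel()

P = np.eye(D)      # posterior precision of the feature weights
m = np.zeros(D)    # accumulated Phi^T y / noise^2
for x_t, y_t in zip(X, y):           # one observation at a time (online)
    f = phi(x_t.reshape(1, -1)).ravel()
    P += np.outer(f, f) / noise**2
    m += f * y_t / noise**2

w_mean = np.linalg.solve(P, m)       # posterior mean weights
pred = phi(X) @ w_mean               # approximate GP posterior mean
```

Because each update only touches the D-dimensional feature statistics, the per-step cost is independent of the number of past observations, which is what makes the per-expert online updates scalable.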
arXiv Detail & Related papers (2021-10-13T15:11:25Z) - Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers [50.85524803885483]
This work proposes a formal definition of statistically meaningful (SM) approximation which requires the approximating network to exhibit good statistical learnability.
We study SM approximation for two function classes: circuits and Turing machines.
arXiv Detail & Related papers (2021-07-28T04:28:55Z) - SGP-DT: Semantic Genetic Programming Based on Dynamic Targets [6.841231589814175]
This paper presents a new Semantic GP approach based on Dynamic Target (SGP-DT).
The evolution in each run is guided by a new (dynamic) target based on the residual errors.
SGP-DT achieves small RMSE values, on average 23.19% smaller than that of epsilon-lexicase.
arXiv Detail & Related papers (2020-01-30T19:33:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.