Related papers: GSR: A Generalized Symbolic Regression Approach

GSR: A Generalized Symbolic Regression Approach

URL: http://arxiv.org/abs/2205.15569v1
Date: Tue, 31 May 2022 07:20:17 GMT
Title: GSR: A Generalized Symbolic Regression Approach
Authors: Tony Tohme, Dehong Liu, Kamal Youcef-Toumi
Abstract summary: Generalized Symbolic Regression presented in this paper. We show that our GSR method outperforms several state-of-the-art methods on the well-known Symbolic Regression benchmark problem sets. We highlight the strengths of GSR by introducing SymSet, a new SR benchmark set which is more challenging relative to the existing benchmarks.
Score: 13.606672419862047
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Identifying the mathematical relationships that best describe a dataset remains a very challenging problem in machine learning, and is known as Symbolic Regression (SR). In contrast to neural networks which are often treated as black boxes, SR attempts to gain insight into the underlying relationships between the independent variables and the target variable of a given dataset by assembling analytical functions. In this paper, we present GSR, a Generalized Symbolic Regression approach, by modifying the conventional SR optimization problem formulation, while keeping the main SR objective intact. In GSR, we infer mathematical relationships between the independent variables and some transformation of the target variable. We constrain our search space to a weighted sum of basis functions, and propose a genetic programming approach with a matrix-based encoding scheme. We show that our GSR method outperforms several state-of-the-art methods on the well-known SR benchmark problem sets. Finally, we highlight the strengths of GSR by introducing SymSet, a new SR benchmark set which is more challenging relative to the existing benchmarks.

Related papers

A Simplified Analysis of SGD for Linear Regression with Weight Averaging [64.2393952273612]
Recent work bycitetzou 2021benign provides sharp rates for SGD optimization in linear regression using constant learning rate.<n>We provide a simplified analysis recovering the same bias and variance bounds provided incitepzou 2021benign based on simple linear algebra tools.<n>We believe our work makes the analysis of gradient descent on linear regression very accessible and will be helpful in further analyzing mini-batching and learning rate scheduling.
arXiv Detail & Related papers (2025-06-18T15:10:38Z)
Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders [50.52694757593443]
Existing SAE training algorithms often lack rigorous mathematical guarantees and suffer from practical limitations.<n>We first propose a novel statistical framework for the feature recovery problem, which includes a new notion of feature identifiability.<n>We introduce a new SAE training algorithm based on bias adaptation'', a technique that adaptively adjusts neural network bias parameters to ensure appropriate activation sparsity.
arXiv Detail & Related papers (2025-06-16T20:58:05Z)
NDCG-Consistent Softmax Approximation with Accelerated Convergence [67.10365329542365]
We propose novel loss formulations that align directly with ranking metrics.<n>We integrate the proposed RG losses with the highly efficient Alternating Least Squares (ALS) optimization method.<n> Empirical evaluations on real-world datasets demonstrate that our approach achieves comparable or superior ranking performance.
arXiv Detail & Related papers (2025-06-11T06:59:17Z)
Interpretable Robotic Friction Learning via Symbolic Regression [52.41267112707149]
Accurately modeling the friction torque in robotic joints has long been challenging due to the request for a robust mathematical description.<n>Traditional model-based approaches are often labor-intensive, requiring extensive experiments and expert knowledge.<n>Data-driven methods based on neural networks are easier to implement but often lack robustness.
arXiv Detail & Related papers (2025-05-19T14:44:02Z)
Call for Action: towards the next generation of symbolic regression benchmark [2.7253033812941387]
Symbolic Regression is a powerful technique for discovering interpretable mathematical expressions.<n> benchmarking SR methods remains challenging due to the diversity of algorithms, datasets, and evaluation criteria.
arXiv Detail & Related papers (2025-05-06T21:02:20Z)
Chain-of-Retrieval Augmented Generation [72.06205327186069]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer. Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
arXiv Detail & Related papers (2025-01-24T09:12:52Z)
Ab initio nonparametric variable selection for scalable Symbolic Regression with large $p$ [2.222138965069487]
Symbolic regression (SR) is a powerful technique for discovering symbolic expressions that characterize nonlinear relationships in data. Existing SR methods do not scale to datasets with a large number of input variables, which are common in modern scientific applications. We propose PAN+SR, which combines ab initio nonparametric variable selection with SR to efficiently pre-screen large input spaces.
arXiv Detail & Related papers (2024-10-17T15:41:06Z)
Interpetable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance. We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features. In both phases, a key aspect is to preserve the interpretability of the reduced targets and features through the aggregation with the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z)
ISR: Invertible Symbolic Regression [7.499800486499609]
Invertible Symbolic Regression is a machine learning technique that generates analytical relationships between inputs and outputs of a given dataset via invertible maps. We transform the affine coupling blocks of INNs into a symbolic framework, resulting in an end-to-end differentiable symbolic invertible architecture. We show that ISR can serve as a (symbolic) normalizing flow for density estimation tasks.
arXiv Detail & Related papers (2024-05-10T23:20:46Z)
Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution [49.902047563260496]
We develop the first attempt to integrate the Vision State Space Model (Mamba) for remote sensing image (RSI) super-resolution. To achieve better SR reconstruction, building upon Mamba, we devise a Frequency-assisted Mamba framework, dubbed FMSR. Our FMSR features a multi-level fusion architecture equipped with the Frequency Selection Module (FSM), Vision State Space Module (VSSM), and Hybrid Gate Module (HGM)
arXiv Detail & Related papers (2024-05-08T11:09:24Z)
Weakly supervised covariance matrices alignment through Stiefel matrices estimation for MEG applications [64.20396555814513]
This paper introduces a novel domain adaptation technique for time series data, called Mixing model Stiefel Adaptation (MSA) We exploit abundant unlabeled data in the target domain to ensure effective prediction by establishing pairwise correspondence with equivalent signal variances between domains. MSA outperforms recent methods in brain-age regression with task variations using magnetoencephalography (MEG) signals from the Cam-CAN dataset.
arXiv Detail & Related papers (2024-01-24T19:04:49Z)
Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data. Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables. We propose an instantiation of our framework, Deep Generative Symbolic Regression.
arXiv Detail & Related papers (2023-12-30T17:05:31Z)
GFN-SR: Symbolic Regression with Generative Flow Networks [0.9208007322096533]
In recent years, deep symbolic regression (DSR) has emerged as a popular method in the field. We propose an alternative framework (GFN-SR) to approach SR with deep learning. GFN-SR is capable of generating a diverse set of best-fitting expressions.
arXiv Detail & Related papers (2023-12-01T07:38:05Z)
ParFam -- (Neural Guided) Symbolic Regression Based on Continuous Global Optimization [14.146976111782466]
We present our new approach ParFam to translate the discrete symbolic regression problem into a continuous one. In combination with a global, this approach results in a highly effective method to tackle the problem of SR. We also present an extension incorporating a pre-trained transformer network DL-ParFam to guide ParFam.
arXiv Detail & Related papers (2023-10-09T09:01:25Z)
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator. This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z)
Transformer-based Planning for Symbolic Regression [18.90700817248397]
We propose TPSR, a Transformer-based Planning strategy for Symbolic Regression. Unlike conventional decoding strategies, TPSR enables the integration of non-differentiable feedback, such as fitting accuracy and complexity. Our approach outperforms state-of-the-art methods, enhancing the model's fitting-complexity trade-off, Symbolic abilities, and robustness to noise.
arXiv Detail & Related papers (2023-03-13T03:29:58Z)
Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks. We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems. We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.