Multi-View Symbolic Regression
- URL: http://arxiv.org/abs/2402.04298v4
- Date: Fri, 15 Nov 2024 10:35:03 GMT
- Title: Multi-View Symbolic Regression
- Authors: Etienne Russeil, Fabrício Olivetti de França, Konstantin Malanchev, Bogdan Burlacu, Emille E. O. Ishida, Marion Leroux, Clément Michelin, Guillaume Moinard, Emmanuel Gangler
- Abstract summary: We present Multi-View Symbolic Regression (MvSR), which takes into account multiple datasets simultaneously.
MvSR fits the evaluated expression to each independent dataset and returns a parametric family of functions.
We demonstrate the effectiveness of MvSR using data generated from known expressions, as well as real-world data from astronomy, chemistry and economy.
- Abstract: Symbolic regression (SR) searches for analytical expressions representing the relationship between a set of explanatory and response variables. Current SR methods assume a single dataset extracted from a single experiment. Nevertheless, frequently, the researcher is confronted with multiple sets of results obtained from experiments conducted with different setups. Traditional SR methods may fail to find the underlying expression since the parameters of each experiment can be different. In this work we present Multi-View Symbolic Regression (MvSR), which takes into account multiple datasets simultaneously, mimicking experimental environments, and outputs a general parametric solution. This approach fits the evaluated expression to each independent dataset and returns a parametric family of functions f(x; theta) simultaneously capable of accurately fitting all datasets. We demonstrate the effectiveness of MvSR using data generated from known expressions, as well as real-world data from astronomy, chemistry and economy, for which an a priori analytical expression is not available. Results show that MvSR obtains the correct expression more frequently and is robust to hyperparameters change. In real-world data, it is able to grasp the group behavior, recovering known expressions from the literature as well as promising alternatives, thus enabling the use of SR to a large range of experimental scenarios.
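The core idea described in the abstract, fitting one shared functional form to several datasets while letting each dataset keep its own parameter vector, can be sketched as follows. This is a minimal illustration of the multi-view fitting step for a single fixed candidate expression, not the authors' implementation; the exponential candidate, the function names, and the scoring by summed squared error are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def candidate(x, a, b):
    """Illustrative candidate expression f(x; theta) with theta = (a, b)."""
    return a * np.exp(b * x)

def multi_view_score(datasets, p0=(1.0, 1.0)):
    """Fit the same candidate expression to each dataset independently,
    returning the per-dataset parameters and the total squared error
    that a search procedure could use to rank candidate expressions."""
    thetas, total_sse = [], 0.0
    for x, y in datasets:
        theta, _ = curve_fit(candidate, x, y, p0=p0, maxfev=10000)
        thetas.append(theta)
        total_sse += float(np.sum((candidate(x, *theta) - y) ** 2))
    return thetas, total_sse

# Two "experiments" drawn from the same parametric family with
# different parameters, mimicking different experimental setups.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
datasets = [
    (x, 2.0 * np.exp(0.5 * x) + 0.01 * rng.normal(size=x.size)),
    (x, 0.5 * np.exp(1.5 * x) + 0.01 * rng.normal(size=x.size)),
]
thetas, sse = multi_view_score(datasets)
```

A single-dataset fit to either experiment alone could favor a simpler expression that matches only that setup; scoring the candidate by its aggregate error across all views is what steers the search toward the shared parametric family f(x; theta).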
Related papers
- Ab initio nonparametric variable selection for scalable Symbolic Regression with large $p$ [2.222138965069487]
Symbolic regression (SR) is a powerful technique for discovering symbolic expressions that characterize nonlinear relationships in data.
Existing SR methods do not scale to datasets with a large number of input variables, which are common in modern scientific applications.
We propose PAN+SR, which combines ab initio nonparametric variable selection with SR to efficiently pre-screen large input spaces.
arXiv Detail & Related papers (2024-10-17T15:41:06Z) - Geometry-Aware Instrumental Variable Regression [56.16884466478886]
We propose a transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information.
We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings.
arXiv Detail & Related papers (2024-05-19T17:49:33Z) - Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - Soft Random Sampling: A Theoretical and Empirical Analysis [59.719035355483875]
Soft random sampling (SRS) is a simple yet effective approach for efficient deep neural networks when dealing with massive data.
It selects a subset uniformly at random with replacement from each dataset in each epoch.
It is shown to be a powerful and competitive strategy, with strong performance at real-world industrial scale.
arXiv Detail & Related papers (2023-11-21T17:03:21Z) - ParFam -- (Neural Guided) Symbolic Regression Based on Continuous Global Optimization [14.146976111782466]
We present our new approach ParFam to translate the discrete symbolic regression problem into a continuous one.
In combination with a global optimizer, this approach results in a highly effective method for tackling SR.
We also present an extension, DL-ParFam, which incorporates a pre-trained transformer network to guide ParFam.
arXiv Detail & Related papers (2023-10-09T09:01:25Z) - Scalable Neural Symbolic Regression using Control Variables [7.725394912527969]
We propose ScaleSR, a scalable symbolic regression model that leverages control variables to enhance both accuracy and scalability.
The proposed method involves a four-step process. First, we learn a data generator from observed data using deep neural networks (DNNs).
Experimental results demonstrate that the proposed ScaleSR significantly outperforms state-of-the-art baselines in discovering mathematical expressions with multiple variables.
arXiv Detail & Related papers (2023-06-07T18:30:25Z) - Symbolic Regression via Control Variable Genetic Programming [24.408477700506907]
We propose Control Variable Genetic Programming (CVGP) for symbolic regression over many independent variables.
CVGP expedites symbolic expression discovery via customized experiment design.
We show CVGP as an incremental building approach can yield an exponential reduction in the search space when learning a class of expressions.
arXiv Detail & Related papers (2023-05-25T04:11:14Z) - Rethinking Symbolic Regression Datasets and Benchmarks for Scientific Discovery [12.496525234064888]
This paper revisits datasets and evaluation criteria for Symbolic Regression (SR).
We recreate 120 datasets to discuss the performance of symbolic regression for scientific discovery (SRSD).
arXiv Detail & Related papers (2022-06-21T17:15:45Z) - Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how combining recent results on equivariant representation learning instantiated on structured spaces with a simple use of classical results from causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z) - A Hypergradient Approach to Robust Regression without Correspondence [85.49775273716503]
We consider a variant of regression problem, where the correspondence between input and output data is not available.
Most existing methods are only applicable when the sample size is small.
We propose a new computational framework -- ROBOT -- for the shuffled regression problem.
arXiv Detail & Related papers (2020-11-30T21:47:38Z) - Deep Representational Similarity Learning for analyzing neural signatures in task-based fMRI dataset [81.02949933048332]
This paper develops Deep Representational Similarity Learning (DRSL), a deep extension of Representational Similarity Analysis (RSA).
DRSL is appropriate for analyzing similarities between various cognitive tasks in fMRI datasets with a large number of subjects.
arXiv Detail & Related papers (2020-09-28T18:30:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.