Alleviating Overfitting in Transformation-Interaction-Rational Symbolic Regression with Multi-Objective Optimization
- URL: http://arxiv.org/abs/2501.01905v1
- Date: Fri, 03 Jan 2025 17:21:05 GMT
- Title: Alleviating Overfitting in Transformation-Interaction-Rational Symbolic Regression with Multi-Objective Optimization
- Authors: Fabricio Olivetti de Franca
- Abstract summary: The performance of using Genetic Programming with the Transformation-Interaction-Rational representation was substantially better than with its predecessor.
We extend Transformation-Interaction-Rational to support multi-objective optimization, specifically the NSGA-II algorithm, and apply that to the same benchmark.
- Score: 0.0
- Abstract: Transformation-Interaction-Rational is a representation for symbolic regression that limits the search space of functions to the ratio of two nonlinear functions, each defined as a linear regression of transformed variables. The main objective of this representation is to bias the search towards simpler expressions while keeping the approximation power of standard approaches. The performance of Genetic Programming with this representation was substantially better than with its predecessor (Interaction-Transformation) and ranked close to the state of the art on a contemporary Symbolic Regression benchmark. On a closer look at these results, we observed that the performance could be further improved with an additional selective pressure for smaller expressions when the dataset contains just a few data points. Introducing a penalization term applied to the fitness measure improved the results on these smaller datasets. One problem with this approach is that it introduces two additional hyperparameters: i) a criterion for when the penalization should be activated and ii) the amount of penalization applied to the fitness function. In this paper, we extend Transformation-Interaction-Rational to support multi-objective optimization, specifically the NSGA-II algorithm, and apply it to the same benchmark. A detailed analysis of the results shows that the use of multi-objective optimization benefits the overall performance on a subset of the benchmarks while keeping the results similar to the single-objective approach on the remainder of the datasets. Specifically for the small datasets, we observe a small (and statistically insignificant) improvement in the results, suggesting that further strategies must be explored.
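The trade-off the abstract describes, accuracy against expression size, can be illustrated with a minimal Python sketch of Pareto dominance, the ordering that NSGA-II uses to rank candidates. This is not the author's implementation: the candidate expressions, error values, and sizes below are made up for illustration, and the sketch extracts only the first non-dominated front, omitting the further fronts and the crowding-distance step of the full algorithm.

```python
from typing import List, Tuple

# Hypothetical candidate: (expression label, error, expression size).
# Both objectives are minimized.
Candidate = Tuple[str, float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative values only; they do not come from the paper.
candidates: List[Candidate] = [
    ("x0", 0.90, 1),
    ("x0 / (1 + x1)", 0.40, 3),
    ("tanh(x0 * x1) / (1 + x1**2)", 0.38, 7),
    ("x0 + x1", 0.55, 3),                  # same size as the second, higher error
    ("exp(x0) / (1 + x0 * x1)", 0.40, 9),  # same error as the second, larger
]

front = pareto_front(candidates)
for label, error, size in front:
    print(f"{label}: error={error}, size={size}")
```

Unlike a penalization term, this ordering needs no weight between error and size: a larger expression survives only if it is also more accurate than every smaller one.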
Related papers
- Interpolation pour l'augmentation de donnees : Application à la gestion des adventices de la canne a sucre a la Reunion [10.945947159224302]
This study explores techniques for the augmentation of geo-referenced data.
The aim is to predict the presence of Commelina benghalensis L. in sugarcane plots in La Réunion.
arXiv Detail & Related papers (2025-01-10T11:02:13Z)
- Interpetable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance.
We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features.
In both phases, a key aspect is to preserve the interpretability of the reduced targets and features through the aggregation with the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z)
- Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery [0.0]
We focus on distributed estimation and support recovery for high-dimensional linear quantile regression.
We transform the original quantile regression into the least-squares optimization.
An efficient algorithm is developed, which enjoys high computation and communication efficiency.
arXiv Detail & Related papers (2024-05-13T08:32:22Z)
- Nonlinear Feature Aggregation: Two Algorithms driven by Theory [45.3190496371625]
Real-world machine learning applications are characterized by a huge number of features, leading to computational and memory issues.
We propose a dimensionality reduction algorithm (NonLinCFA) which aggregates non-linear transformations of features with a generic aggregation function.
We also test the algorithms on synthetic and real-world datasets, performing regression and classification tasks, showing competitive performances.
arXiv Detail & Related papers (2023-06-19T19:57:33Z)
- Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator.
This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Improving Out-of-Distribution Generalization of Neural Rerankers with Contextualized Late Interaction [52.63663547523033]
Late interaction, the simplest form of multi-vector, is also helpful to neural rerankers that only use the [] vector to compute the similarity score.
We show that the finding is consistent across different model sizes and first-stage retrievers of diverse natures.
arXiv Detail & Related papers (2023-02-13T18:42:17Z)
- Complexity Measures for Multi-objective Symbolic Regression [2.4087148947930634]
Multi-objective symbolic regression has the advantage that while the accuracy of the learned models is maximized, the complexity is automatically adapted.
We study which complexity measures are most appropriately used in symbolic regression when performing multi-objective optimization with NSGA-II.
arXiv Detail & Related papers (2021-09-01T08:22:41Z)
- Instance-Level Relative Saliency Ranking with Graph Reasoning [126.09138829920627]
We present a novel unified model to segment salient instances and infer relative saliency rank order.
A novel loss function is also proposed to effectively train the saliency ranking branch.
Experimental results demonstrate that our proposed model is more effective than previous methods.
arXiv Detail & Related papers (2021-07-08T13:10:42Z)
- InverseForm: A Loss Function for Structured Boundary-Aware Segmentation [80.39674800972182]
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network.
This plug-in loss term complements the cross-entropy loss in capturing boundary transformations.
We analyze the quantitative and qualitative effects of our loss function on three indoor and outdoor segmentation benchmarks.
arXiv Detail & Related papers (2021-04-06T18:52:45Z)
- Piecewise linear regression and classification [0.20305676256390928]
This paper proposes a method for solving multivariate regression and classification problems using piecewise linear predictors.
A Python implementation of the algorithm described in this paper is available at http://cse.lab.imtlucca.it/bemporad/parc.
arXiv Detail & Related papers (2021-03-10T17:07:57Z)