Revisiting Chebyshev Polynomial and Anisotropic RBF Models for Tabular Regression
- URL: http://arxiv.org/abs/2602.22422v1
- Date: Wed, 25 Feb 2026 21:34:52 GMT
- Title: Revisiting Chebyshev Polynomial and Anisotropic RBF Models for Tabular Regression
- Authors: Luciano Gerber, Huw Lloyd
- Abstract summary: Smooth-basis models such as Chebyshev regressors and radial basis function (RBF) networks are well established in numerical analysis. We ask whether they can compete, benchmarking models across 55 regression datasets organised by application domain.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Smooth-basis models such as Chebyshev polynomial regressors and radial basis function (RBF) networks are well established in numerical analysis. Their continuously differentiable prediction surfaces suit surrogate optimisation, sensitivity analysis, and other settings where the response varies gradually with inputs. Despite these properties, smooth models seldom appear in tabular regression, where tree ensembles dominate. We ask whether they can compete, benchmarking models across 55 regression datasets organised by application domain. We develop an anisotropic RBF network with data-driven centre placement and gradient-based width optimisation, a ridge-regularised Chebyshev polynomial regressor, and a smooth-tree hybrid (Chebyshev model tree); all three are released as scikit-learn-compatible packages. We benchmark these against tree ensembles, a pre-trained transformer, and standard baselines, evaluating accuracy alongside generalisation behaviour. The transformer ranks first on accuracy across a majority of datasets, but its GPU dependence, inference latency, and dataset-size limits constrain deployment in the CPU-based settings common across applied science and industry. Among CPU-viable models, smooth models and tree ensembles are statistically tied on accuracy, but the former tend to exhibit tighter generalisation gaps. We recommend routinely including smooth-basis models in the candidate pool, particularly when downstream use benefits from tighter generalisation and gradually varying predictions.
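The released packages are not reproduced here, but a minimal sketch of the ridge-regularised Chebyshev idea, assuming a simple additive per-feature basis (the class name, basis construction, and API below are illustrative assumptions, not the authors' package), could look like:

```python
import numpy as np
from numpy.polynomial import chebyshev as C
from sklearn.linear_model import Ridge

class ChebyshevRidge:
    """Illustrative ridge-regularised Chebyshev regressor with an
    additive per-feature basis (not the paper's released package)."""

    def __init__(self, degree=5, alpha=1.0):
        self.degree, self.alpha = degree, alpha

    def _features(self, X):
        # Map each feature to [-1, 1] (the Chebyshev domain), expand it
        # in T_0 ... T_degree, and concatenate across features.
        span = np.maximum(self.hi_ - self.lo_, 1e-12)
        Z = np.clip(2 * (X - self.lo_) / span - 1, -1, 1)
        return np.hstack([C.chebvander(Z[:, j], self.degree)
                          for j in range(Z.shape[1])])

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        self.lo_, self.hi_ = X.min(axis=0), X.max(axis=0)
        self.model_ = Ridge(alpha=self.alpha).fit(self._features(X), y)
        return self

    def predict(self, X):
        return self.model_.predict(self._features(np.asarray(X, dtype=float)))
```

An anisotropic RBF analogue would replace the Chebyshev features with Gaussian basis functions whose width is optimised separately per input dimension, as the abstract describes.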
Related papers
- Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing [58.52119063742121]
Retraining a model on its own predictions together with the original, potentially noisy labels is a well-known strategy for improving model performance.
This paper addresses the question of how to optimally combine the model's predictions and the provided labels.
Our main contribution is the derivation of the Bayes optimal aggregator function for combining the current model's predictions and the given labels.
arXiv Detail & Related papers (2025-05-21T07:16:44Z)
- Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data [39.40116554523575]
We present Drift-Resilient TabPFN, a fresh approach based on In-Context Learning with a Prior-Data Fitted Network.
It learns to approximate Bayesian inference on synthetic datasets drawn from a prior.
It improves accuracy from 0.688 to 0.744 and ROC AUC from 0.786 to 0.832 while also achieving stronger calibration.
arXiv Detail & Related papers (2024-11-15T23:49:23Z)
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
We present a unifying perspective on recent results on ridge regression.
We use the basic tools of random matrix theory and free probability, aimed at readers with backgrounds in physics and deep learning.
Our results extend and provide a unifying perspective on earlier models of scaling laws.
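For orientation, the object whose high-dimensional behaviour such analyses characterise is the standard ridge estimator,

$$\hat{\beta}_\lambda = (X^\top X + \lambda I)^{-1} X^\top y,$$

with the scaling laws describing how its test risk varies with sample size, dimension, and the regularisation strength $\lambda$.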
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or data drawn from a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the tessellation induced by its predictors and approximate the multiple hypotheses target distribution.
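The structured construction itself is not reproduced here; as one simple reading, a winner-takes-all ensemble of linear heads over a shared Gaussian RBF feature map (the function names and fitting loop below are illustrative assumptions, not the paper's algorithm) might be sketched as:

```python
import numpy as np

def rbf_features(X, centres, gamma=1.0):
    # Gaussian RBF feature map: phi[i, j] = exp(-gamma * ||x_i - c_j||^2)
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def fit_hypotheses(X, y, centres, K=3, iters=20, gamma=1.0, seed=0):
    # Alternate between assigning each point to its best hypothesis and
    # refitting that hypothesis on the points it wins (hard-EM style).
    rng = np.random.default_rng(seed)
    Phi = rbf_features(X, centres, gamma)
    W = rng.normal(size=(K, Phi.shape[1]))
    for _ in range(iters):
        winner = np.abs(Phi @ W.T - y[:, None]).argmin(axis=1)
        for k in range(K):
            mask = winner == k
            if mask.any():
                W[k] = np.linalg.lstsq(Phi[mask], y[mask], rcond=None)[0]
    return W  # K weight vectors, one prediction hypothesis each
```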
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
- Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost [53.746169882193456]
Recent works have proposed various sparse attention modules to overcome the quadratic cost of self-attention.
We propose a model that addresses both cost and data-adaptive sparsity by endowing each attention head with a mixed-membership Stochastic Block Model.
Our model outperforms previous efficient variants as well as the original Transformer with full attention.
arXiv Detail & Related papers (2022-10-27T15:30:52Z)
- Regression Transformer: Concurrent Conditional Generation and Regression by Blending Numerical and Textual Tokens [3.421506449201873]
The Regression Transformer (RT) casts continuous properties as sequences of numerical tokens and encodes them jointly with conventional tokens.
We propose several extensions to the XLNet objective and adopt an alternating training scheme to concurrently optimize property prediction and conditional text generation.
This finds application particularly in property-driven, local exploration of the chemical or protein space.
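As a rough illustration of numerical tokenization (the Regression Transformer's actual vocabulary may differ; this digit-and-decimal-place scheme is an assumption):

```python
def numeric_tokens(x: float, decimals: int = 2) -> list[str]:
    """Digit-and-decimal-place tokens for a non-negative float, e.g.
    12.3 -> ['1_1', '2_0', '3_-1', '0_-2'] (hypothetical scheme)."""
    int_part, frac_part = f"{x:.{decimals}f}".split(".")
    tokens = [f"{d}_{len(int_part) - 1 - i}" for i, d in enumerate(int_part)]
    tokens += [f"{d}_{-(i + 1)}" for i, d in enumerate(frac_part)]
    return tokens

print(numeric_tokens(12.3))  # ['1_1', '2_0', '3_-1', '0_-2']
```

Encoding the decimal place alongside each digit lets the model recover a number's magnitude from its tokens, which plain character-level tokenization would obscure.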
arXiv Detail & Related papers (2022-02-01T08:57:31Z)
- Evaluation of Tree Based Regression over Multiple Linear Regression for Non-normally Distributed Data in Battery Performance [0.5735035463793008]
This study explores the impact of data normality in building machine learning models.
Tree-based regression models and multiple linear regression models are each built from a highly skewed non-normal dataset.
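A minimal way to reproduce the flavour of this comparison, on synthetic right-skewed data rather than the paper's battery dataset:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(2000, 5))
# Exponentiated signal plus lognormal noise gives a right-skewed target.
y = np.exp(2.0 * X[:, 0] + X[:, 1]) + rng.lognormal(0.0, 0.3, size=2000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for model in (LinearRegression(), RandomForestRegressor(random_state=0)):
    r2 = r2_score(y_te, model.fit(X_tr, y_tr).predict(X_te))
    print(f"{type(model).__name__}: R^2 = {r2:.3f}")
```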
arXiv Detail & Related papers (2021-11-03T20:28:24Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We study prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
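In PyTorch terms, the idea reduces to letting BatchNorm layers normalise each test batch with its own statistics; a minimal sketch (the helper name is ours, not the paper's code):

```python
import torch.nn as nn

def use_prediction_time_bn(model: nn.Module) -> nn.Module:
    """Normalise test batches with their own statistics (sketch only)."""
    model.eval()  # keep dropout etc. in inference mode
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            # In train mode, BatchNorm uses the current batch's
            # mean/variance (and also updates its running averages).
            m.train()
    return model
```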
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
- Deep transformation models: Tackling complex regression problems with neural network based transformation models [0.0]
We present a deep transformation model for probabilistic regression.
It estimates the whole conditional probability distribution, which is the most thorough way to capture uncertainty about the outcome.
Our method works for complex input data, which we demonstrate by employing a CNN architecture on image data.
arXiv Detail & Related papers (2020-04-01T14:23:12Z)
- A Numerical Transform of Random Forest Regressors corrects Systematically-Biased Predictions [0.0]
We find a systematic bias in predictions from random forest models.
This bias is recapitulated in simple synthetic datasets.
We use the training data to define a numerical transformation that fully corrects it.
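The summary does not give the exact transform; as a hedged sketch in the same spirit, one could learn a monotone correction from the forest's out-of-bag training predictions back to the targets (the synthetic data and variable names below are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; forests tend to shrink extreme predictions
# toward the mean, which a monotone recalibration can undo.
X, y = make_regression(n_samples=2000, n_features=10, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestRegressor(n_estimators=200, oob_score=True,
                           random_state=0).fit(X_tr, y_tr)
# Fit the correction on out-of-bag predictions to avoid reusing
# in-bag fits, then apply it to new predictions.
correct = IsotonicRegression(out_of_bounds="clip").fit(rf.oob_prediction_, y_tr)
y_corrected = correct.predict(rf.predict(X_te))
```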
arXiv Detail & Related papers (2020-03-16T21:18:06Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
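As a hedged illustration of the alignment step: hard one-to-one neuron matching is a special case of the optimal transport formulation (uniform marginals, permutation plan), whereas the paper's algorithm handles soft transport plans and full networks:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_dense_layers(Wa: np.ndarray, Wb: np.ndarray) -> np.ndarray:
    # Cost of matching neuron i of model A to neuron j of model B:
    # squared distance between their incoming-weight vectors.
    cost = ((Wa[:, None, :] - Wb[None, :, :]) ** 2).sum(axis=-1)
    _, perm = linear_sum_assignment(cost)
    # Permute B's neurons into A's order, then average the weights.
    # (In a full network, the same permutation must also be applied
    # to the next layer's input weights.)
    return 0.5 * (Wa + Wb[perm])
```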
arXiv Detail & Related papers (2019-10-12T22:07:15Z)