Decoding-based Regression
- URL: http://arxiv.org/abs/2501.19383v2
- Date: Tue, 12 Aug 2025 03:41:13 GMT
- Title: Decoding-based Regression
- Authors: Xingyou Song, Dara Bahri,
- Abstract summary: We investigate the utility of causal sequence decoding models as numeric regression heads given any feature representation.<n>We find that, despite being trained in the usual way, decoder-based heads are as performant as standard pointwise heads when benchmarked over standard regression tasks.
- Score: 29.15816693410931
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language models have recently been shown capable of performing regression wherein numeric predictions are represented as decoded strings. In this work, we provide theoretical grounds for this capability and furthermore investigate the utility of causal sequence decoding models as numeric regression heads given any feature representation. We find that, despite being trained in the usual way - for next-token prediction via cross-entropy loss - decoder-based heads are as performant as standard pointwise heads when benchmarked over standard regression tasks, while being flexible enough to capture smooth numeric distributions, such as in the task of density estimation.
Related papers
- Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning [39.920697401868885]
We propose to unlock the potential of decoding-based regression via Reinforcement Learning (RL)<n>We formulate the generation process as a Markov Decision Process, utilizing sequence-level rewards to enforce global numerical coherence.<n>Our analysis further reveals that RL significantly enhances sampling efficiency and predictive precision, establishing decoding-based regression as a robust and accurate paradigm for general-purpose numerical prediction.
arXiv Detail & Related papers (2025-12-06T18:57:38Z) - Uncertainty Quantification for Regression using Proper Scoring Rules [76.24649098854219]
We introduce a unified UQ framework for regression based on proper scoring rules, such as CRPS, logarithmic, squared error, and quadratic scores.<n>We derive closed-form expressions for the uncertainty measures under practical parametric assumptions and show how to estimate them using ensembles of models.<n>Our broad evaluation on synthetic and real-world regression datasets provides guidance for selecting reliable UQ measures.
arXiv Detail & Related papers (2025-09-30T17:52:12Z) - Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing [58.52119063742121]
Retraining a model using its own predictions together with the original, potentially noisy labels is a well-known strategy for improving the model performance.<n>This paper addresses the question of how to optimally combine the model's predictions and the provided labels.<n>Our main contribution is the derivation of the Bayes optimal aggregator function to combine the current model's predictions and the given labels.
arXiv Detail & Related papers (2025-05-21T07:16:44Z) - Generalized Regression with Conditional GANs [2.4171019220503402]
We propose to learn a prediction function whose outputs, when paired with the corresponding inputs, are indistinguishable from feature-label pairs in the training dataset.
We show that this approach to regression makes fewer assumptions on the distribution of the data we are fitting to and, therefore, has better representation capabilities.
arXiv Detail & Related papers (2024-04-21T01:27:47Z) - A Novel Approach in Solving Stochastic Generalized Linear Regression via
Nonconvex Programming [1.6874375111244329]
This paper considers a generalized linear regression model as a problem with chance constraints.
The results of the proposed algorithm were over 1 to 2 percent better than the ordinary logistic regression model.
arXiv Detail & Related papers (2024-01-16T16:45:51Z) - Beta quantile regression for robust estimation of uncertainty in the
presence of outliers [1.6377726761463862]
Quantile Regression can be used to estimate aleatoric uncertainty in deep neural networks.
We propose a robust solution for quantile regression that incorporates concepts from robust divergence.
arXiv Detail & Related papers (2023-09-14T01:18:57Z) - Streaming Active Learning for Regression Problems Using Regression via
Classification [12.572218568705376]
We propose to use the regression-via-classification framework for streaming active learning for regression.
Regression-via-classification transforms regression problems into classification problems so that streaming active learning methods can be applied directly to regression problems.
arXiv Detail & Related papers (2023-09-02T20:24:24Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Engression: Extrapolation through the Lens of Distributional Regression [2.519266955671697]
We propose a neural network-based distributional regression methodology called engression'
An engression model is generative in the sense that we can sample from the fitted conditional distribution and is also suitable for high-dimensional outcomes.
We show that engression can successfully perform extrapolation under some assumptions such as monotonicity, whereas traditional regression approaches such as least-squares or quantile regression fall short under the same assumptions.
arXiv Detail & Related papers (2023-07-03T08:19:00Z) - ResMem: Learn what you can and memorize the rest [79.19649788662511]
We propose the residual-memorization (ResMem) algorithm to augment an existing prediction model.
By construction, ResMem can explicitly memorize the training labels.
We show that ResMem consistently improves the test set generalization of the original prediction model.
arXiv Detail & Related papers (2023-02-03T07:12:55Z) - Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
arXiv Detail & Related papers (2021-08-26T17:55:11Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance of guided gradient descent (IGSGD) method to train inference from inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Neural Symbolic Regression that Scales [58.45115548924735]
We introduce the first symbolic regression method that leverages large scale pre-training.
We procedurally generate an unbounded set of equations, and simultaneously pre-train a Transformer to predict the symbolic equation from a corresponding set of input-output-pairs.
arXiv Detail & Related papers (2021-06-11T14:35:22Z) - Quantile Encoder: Tackling High Cardinality Categorical Features in
Regression Problems [2.3322477552758234]
We provide an in-depth analysis of how to tackle high cardinality categor-ical features with the quantile.
Our proposal outperforms state-of-the-encoders, including the traditional statistical mean target encoder.
We also describe how to expand the encoded values by creating a set of features with different quantiles.
arXiv Detail & Related papers (2021-05-27T11:56:13Z) - Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing
Regressions In NLP Model Updates [68.09049111171862]
This work focuses on quantifying, reducing and analyzing regression errors in the NLP model updates.
We formulate the regression-free model updates into a constrained optimization problem.
We empirically analyze how model ensemble reduces regression.
arXiv Detail & Related papers (2021-05-07T03:33:00Z) - Simple Imputation Rules for Prediction with Missing Data: Contrasting
Theoretical Guarantees with Empirical Performance [7.642646077340124]
Missing data is a common issue in real-world datasets.
This paper studies the performance of impute-then-regress pipelines by contrasting theoretical and empirical evidence.
arXiv Detail & Related papers (2021-04-07T14:45:14Z) - Multitask Non-Autoregressive Model for Human Motion Prediction [33.98939145212708]
Non-auToregressive Model (NAT) is proposed with a complete non-autoregressive decoding scheme, as well as a context encoder and a positional encoding module.
Our approach is evaluated on Human3.6M and CMU-Mocap benchmarks and outperforms state-of-the-art autoregressive methods.
arXiv Detail & Related papers (2020-07-13T15:00:19Z) - Censored Quantile Regression Forest [81.9098291337097]
We develop a new estimating equation that adapts to censoring and leads to quantile score whenever the data do not exhibit censoring.
The proposed procedure named it censored quantile regression forest, allows us to estimate quantiles of time-to-event without any parametric modeling assumption.
arXiv Detail & Related papers (2020-01-08T23:20:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.