Rectifying Regression in Reinforcement Learning
- URL: http://arxiv.org/abs/2510.00885v1
- Date: Wed, 01 Oct 2025 13:32:07 GMT
- Title: Rectifying Regression in Reinforcement Learning
- Authors: Alex Ayoub, David Szepesvári, Alireza Bakhtiari, Csaba Szepesvári, Dale Schuurmans
- Abstract summary: We show that mean absolute error is a better prediction objective than the traditional mean squared error for controlling the learned policy's suboptimality gap. We also present results showing that different loss functions are better aligned with these different regression objectives.
- Score: 51.28909745713678
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates the impact of the loss function in value-based methods for reinforcement learning through an analysis of underlying prediction objectives. We theoretically show that mean absolute error is a better prediction objective than the traditional mean squared error for controlling the learned policy's suboptimality gap. Furthermore, we present results showing that different loss functions are better aligned with these different regression objectives: binary and categorical cross-entropy losses with the mean absolute error, and the squared loss with the mean squared error. We then provide empirical evidence that algorithms minimizing these cross-entropy losses can outperform those based on the squared loss in linear reinforcement learning.
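To make the abstract's contrast concrete, here is a toy sketch, not the paper's algorithm, comparing the two regression heads it discusses on the same scalar targets: a squared-loss linear fit versus a categorical cross-entropy fit with "two-hot" targets over a fixed support of atoms. The support size, step size, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, m = 8, 512, 51
X = rng.normal(size=(n, d))                             # state features
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)   # noisy value targets

# (a) Squared loss: the classical mean-squared-error fit, in closed form.
w_mse = np.linalg.lstsq(X, y, rcond=None)[0]

# (b) Categorical cross-entropy over a fixed support, with "two-hot"
#     targets that spread each scalar over its two neighbouring atoms.
support = np.linspace(y.min(), y.max(), m)
idx = np.clip(np.searchsorted(support, y) - 1, 0, m - 2)
frac = (y - support[idx]) / (support[idx + 1] - support[idx])
P = np.zeros((n, m))
P[np.arange(n), idx] = 1.0 - frac
P[np.arange(n), idx + 1] = frac

W = np.zeros((d, m))                          # logits are linear in the features
for _ in range(500):                          # plain full-batch gradient descent
    logits = X @ W
    Q = np.exp(logits - logits.max(axis=1, keepdims=True))
    Q /= Q.sum(axis=1, keepdims=True)
    W -= (0.2 / n) * X.T @ (Q - P)            # softmax cross-entropy gradient

logits = X @ W                                # recompute with the final weights
Q = np.exp(logits - logits.max(axis=1, keepdims=True))
Q /= Q.sum(axis=1, keepdims=True)
v_ce = (Q * support).sum(axis=1)              # value = mean of the categorical dist.

print("mean abs error, squared-loss head:", np.mean(np.abs(X @ w_mse - y)))
print("mean abs error, cross-entropy head:", np.mean(np.abs(v_ce - y)))
```

Reading the value out as the mean of the predicted categorical distribution mirrors how categorical losses are commonly used for value prediction.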
Related papers
- Fair Regression under Demographic Parity: A Unified Framework [12.36726423996741]
Our framework is applicable to a broad spectrum of regression tasks. We derive a novel characterization of the fair risk minimizer. We illustrate the method's versatility through detailed discussions.
arXiv Detail & Related papers (2026-01-15T17:41:28Z)
- A Comparative Study of Invariance-Aware Loss Functions for Deep Learning-based Gridless Direction-of-Arrival Estimation [19.100476521802243]
We propose new loss functions that are invariant to the scaling of the matrices (a minimal sketch of the idea follows this entry). We show that a scale-invariant loss outperforms its non-invariant counterpart but is inferior to the recently proposed subspace loss.
arXiv Detail & Related papers (2025-03-16T07:15:16Z)
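The scale-invariance idea in the entry above can be illustrated, purely as an assumed construction rather than the paper's proposed losses, by normalising away a global scale before comparing matrices:

```python
import numpy as np

def scale_invariant_frobenius(A, B, eps=1e-12):
    """Frobenius distance after normalising away a global scale factor.

    An illustrative construction only; the paper's invariance-aware
    losses are defined differently in detail.
    """
    A = A / (np.linalg.norm(A) + eps)
    B = B / (np.linalg.norm(B) + eps)
    return np.linalg.norm(A - B) ** 2

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(scale_invariant_frobenius(A, 10.0 * A))  # ~0: scaling is ignored
```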
- Of Dice and Games: A Theory of Generalized Boosting [61.752303337418475]
We extend the celebrated theory of boosting to incorporate both cost-sensitive and multi-objective losses. We develop a comprehensive theory of cost-sensitive and multi-objective boosting, providing a taxonomy of weak learning guarantees. Our characterization relies on a geometric interpretation of boosting, revealing a surprising equivalence between cost-sensitive and multi-objective losses.
arXiv Detail & Related papers (2024-12-11T01:38:32Z)
- Learning Latent Graph Structures and their Uncertainty [63.95971478893842]
We show that minimizing point-prediction losses does not guarantee proper learning of latent relational information. We propose a sampling-based method that solves this joint learning task.
arXiv Detail & Related papers (2024-05-30T10:49:22Z)
- Investigating the Histogram Loss in Regression [16.83443393563771]
The Histogram Loss is a regression approach that learns the conditional distribution of a target variable (the usual target construction is sketched after this entry).
We show that the benefits of learning distributions in this setup come from improvements in optimization rather than from modelling extra information.
arXiv Detail & Related papers (2024-02-20T23:29:41Z)
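As a rough illustration of the target construction commonly used with the Histogram Loss, a Gaussian centred at the scalar target is smoothed over fixed bins; the bin count, range, and smoothing width below are assumptions, not the paper's settings:

```python
import numpy as np
from math import erf, sqrt

def histogram_target(y, edges, sigma=0.05):
    """Mass that N(y, sigma^2) places in each bin, renormalised over the range."""
    cdf = np.array([0.5 * (1.0 + erf((e - y) / (sigma * sqrt(2.0)))) for e in edges])
    p = np.diff(cdf)                  # per-bin probability mass
    return p / p.sum()                # renormalise the truncated tails

edges = np.linspace(-1.0, 1.0, 33)    # 32 bins over an assumed target range
p = histogram_target(0.25, edges)
# Training then minimises cross-entropy between p and the model's
# predicted histogram: loss = -(p * log_q).sum()
```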
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics (its standard binary form is sketched after this entry).
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
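For reference, the unhinged loss in its standard binary-classification form is simply linear in the margin; whether this matches the entry's exact definition is an assumption on our part:

```python
def unhinged_loss(y, score):
    """Unhinged loss for a label y in {-1, +1}: linear in the margin y * score."""
    return 1.0 - y * score
```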
- A Unified Generalization Analysis of Re-Weighting and Logit-Adjustment for Imbalanced Learning [129.63326990812234]
We propose a technique named data-dependent contraction to capture how modified losses handle different classes.
On top of this technique, a fine-grained generalization bound is established for imbalanced learning, which helps reveal the mystery of re-weighting and logit-adjustment.
arXiv Detail & Related papers (2023-10-07T09:15:08Z)
- Regression as Classification: Influence of Task Formulation on Neural Network Features [16.239708754973865]
Neural networks can be trained to solve regression problems by using gradient-based methods to minimize the square loss.
However, practitioners often prefer to reformulate regression as a classification problem, observing that training on the cross-entropy loss results in better performance (the reformulation is sketched after this entry).
By focusing on two-layer ReLU networks, we explore how the implicit bias induced by gradient-based optimization could partly explain the phenomenon.
arXiv Detail & Related papers (2022-11-10T15:13:23Z)
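The reformulation described in the entry above, in its simplest assumed form: discretise the target range into bins, train an ordinary classifier with cross-entropy on the bin indices, and decode predictions back to bin centres. The bin count and target range here are illustrative:

```python
import numpy as np

edges = np.linspace(0.0, 1.0, 21)         # 20 bins over an assumed target range
centers = 0.5 * (edges[:-1] + edges[1:])

def to_class(y):
    """Map each scalar target to the index of the bin containing it."""
    return np.clip(np.digitize(y, edges) - 1, 0, len(edges) - 2)

def from_class(c):
    """Decode a class index back to its bin centre."""
    return centers[c]

y = np.array([0.03, 0.47, 0.98])
c = to_class(y)                            # classification targets for cross-entropy
print(c, from_class(c))                    # decoded regression predictions
```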
- Gradient descent follows the regularization path for general losses [33.155195855431344]
We show that for empirical risk minimization over linear predictors with arbitrary convex losses, the gradient-descent path and the algorithm-independent regularization path converge to the same direction.
We provide a justification for the widely-used exponentially-tailed losses.
arXiv Detail & Related papers (2020-06-19T17:01:25Z)
- On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [57.957466608543676]
We analyze the influence of adversarial training on the loss landscape of machine learning models.
We show that the adversarial loss landscape is less favorable to optimization, due to increased curvature and more scattered gradients.
arXiv Detail & Related papers (2020-06-15T13:50:23Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that, compared to data augmentation, feature averaging reduces generalization error when used with convex losses and tightens PAC-Bayes bounds (a miniature sketch follows this entry).
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
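Feature averaging, contrasted with data augmentation in the last entry, can be shown in miniature: averaging a feature map over a finite set of transformations makes the result invariant by construction. The feature map and augmentation set here are stand-ins, not the paper's models:

```python
import numpy as np

def features(x):
    """Stand-in feature map; any network body could sit here."""
    return np.tanh(x)

def averaged_features(x, augmentations):
    """Average features over a finite augmentation set (invariant by construction)."""
    return np.mean([features(a(x)) for a in augmentations], axis=0)

x = np.arange(6.0).reshape(2, 3)
augs = [lambda z: z, lambda z: z[:, ::-1]]     # identity and horizontal flip
phi = averaged_features(x, augs)
print(np.allclose(phi, averaged_features(x[:, ::-1], augs)))  # True: flip-invariant
```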
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.