CAIRO: Decoupling Order from Scale in Regression
- URL: http://arxiv.org/abs/2602.14440v2
- Date: Sun, 22 Feb 2026 20:57:25 GMT
- Title: CAIRO: Decoupling Order from Scale in Regression
- Authors: Harri Vanhems, Yue Zhao, Peng Shi, Archer Y. Yang
- Abstract summary: We propose a framework that decouples regression into two distinct stages. In the first stage, we learn a scoring function by minimizing a scale-invariant ranking loss. In the second, we recover the target scale via isotonic regression.
- Score: 13.755937210012883
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Standard regression methods typically optimize a single pointwise objective, such as mean squared error, which conflates the learning of ordering with the learning of scale. This coupling renders models vulnerable to outliers and heavy-tailed noise. We propose CAIRO (Calibrate After Initial Rank Ordering), a framework that decouples regression into two distinct stages. In the first stage, we learn a scoring function by minimizing a scale-invariant ranking loss; in the second, we recover the target scale via isotonic regression. We theoretically characterize a class of "Optimal-in-Rank-Order" objectives -- including variants of RankNet and Gini covariance -- and prove that they recover the ordering of the true conditional mean under mild assumptions. We further show that subsequent monotone calibration recovers the true regression function at the population level and mathematically guarantees that finite-sample predictions are strictly auto-calibrated. Empirically, CAIRO combines the representation learning of neural networks with the robustness of rank-based statistics. It matches the performance of state-of-the-art tree ensembles on tabular benchmarks and significantly outperforms standard regression objectives in regimes with heavy-tailed or heteroskedastic noise.
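The two-stage recipe in the abstract (a scale-invariant pairwise ranking loss, then monotone calibration) can be sketched in a few lines. This is a minimal NumPy-only illustration under simplifying assumptions, not the authors' implementation: a linear scorer stands in for the paper's neural network, the function names are invented for this sketch, and the pool-adjacent-violators routine is the textbook isotonic-regression algorithm.

```python
import numpy as np

def fit_ranker(X, y, lr=0.5, epochs=300, batch=256, seed=0):
    """Stage 1 (sketch): learn a linear scoring function w by minimizing
    a RankNet-style pairwise logistic loss.  The loss depends only on
    score differences, so it is invariant to the scale of the targets."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(epochs):
        i = rng.integers(0, n, size=batch)
        j = rng.integers(0, n, size=batch)
        keep = y[i] != y[j]                    # drop uninformative pairs
        i, j = i[keep], j[keep]
        d = X[i] - X[j]                        # feature differences
        sign = np.where(y[i] > y[j], 1.0, -1.0)
        margin = np.clip(sign * (d @ w), -30.0, 30.0)
        # gradient of mean log(1 + exp(-margin)) w.r.t. w
        g = (-(sign / (1.0 + np.exp(margin)))[:, None] * d).mean(axis=0)
        w -= lr * g
    return w

def pava(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y."""
    vals, wts = [], []
    for v in y:
        vals.append(float(v)); wts.append(1.0)
        while len(vals) > 1 and vals[-2] > vals[-1]:
            v2, w2 = vals.pop(), wts.pop()
            v1, w1 = vals.pop(), wts.pop()
            vals.append((w1 * v1 + w2 * v2) / (w1 + w2))  # merge block
            wts.append(w1 + w2)
    return np.repeat(vals, np.asarray(wts, dtype=int))

def cairo_fit_predict(X, y, X_new):
    """Stage 2 (sketch): sort training points by score, fit an isotonic
    regression of y on the score order, then map new scores onto the
    calibrated curve by interpolation."""
    w = fit_ranker(X, y)
    s = X @ w
    order = np.argsort(s)
    calibrated = pava(y[order])                # monotone in the score
    return np.interp(X_new @ w, s[order], calibrated)
```

Because the calibration step is monotone in the learned score, predictions inherit the ranking from stage one and only borrow their scale from stage two.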
Related papers
- Toward Reliable Machine Unlearning: Theory, Algorithms, and Evaluation [1.7767466724342065]
We introduce Adversarial Machine UNlearning (AMUN), which surpasses prior state-of-the-art methods for image classification as measured by MIA scores. We show that existing methods fail to replicate a retrained model's behavior by introducing a nearest-neighbor membership inference attack (MIA-NN). We then propose a fine-tuning objective that mitigates this leakage by approximating, for forget-class inputs, the distribution over remaining classes that a model retrained from scratch would produce.
arXiv Detail & Related papers (2025-12-07T20:57:25Z) - Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing [58.52119063742121]
Retraining a model using its own predictions together with the original, potentially noisy labels is a well-known strategy for improving model performance. This paper addresses the question of how to optimally combine the model's predictions and the provided labels. Our main contribution is the derivation of the Bayes-optimal aggregator function to combine the current model's predictions and the given labels.
arXiv Detail & Related papers (2025-05-21T07:16:44Z) - Benign Overfitting and the Geometry of the Ridge Regression Solution in Binary Classification [75.01389991485098]
We show that ridge regression has qualitatively different behavior depending on the scale of the cluster mean vector. In regimes where the scale is very large, the conditions that allow for benign overfitting turn out to be the same as those for the regression task.
arXiv Detail & Related papers (2025-03-11T01:45:42Z) - Generalization bounds for regression and classification on adaptive covering input domains [1.4141453107129398]
We focus on the generalization bound, which serves as an upper limit for the generalization error.
In the case of classification tasks, we treat the target function as a one-hot, a piece-wise constant function, and employ 0/1 loss for error measurement.
arXiv Detail & Related papers (2024-07-29T05:40:08Z) - Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) without the requirement of prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - Regularized Linear Regression for Binary Classification [20.710343135282116]
Regularized linear regression is a promising approach for binary classification problems in which the training set has noisy labels.
We show that for large enough regularization strength, the optimal weights concentrate around two values of opposite sign.
We observe that in many cases the corresponding "compression" of each weight to a single bit leads to very little loss in performance.
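The one-bit "compression" mentioned above can be illustrated as follows. This is a generic sketch, not the paper's procedure: each weight is replaced by its sign times a shared magnitude, and choosing that magnitude as the mean absolute weight minimizes the squared compression error.

```python
import numpy as np

def compress_to_sign(w):
    """One-bit weight compression: keep only sign(w_i), with a shared
    magnitude a = mean(|w|).  This choice of a minimizes
    ||w - a * sign(w)||_2^2, since the residual per coordinate is
    (|w_i| - a)^2 and the mean minimizes squared deviations."""
    w = np.asarray(w, dtype=float)
    return np.mean(np.abs(w)) * np.sign(w)
```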
arXiv Detail & Related papers (2023-11-03T23:18:21Z) - Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often formulated into classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
arXiv Detail & Related papers (2023-10-26T04:54:39Z) - Discrete-Constrained Regression for Local Counting Models [27.1177471719278]
Local counts, i.e., the numbers of objects in local areas, are continuous values by nature.
Recent state-of-the-art methods show that formulating counting as a classification task performs better than regression.
We show that this result is caused by imprecise ground truth local counts.
arXiv Detail & Related papers (2022-07-20T12:54:23Z) - THOR: Threshold-Based Ranking Loss for Ordinal Regression [17.384197085002686]
We present a regression-based algorithm for ordinal regression, i.e., supervised classification of instances into ordinal categories.
We add a novel threshold-based pairwise loss function that aims at minimizing the regression error, which in turn minimizes the Mean Absolute Error (MAE) measure.
Experimental results on five real-world benchmarks demonstrate that the proposed algorithm achieves the best MAE results compared to state-of-the-art ordinal regression algorithms.
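A common way to realize a threshold-based ordinal loss of the kind this summary describes is the "all-thresholds" construction, whose sum of per-threshold hinge penalties upper-bounds the MAE. The sketch below is a generic instance of that idea, not THOR's exact loss; the threshold values and function name are illustrative.

```python
def all_threshold_loss(score, label, thresholds):
    """For K ordered categories 0..K-1 with K-1 learned thresholds,
    penalize the score for being on the wrong side of each threshold:
    it should exceed t_k whenever label > k and fall below it otherwise.
    Each term is a hinge, so the total is a convex surrogate that
    upper-bounds the absolute rank error (MAE)."""
    loss = 0.0
    for k, t in enumerate(thresholds):
        sign = 1.0 if label > k else -1.0
        loss += max(0.0, 1.0 - sign * (score - t))
    return loss
```

A score placed deep inside its category's threshold interval incurs zero loss; a score that skips r thresholds incurs at least r units of loss, mirroring an MAE of r.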
arXiv Detail & Related papers (2022-05-10T13:04:09Z) - Rethinking Depth Estimation for Multi-View Stereo: A Unified
Representation and Focal Loss [13.814071773168044]
We propose a novel representation, termed Unification, to unify the advantages of regression and classification.
It can directly constrain the cost volume like classification methods, but also realize the sub-pixel depth prediction like regression methods.
To excavate the potential of unification, we design a new loss function named Unified Focal Loss, which is more uniform and reasonable to combat the challenge of sample imbalance.
arXiv Detail & Related papers (2022-01-05T08:49:31Z) - Learning by Minimizing the Sum of Ranked Range [58.24935359348289]
We introduce the sum of ranked range (SoRR) as a general approach to form learning objectives.
A ranked range is a consecutive sequence of sorted values of a set of real numbers.
We explore two applications in machine learning of the minimization of the SoRR framework, namely the AoRR aggregate loss for binary classification and the TKML individual loss for multi-label/multi-class classification.
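The SoRR quantity itself is straightforward to compute: sort, then sum the values whose ranks fall in a chosen range. A small sketch (function name ours):

```python
import numpy as np

def sum_of_ranked_range(values, k, m):
    """Sum of ranked range: the sum of the (k+1)-th through m-th largest
    values, i.e., descending-sorted ranks k..m-1 (0-indexed).  k = 0
    recovers the top-m sum; dividing by (m - k) gives the AoRR average."""
    s = np.sort(np.asarray(values, dtype=float))[::-1]
    return float(s[k:m].sum())
```

Trimming the top k values makes the aggregate robust to outliers, while capping at rank m keeps it focused on hard examples rather than easy ones.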
arXiv Detail & Related papers (2020-10-05T01:58:32Z) - Adaptive Correlated Monte Carlo for Contextual Categorical Sequence
Generation [77.7420231319632]
We adapt contextual generation of categorical sequences to a policy gradient estimator, which evaluates a set of correlated Monte Carlo (MC) rollouts for variance control.
We also demonstrate the use of correlated MC rollouts for binary-tree softmax models, which reduce the high generation cost in large vocabulary scenarios.
arXiv Detail & Related papers (2019-12-31T03:01:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all generated summaries) and is not responsible for any consequences of its use.