Continuously Generalized Ordinal Regression for Linear and Deep Models
- URL: http://arxiv.org/abs/2202.07005v1
- Date: Mon, 14 Feb 2022 19:49:05 GMT
- Title: Continuously Generalized Ordinal Regression for Linear and Deep Models
- Authors: Fred Lu, Francis Ferraro, Edward Raff
- Abstract summary: Ordinal regression is a classification task where classes have an order and prediction error increases the further the predicted class is from the true class.
We propose a new approach for modeling ordinal data that allows class-specific hyperplane slopes.
Our method significantly outperforms the standard ordinal logistic model over a thorough set of ordinal regression benchmark datasets.
- Score: 41.03778663275373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ordinal regression is a classification task where classes have an order and
prediction error increases the further the predicted class is from the true
class. The standard approach for modeling ordinal data involves fitting
parallel separating hyperplanes that optimize a certain loss function. This
assumption offers sample efficient learning via inductive bias, but is often
too restrictive in real-world datasets where features may have varying effects
across different categories. Allowing class-specific hyperplane slopes creates
generalized logistic ordinal regression, increasing the flexibility of the
model at a cost to sample efficiency. We explore an extension of the
generalized model to the all-thresholds logistic loss and propose a
regularization approach that interpolates between these two extremes. Our
method, which we term continuously generalized ordinal logistic, significantly
outperforms the standard ordinal logistic model over a thorough set of ordinal
regression benchmark datasets. We further extend this method to deep learning
and show that it achieves competitive or lower prediction error compared to
previous models over a range of datasets and modalities. Furthermore, two
primary alternative models for deep learning ordinal regression are shown to be
special cases of our framework.
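For intuition, here is a minimal NumPy sketch of the interpolating objective, assuming the interpolation is implemented as a squared penalty that ties each threshold-specific slope toward the shared mean slope (the paper's exact regularizer may differ):
```python
import numpy as np

def cg_ordinal_loss(W, b, X, y, lam):
    """All-thresholds logistic loss with class-specific slopes, plus an
    interpolating penalty (sketch; the paper's exact penalty may differ).

    W   : (K-1, d) slope per threshold      b : (K-1,) threshold biases
    X   : (n, d)   features                 y : (n,)   labels in {0..K-1}
    lam : penalty strength; lam = 0 is the fully generalized model,
          lam -> inf ties all slopes together, recovering the standard
          parallel-hyperplane model.
    """
    K1 = W.shape[0]
    scores = X @ W.T - b                      # (n, K-1) threshold margins
    # +1 if the true label lies above threshold k, else -1
    signs = np.where(y[:, None] > np.arange(K1), 1.0, -1.0)
    data_loss = np.logaddexp(0.0, -signs * scores).sum(axis=1).mean()
    # tie each threshold's slope toward the shared mean slope
    reg = lam * ((W - W.mean(axis=0)) ** 2).sum()
    return data_loss + reg
```
Sweeping lam traces out the family of models between the two extremes; in practice it would be tuned by validation.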
Related papers
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
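A quick simulation of the kind of quantity such derivations describe, using plain ridge regression at several overparameterization ratios (illustrative only; the paper's analysis is exact and far more general):
```python
import numpy as np

# Ridge train/test error as the dimension-to-sample ratio d/n varies.
rng = np.random.default_rng(0)
n, lam = 100, 1e-1
for d in (25, 100, 400):
    w_true = rng.normal(size=d) / np.sqrt(d)
    Xtr, Xte = rng.normal(size=(n, d)), rng.normal(size=(1000, d))
    ytr = Xtr @ w_true + 0.3 * rng.normal(size=n)
    yte = Xte @ w_true + 0.3 * rng.normal(size=1000)
    w = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(d), Xtr.T @ ytr)
    print(d / n, np.mean((Xtr @ w - ytr) ** 2), np.mean((Xte @ w - yte) ** 2))
```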
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Adaptive Optimization for Prediction with Missing Data [6.800113478497425]
We show that some adaptive linear regression models are equivalent to learning an imputation rule and a downstream linear regression model simultaneously.
In settings where data is strongly not missing at random, our methods achieve a 2-10% improvement in out-of-sample accuracy.
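A hypothetical toy version of that equivalence: constant imputation values mu and linear weights w are fit jointly by gradient descent (not the paper's estimator):
```python
import numpy as np

# Jointly learn an imputation rule (constant per-feature values mu) and a
# downstream linear model w on squared loss; illustration only.
rng = np.random.default_rng(0)
n, d = 200, 5
X_full = rng.normal(size=(n, d))
y = X_full @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
mask = rng.random((n, d)) < 0.2               # True where a value is missing

mu, w, lr = np.zeros(d), np.zeros(d), 0.05
for _ in range(500):
    X_imp = np.where(mask, mu, X_full)        # impute, then regress
    r = X_imp @ w - y                         # residuals
    grad_w = X_imp.T @ r / n
    grad_mu = (mask * r[:, None] * w).sum(axis=0) / n  # chain rule via mask
    w -= lr * grad_w
    mu -= lr * grad_mu
```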
arXiv Detail & Related papers (2024-02-02T16:35:51Z)
- Scalable Estimation for Structured Additive Distributional Regression [0.0]
We propose a novel backfitting algorithm based on the ideas of gradient descent that can deal with virtually any amount of data on a conventional laptop.
Performance is evaluated using an extensive simulation study and an exceptionally challenging and unique example of lightning count prediction over Austria.
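For context, a generic backfitting loop cycles through additive components, refitting each on the partial residuals of the rest; the sketch below uses a placeholder least-squares component fitter and omits the paper's gradient-descent batching:
```python
import numpy as np

# Generic backfitting for an additive model y ~ sum_j f_j(x_j): refit each
# component on the partial residuals of the others until convergence.
def backfit(X, y, fit_component, n_iter=20):
    n, d = X.shape
    contrib = np.zeros((d, n))            # current fitted values of each f_j
    for _ in range(n_iter):
        for j in range(d):
            partial = y - contrib.sum(axis=0) + contrib[j]
            contrib[j] = fit_component(X[:, j], partial)
    return contrib

# Example component fitter: a simple least-squares line per coordinate.
def fit_component(xj, r):
    A = np.stack([xj, np.ones_like(xj)], axis=1)
    coef, *_ = np.linalg.lstsq(A, r, rcond=None)
    return A @ coef
```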
arXiv Detail & Related papers (2023-01-13T14:59:42Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
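For reference, a generic MICE-style iterative column-wise imputation loop is sketched below with a fixed linear learner per column; HyperImpute's automatic per-column model selection is exactly what this sketch omits:
```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Iterative column-wise imputation: initialize with column means, then
# repeatedly regress each column on the others and refill its missing cells.
def iterative_impute(X, n_iter=10):
    X = X.copy()
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    for j in range(X.shape[1]):
        X[miss[:, j], j] = col_means[j]
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if not miss[:, j].any():
                continue
            rows = ~miss[:, j]
            other = np.delete(X, j, axis=1)
            model = LinearRegression().fit(other[rows], X[rows, j])
            X[miss[:, j], j] = model.predict(other[miss[:, j]])
    return X
```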
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Ordinal-ResLogit: Interpretable Deep Residual Neural Networks for Ordered Choices [6.982614422666432]
We develop a fully interpretable deep learning-based ordinal regression model.
Formulations for market share, substitution patterns, and elasticities are derived.
Our results show that Ordinal-ResLogit outperforms the traditional ordinal regression model for both datasets.
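For background, the standard multinomial-logit elasticity formulas (which the paper extends to its ordinal residual setting) look like this; beta and x below are made-up illustration values:
```python
import numpy as np

# Textbook multinomial-logit point elasticities, not the paper's derivations.
beta = -0.8                                    # coefficient on the attribute
x = np.array([2.0, 3.0, 1.5])                  # attribute value per alternative
p = np.exp(beta * x) / np.exp(beta * x).sum()  # choice probabilities
own = beta * x * (1 - p)       # own elasticity: % change in P_i per % change in x_i
cross = -beta * x * p          # cross elasticity: % change in P_j per % change in x_i
print(p, own, cross)
```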
arXiv Detail & Related papers (2022-04-20T02:14:28Z)
- Nonparametric Functional Analysis of Generalized Linear Models Under Nonlinear Constraints [0.0]
This article introduces a novel nonparametric methodology for Generalized Linear Models.
It combines the strengths of the binary regression and latent variable formulations for categorical data.
It extends recently published parametric versions of the methodology and generalizes it.
arXiv Detail & Related papers (2021-10-11T04:49:59Z)
- X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To take the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
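The minimax pattern can be illustrated with a generic alternating-update loop in which one parameter set ascends and the other descends a shared objective; the MSE objective below is a placeholder, not the X-model's actual loss:
```python
import torch
import torch.nn.functional as F

# Generic alternating minimax updates between a feature extractor and a head.
extractor = torch.nn.Linear(10, 8)
head = torch.nn.Linear(8, 1)
opt_ext = torch.optim.SGD(extractor.parameters(), lr=1e-2)
opt_head = torch.optim.SGD(head.parameters(), lr=1e-2)
x, y = torch.randn(32, 10), torch.randn(32, 1)

for _ in range(100):
    # head step: ascend the objective (descend its negation); detach the
    # extractor so only the head moves here
    loss = F.mse_loss(head(extractor(x).detach()), y)
    opt_head.zero_grad(); (-loss).backward(); opt_head.step()
    # extractor step: descend the objective
    loss = F.mse_loss(head(extractor(x)), y)
    opt_ext.zero_grad(); loss.backward(); opt_ext.step()
```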
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
- Split Modeling for High-Dimensional Logistic Regression [0.2676349883103404]
A novel method is proposed for building an ensemble of logistic classification models.
Our method learns how to exploit the bias-variance trade-off, resulting in excellent prediction accuracy.
An open-source software library implementing the proposed method is discussed.
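As a generic point of comparison (not the paper's split-modeling scheme), a plain bagged ensemble of logistic classifiers looks like:
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Bootstrap-aggregated logistic classifiers; averaging the members'
# probabilities reduces variance relative to a single model.
def bagged_logistic(X, y, n_models=10, seed=0):
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))   # bootstrap sample
        models.append(LogisticRegression(max_iter=1000).fit(X[idx], y[idx]))
    return models

def predict_proba(models, X):
    return np.mean([m.predict_proba(X) for m in models], axis=0)
```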
arXiv Detail & Related papers (2021-02-17T05:57:26Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
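Directly maximizing an expected reward over discrete samples is typically done with the score-function (REINFORCE) gradient; a minimal bandit-sized sketch, with a made-up reward, is:
```python
import torch

# REINFORCE on a categorical policy: push probability toward high-reward
# actions via the score-function gradient -E[R * log pi(a)].
logits = torch.zeros(10, requires_grad=True)     # policy over 10 actions
opt = torch.optim.Adam([logits], lr=0.1)

def reward(a):                                   # placeholder reward
    return float(a == 7)

for _ in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    a = dist.sample()
    loss = -reward(a.item()) * dist.log_prob(a)
    opt.zero_grad(); loss.backward(); opt.step()
```
The paper's conditional generative formulation is far richer than this single-step toy, but the gradient estimator is the same idea.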
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
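A quick simulation of the setup: draw many linear classifiers that all fit the training labels exactly (the min-norm solution plus random null-space components) and inspect the spread of their test errors; the paper characterizes this distribution analytically rather than by sampling:
```python
import numpy as np

# Sample interpolating linear classifiers and measure their test errors.
rng = np.random.default_rng(0)
n_train, n_test, d = 20, 2000, 50
w_true = rng.normal(size=d)
Xtr = rng.normal(size=(n_train, d))
Xte = rng.normal(size=(n_test, d))
ytr, yte = np.sign(Xtr @ w_true), np.sign(Xte @ w_true)

pinv = np.linalg.pinv(Xtr)
w_mn = pinv @ ytr                           # min-norm interpolator
P_null = np.eye(d) - pinv @ Xtr             # projector onto null(Xtr)
errs = []
for _ in range(500):
    w = w_mn + P_null @ rng.normal(size=d)  # still fits training data exactly
    errs.append(np.mean(np.sign(Xte @ w) != yte))
print(f"typical test error {np.mean(errs):.3f}, worst {np.max(errs):.3f}")
```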
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.