Deep Neural Networks for Rank-Consistent Ordinal Regression Based On
Conditional Probabilities
- URL: http://arxiv.org/abs/2111.08851v5
- Date: Thu, 1 Jun 2023 00:40:22 GMT
- Title: Deep Neural Networks for Rank-Consistent Ordinal Regression Based On
Conditional Probabilities
- Authors: Xintong Shi, Wenzhi Cao, Sebastian Raschka
- Abstract summary: Ordinal regression methods for deep neural networks address ordinal response variables.
The CORAL method achieves rank consistency among its output layer tasks by imposing a weight-sharing constraint.
We propose a new method for rank-consistent ordinal regression without this limitation.
- Score: 3.093890460224435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent times, deep neural networks have achieved outstanding predictive
performance on various classification and pattern recognition tasks. However,
many real-world prediction problems have ordinal response variables, and this
ordering information is ignored by conventional classification losses such as
the multi-category cross-entropy. Ordinal regression methods for deep neural
networks address this. One such method is the CORAL method, which is based on
an earlier binary label extension framework and achieves rank consistency among
its output layer tasks by imposing a weight-sharing constraint. However, while
earlier experiments showed that CORAL's rank consistency is beneficial for
performance, it is limited by a weight-sharing constraint in a neural network's
fully connected output layer, which may restrict the expressiveness and
capacity of a network trained using CORAL. We propose a new method for
rank-consistent ordinal regression without this limitation. Our rank-consistent
ordinal regression framework (CORN) achieves rank consistency by a novel
training scheme. This training scheme uses conditional training sets to obtain
the unconditional rank probabilities through applying the chain rule for
conditional probability distributions. Experiments on various datasets
demonstrate that the proposed method effectively utilizes the ordinal target
information and that removing the weight-sharing restriction substantially
improves performance compared to the CORAL reference approach.
Additionally, the suggested CORN method is not tied to any specific
architecture and can be utilized with any deep neural network classifier to
train it for ordinal regression tasks.
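To make the training scheme concrete, the sketch below illustrates the conditional-subset loss and the chain-rule inference in PyTorch. It is a minimal illustration written for this summary, not the authors' reference implementation; the function names (corn_loss, corn_predict) and the assumption that the network's final linear layer emits num_classes - 1 logits, one per binary subtask, are ours.

```python
import torch
import torch.nn.functional as F

def corn_loss(logits, y, num_classes):
    """CORN-style loss sketch. `logits` has shape (batch, num_classes - 1),
    one column per binary subtask; `y` holds integer rank indices in
    {0, ..., num_classes - 1}. Subtask k is trained only on the conditional
    subset of examples whose rank already exceeds k - 1."""
    loss, total = 0.0, 0
    for k in range(num_classes - 1):
        mask = y > (k - 1)              # conditional training subset for task k
        if mask.sum() == 0:
            continue
        target = (y[mask] > k).float()  # binary label: does the rank exceed k?
        loss = loss + F.binary_cross_entropy_with_logits(
            logits[mask, k], target, reduction="sum")
        total = total + mask.sum()
    return loss / total

def corn_predict(logits):
    """Chain rule: the unconditional P(rank > k) is the running product of the
    conditional sigmoid outputs, which is non-increasing in k by construction,
    so the resulting rank predictions are always consistent."""
    cond_probs = torch.sigmoid(logits)               # P(rank > k | rank > k - 1)
    uncond_probs = torch.cumprod(cond_probs, dim=1)  # P(rank > k)
    return (uncond_probs > 0.5).sum(dim=1)           # predicted rank index
```

Under these assumptions, a backbone producing logits of shape (batch, num_classes - 1) can be trained by minimizing corn_loss(logits, y, num_classes), and corn_predict(logits) returns rank indices that are consistent by construction, with no weight-sharing constraint on the output layer.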
Related papers
- Concurrent Training and Layer Pruning of Deep Neural Networks [0.0]
We propose an algorithm capable of identifying and eliminating irrelevant layers of a neural network during the early stages of training.
We employ a structure using residual connections around nonlinear network sections that allow the flow of information through the network once a nonlinear section is pruned.
arXiv Detail & Related papers (2024-06-06T23:19:57Z)
- Robust Stochastically-Descending Unrolled Networks [85.6993263983062]
Deep unrolling is an emerging learning-to-optimize method that unrolls a truncated iterative algorithm in the layers of a trainable neural network.
However, convergence guarantees and generalizability of the unrolled networks remain open theoretical problems.
We numerically assess unrolled architectures trained under the proposed constraints in two different applications.
arXiv Detail & Related papers (2023-12-25T18:51:23Z)
- Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters [69.24377241408851]
Overfitting to the source domain is a common issue in gradient-based training of deep neural networks.
We propose to base the selection on the gradient-signal-to-noise ratio (GSNR) of a network's parameters.
arXiv Detail & Related papers (2023-10-11T10:21:34Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z)
- Bayesian Layer Graph Convolutional Network for Hyperspectral Image Classification [24.91896527342631]
Graph convolutional network (GCN) based models have shown impressive performance.
Deep learning frameworks based on point estimation suffer from low generalization and an inability to quantify the uncertainty of classification results.
In this paper, we propose a Bayesian layer that can be inserted into point-estimation-based neural networks.
A Generative Adversarial Network (GAN) is built to solve the sample imbalance problem of the HSI dataset.
arXiv Detail & Related papers (2022-11-14T12:56:56Z)
- Compare Where It Matters: Using Layer-Wise Regularization To Improve Federated Learning on Heterogeneous Data [0.0]
Federated Learning is a widely adopted method to train neural networks over distributed data.
One main limitation is the performance degradation that occurs when data is heterogeneously distributed.
We present FedCKA: a framework that outperforms previous state-of-the-art methods on various deep learning tasks.
arXiv Detail & Related papers (2021-12-01T10:46:13Z)
- Universally Rank Consistent Ordinal Regression in Neural Networks [4.462334751640166]
Recent methods have resorted to converting ordinal regression into a series of extended binary classification subtasks.
Here we demonstrate that the subtask probabilities form a Markov chain.
We show how to straightforwardly modify neural network architectures to exploit this fact and thereby constrain predictions to be universally rank consistent; a minimal sketch of this idea follows after this entry.
arXiv Detail & Related papers (2021-10-14T15:44:08Z)
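The Markov-chain view described in the entry above can be sketched as a small output head. The module name (RankConsistentHead) and its interface are hypothetical, introduced only to illustrate how chaining conditional sigmoid outputs with a cumulative product forces the unconditional probabilities to be non-increasing, which makes any thresholded prediction rank consistent.

```python
import torch
import torch.nn as nn

class RankConsistentHead(nn.Module):
    """Illustrative output head (hypothetical name): each sigmoid models a
    conditional probability P(rank > r_k | rank > r_{k-1}); chaining them with
    a cumulative product yields unconditional probabilities that can only
    decrease with k, so 0.5-thresholding them gives rank-consistent predictions."""

    def __init__(self, in_features, num_classes):
        super().__init__()
        self.fc = nn.Linear(in_features, num_classes - 1)

    def forward(self, features):
        cond = torch.sigmoid(self.fc(features))  # conditional subtask probabilities
        return torch.cumprod(cond, dim=1)        # Markov chain: P(rank > r_k)
```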
- Deep Ordinal Regression with Label Diversity [19.89482062012177]
We propose that using several discrete data representations simultaneously can improve neural network learning.
Our approach is end-to-end differentiable and can be added as a simple extension to conventional learning methods.
arXiv Detail & Related papers (2020-06-29T08:23:43Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient-based optimization combined with the non-convexity of the underlying problem makes parameter learning sensitive to initialization.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)