Predicting Rare Events by Shrinking Towards Proportional Odds
- URL: http://arxiv.org/abs/2305.18700v1
- Date: Tue, 30 May 2023 02:50:08 GMT
- Title: Predicting Rare Events by Shrinking Towards Proportional Odds
- Authors: Gregory Faletto and Jacob Bien
- Abstract summary: We show that the more abundant data in earlier steps may be leveraged to improve estimation of probabilities of rare events.
We present PRESTO, a relaxation of the proportional odds model for ordinal regression.
We prove that PRESTO consistently estimates the decision boundary weights under a sparsity assumption.
- Score: 1.599072005190786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training classifiers is difficult with severe class imbalance, but many rare
events are the culmination of a sequence with much more common intermediate
outcomes. For example, in online marketing a user first sees an ad, then may
click on it, and finally may make a purchase; estimating the probability of
purchases is difficult because of their rarity. We show both theoretically and
through data experiments that the more abundant data in earlier steps may be
leveraged to improve estimation of probabilities of rare events. We present
PRESTO, a relaxation of the proportional odds model for ordinal regression.
Instead of estimating weights for one separating hyperplane that is shifted by
separate intercepts for each of the estimated Bayes decision boundaries between
adjacent pairs of categorical responses, we estimate separate weights for each
of these transitions. We impose an L1 penalty on the differences between
weights for the same feature in adjacent weight vectors in order to shrink
towards the proportional odds model. We prove that PRESTO consistently
estimates the decision boundary weights under a sparsity assumption. Synthetic
and real data experiments show that our method can estimate rare probabilities
in this setting better than both logistic regression on the rare category,
which fails to borrow strength from more abundant categories, and the
proportional odds model, which is too inflexible.
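To make the construction above concrete, here is a minimal sketch of a PRESTO-style penalized objective: each decision boundary between adjacent ordinal categories gets its own weight vector, and an L1 penalty on the differences between adjacent weight vectors shrinks the fit toward the proportional odds model. The function name, sign conventions, numerical safeguards, and the absence of a monotonicity constraint on the cumulative probabilities are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def presto_objective(alphas, betas, X, y, lam):
    """Illustrative penalized negative log-likelihood for a PRESTO-style model.

    alphas : (K-1,) intercepts, one per decision boundary
    betas  : (K-1, p) weight vectors, one per boundary; the proportional
             odds model corresponds to all rows being identical
    X      : (n, p) features;  y : (n,) ordinal labels in {0, ..., K-1}
    lam    : strength of the shrinkage toward proportional odds
    """
    n = X.shape[0]
    # cumulative probabilities P(y <= j | x) for j = 0, ..., K-2
    # (the sign convention here is an illustrative choice)
    cum = 1.0 / (1.0 + np.exp(-(alphas[None, :] + X @ betas.T)))   # (n, K-1)
    cum = np.hstack([np.zeros((n, 1)), cum, np.ones((n, 1))])      # pad with 0 and 1
    # category probabilities as differences of adjacent cumulative probabilities
    probs = np.diff(cum, axis=1)                                   # (n, K)
    nll = -np.sum(np.log(np.clip(probs[np.arange(n), y], 1e-12, None)))
    # L1 penalty on differences between adjacent boundaries' weights:
    # driving these differences to zero recovers the proportional odds model
    fusion = lam * np.sum(np.abs(np.diff(betas, axis=0)))
    return nll + fusion
```

In this reading, minimizing the objective over (alphas, betas) trades off fit against closeness to proportional odds, with lam controlling how much strength the rare transitions borrow from the abundant ones; with two categories the penalty vanishes and the objective reduces to ordinary logistic regression.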
Related papers
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702] (2024-06-05)
  Quantile regression is a leading approach for obtaining prediction intervals via the empirical estimation of quantiles in the distribution of outputs.
  We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile-regression-based interval construction that removes the constraint tying the interval endpoints to fixed quantile levels.
  We demonstrate that this added flexibility results in intervals with improved desirable qualities. (A minimal sketch of the standard quantile-regression construction appears after this list.)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718] (2024-03-11)
  Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
  Recent investigations have revealed that supervised contrastive learning shows promise in alleviating this data imbalance.
  We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
- Nonparametric logistic regression with deep learning [1.2509746979383698] (2024-01-23)
  In nonparametric logistic regression, the Kullback-Leibler divergence can easily diverge.
  Instead of analyzing the excess risk itself, it suffices to show the consistency of the maximum likelihood estimator.
  As an important application, we derive the convergence rates of the nonparametric maximum likelihood estimator (NPMLE) with deep neural networks.
- Uncertainty Voting Ensemble for Imbalanced Deep Regression [20.176217123752465] (2023-05-24)
  In this paper, we introduce UVOTE, a method for learning from imbalanced data.
  We replace traditional regression losses with a negative log-likelihood loss, which also provides sample-wise aleatoric uncertainty estimates.
  We show that UVOTE consistently outperforms the prior art while producing better-calibrated uncertainty estimates.
- Benign-Overfitting in Conditional Average Treatment Effect Prediction with Linear Regression [14.493176427999028] (2022-02-10)
  We study benign-overfitting theory in the prediction of the conditional average treatment effect (CATE) with linear regression models.
  We show that the T-learner fails to achieve consistency except under random assignment, while the risk of the IPW-learner converges to zero if the propensity score is known.
- Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937] (2021-06-07)
  We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
  Our method is robust, works out of the box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively against existing approaches.
- DropLoss for Long-Tail Instance Segmentation [56.162929199998075] (2021-04-13)
  We develop DropLoss, a novel adaptive loss to compensate for the imbalance between rare and frequent categories.
  We show state-of-the-art mAP across rare, common, and frequent categories on the LVIS dataset.
- Distributionally Robust Parametric Maximum Likelihood Estimation [13.09499764232737] (2020-10-11)
  We propose a distributionally robust maximum likelihood estimator that minimizes the worst-case expected log-loss uniformly over a parametric nominal distribution.
  Our robust estimator also enjoys statistical consistency and delivers promising empirical results in both regression and classification tasks.
- Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182] (2020-03-10)
  We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
  We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
- Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043] (2020-02-17)
  We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
  Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
  In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
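As background for the Relaxed Quantile Regression entry above, here is a minimal sketch of how standard quantile regression builds prediction intervals from the pinball loss. It illustrates the conventional fixed-quantile construction that RQR relaxes, not RQR itself; the function name and the toy data are placeholders.

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Pinball (quantile) loss: its empirical minimizer over q estimates the
    tau-th quantile of y (conditionally, when q depends on features)."""
    diff = y - q
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

# Conventional 90% interval: fit one predictor at tau = 0.05 for the lower
# bound and one at tau = 0.95 for the upper bound, each by minimizing the
# pinball loss at that fixed level; the interval is [lower(x), upper(x)].
rng = np.random.default_rng(0)
y = rng.normal(size=1000)
print(pinball_loss(y, np.quantile(y, 0.05), tau=0.05))  # small: correct quantile
print(pinball_loss(y, np.quantile(y, 0.50), tau=0.05))  # larger: wrong quantile
```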