Neural Pseudo-Label Optimism for the Bank Loan Problem
- URL: http://arxiv.org/abs/2112.02185v1
- Date: Fri, 3 Dec 2021 22:46:31 GMT
- Title: Neural Pseudo-Label Optimism for the Bank Loan Problem
- Authors: Aldo Pacchiano, Shaun Singh, Edward Chou, Alexander C. Berg, Jakob Foerster
- Abstract summary: We study a class of classification problems best exemplified by the "bank loan" problem.
In the case of linear models, this issue can be addressed by adding optimism directly into the model predictions.
We present Pseudo-Label Optimism (PLOT), a conceptually and computationally simple method for this setting applicable to Deep Neural Networks.
- Score: 78.66533961716728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a class of classification problems best exemplified by the
"bank loan" problem, where a lender decides whether or not to issue a loan.
The lender only observes whether a customer will repay a loan if the loan is
issued to begin with, and thus modeled decisions affect what data is available
to the lender for future decisions. As a result, it is possible for the
lender's algorithm to "get stuck" with a self-fulfilling model: the model
never corrects its false negatives, since it never sees the true label for
rejected data, and thus accumulates infinite regret. In the case of linear
models, this issue can be addressed by adding optimism directly into the model
predictions. However, few methods extend to the function approximation case
using Deep Neural Networks. We present Pseudo-Label Optimism (PLOT), a
conceptually and computationally simple method for this setting applicable to
DNNs. PLOT adds an optimistic label to the subset of decision points the
current model is deciding on, trains the model on all data so far (including
these points along with their optimistic labels), and finally uses the
resulting optimistic model for decision making. PLOT achieves competitive
performance on a set of three challenging benchmark problems while requiring
minimal hyperparameter tuning. We also show that PLOT satisfies a logarithmic
regret guarantee under a Lipschitz and logistic mean label model and under a
separability condition on the data.
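The abstract fully specifies the PLOT loop (optimistically pseudo-label the pending decision points, retrain on all data, decide with the optimistic model), so a compact sketch is possible. The following is a minimal illustration under assumed ingredients that are not from the paper: a synthetic linear repayment rule, a small scikit-learn MLP standing in for the DNN, and one applicant decided per round.

```python
# Minimal sketch of the PLOT loop as described in the abstract (not the
# authors' code). All specifics below are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def true_repays(X):
    # Hypothetical ground truth: repayment follows a noiseless linear score.
    return (X @ np.array([1.5, -1.0]) + 0.2 > 0).astype(int)

X_stream = rng.normal(size=(200, 2))   # applicants arriving online
X_seen, y_seen = [], []                # labels exist only for issued loans

for x in X_stream:
    x_row = x.reshape(1, -1)
    if len(y_seen) < 10 or len(set(y_seen)) < 2:
        accept = True  # warm start: accept until both labels have been observed
    else:
        # PLOT step 1: attach an optimistic label (repay = 1) to the point
        # the current model is deciding on.
        X_opt = np.vstack(X_seen + [x])
        y_opt = np.array(y_seen + [1])
        # PLOT step 2: train on all data so far plus the pseudo-labeled point.
        model = MLPClassifier(hidden_layer_sizes=(8,), max_iter=300,
                              random_state=0).fit(X_opt, y_opt)
        # PLOT step 3: decide with the resulting optimistic model.
        accept = model.predict_proba(x_row)[0, 1] >= 0.5
    if accept:
        # One-sided feedback: the true label is revealed only on acceptance.
        X_seen.append(x)
        y_seen.append(int(true_repays(x_row)[0]))
```

The optimistic pseudo-label is what breaks the self-fulfilling loop: a greedy model that rejects a region never sees labels there again, whereas retraining with the point labeled positive accepts exactly those points where a positive label remains plausible, so false negatives get a chance to be corrected.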
Related papers
- Beyond Closure Models: Learning Chaotic-Systems via Physics-Informed Neural Operators [78.64101336150419]
Predicting the long-term behavior of chaotic systems is crucial for various applications such as climate modeling.
An alternative approach to such a fully-resolved simulation is to use a coarse grid and then correct its errors through a temporal model.
We propose an alternative end-to-end learning approach using a physics-informed neural operator (PINO) that overcomes this limitation.
arXiv Detail & Related papers (2024-08-09T17:05:45Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Dirichlet-Based Prediction Calibration for Learning with Noisy Labels [40.78497779769083]
Learning with noisy labels can significantly hinder the generalization performance of deep neural networks (DNNs).
Existing approaches address this issue through loss correction or example selection methods.
We propose the Dirichlet-based Prediction Calibration (DPC) method as a solution.
arXiv Detail & Related papers (2024-01-13T12:33:04Z)
- A Pseudo-Semantic Loss for Autoregressive Models with Logical Constraints [87.08677547257733]
Neuro-symbolic AI bridges the gap between purely symbolic and neural approaches to learning.
We show how to maximize the likelihood of a symbolic constraint w.r.t. the neural network's output distribution.
We also evaluate our approach on Sudoku and shortest-path prediction cast as autoregressive generation.
arXiv Detail & Related papers (2023-12-06T20:58:07Z)
- Pseudo Label Selection is a Decision Problem [0.0]
Pseudo-labeling is a simple and effective approach to semi-supervised learning.
It requires criteria that guide the selection of pseudo-labeled data.
Choosing instances with overconfident but wrong predictions propagates their errors into the final model; a minimal sketch of this selection step appears after this list.
arXiv Detail & Related papers (2023-09-25T07:48:02Z)
- Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
- How to Learn when Data Reacts to Your Model: Performative Gradient Descent [10.074466859579571]
We introduce performative gradient descent (PerfGD), the first algorithm that converges to the performatively optimal point.
PerfGD explicitly captures how changes in the model affect the data distribution and is simple to use.
arXiv Detail & Related papers (2021-02-15T17:49:36Z)
- Bayes DistNet -- A Robust Neural Network for Algorithm Runtime Distribution Predictions [1.8275108630751844]
Randomized algorithms are used in many state-of-the-art solvers for constraint satisfaction problems (CSP) and Boolean satisfiability (SAT) problems.
Previous state-of-the-art methods directly try to predict a fixed parametric distribution that the input instance follows.
This new model achieves robust predictive performance in the low-observation setting and also handles censored observations.
arXiv Detail & Related papers (2020-12-14T01:15:39Z)
- How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking [70.92463223410225]
DiffMask learns to mask-out subsets of the input while maintaining differentiability.
The decision to include or disregard an input token is made with a simple model based on intermediate hidden layers.
This lets us not only plot attribution heatmaps but also analyze how decisions are formed across network layers.
arXiv Detail & Related papers (2020-04-30T17:36:14Z)
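As flagged in the "Pseudo Label Selection is a Decision Problem" entry above, here is a minimal, hedged sketch of the standard confidence-threshold criterion that pseudo-labeling pipelines commonly use; the dataset, model, and 0.95 threshold are illustrative assumptions, not that paper's decision-theoretic method.

```python
# Hedged sketch of confidence-based pseudo-label selection (a standard
# baseline criterion, not the cited paper's method).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X_lab = rng.normal(size=(50, 2))
y_lab = (X_lab[:, 0] > 0).astype(int)      # small labeled set
X_unlab = rng.normal(size=(500, 2))        # large unlabeled pool

model = LogisticRegression().fit(X_lab, y_lab)
proba = model.predict_proba(X_unlab)
conf = proba.max(axis=1)

# Keep only high-confidence predictions as pseudo-labels. The failure mode
# noted above: overconfident-but-wrong instances pass this filter, and their
# errors are then baked into the retrained model.
keep = conf >= 0.95
X_aug = np.vstack([X_lab, X_unlab[keep]])
y_aug = np.concatenate([y_lab, proba[keep].argmax(axis=1)])
model = LogisticRegression().fit(X_aug, y_aug)
```

Raising the threshold trades coverage for purity, but it cannot screen out confidently wrong predictions, which is precisely the overfitting-propagation risk the entry warns about.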
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.