Gumbel-Softmax Selective Networks
- URL: http://arxiv.org/abs/2211.10564v1
- Date: Sat, 19 Nov 2022 02:20:14 GMT
- Title: Gumbel-Softmax Selective Networks
- Authors: Mahmoud Salem, Mohamed Osama Ahmed, Frederick Tung and Gabriel Oliveira
- Abstract summary: This paper presents a general method for training selective networks that enables selection within an end-to-end differentiable training framework.
Experiments on public datasets demonstrate the potential of Gumbel-softmax selective networks for selective regression and classification.
- Score: 10.074545631396385
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: ML models often operate within the context of a larger system that can adapt
its response when the ML model is uncertain, such as falling back on safe
defaults or a human in the loop. This commonly encountered operational context
calls for principled techniques for training ML models with the option to
abstain from predicting when uncertain. Selective neural networks are trained
with an integrated option to abstain, allowing them to learn to recognize and
optimize for the subset of the data distribution for which confident
predictions can be made. However, optimizing selective networks is challenging
due to the non-differentiability of the binary selection function (the discrete
decision of whether to predict or abstain). This paper presents a general
method for training selective networks that leverages the Gumbel-softmax
reparameterization trick to enable selection within an end-to-end
differentiable training framework. Experiments on public datasets demonstrate
the potential of Gumbel-softmax selective networks for selective regression and
classification.
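As a rough illustration of the approach described in the abstract (not the authors' released code), the sketch below pairs a small PyTorch regression head with a two-way selection head whose predict/abstain decision is drawn with the Gumbel-softmax relaxation, so gradients flow through the selection. The module layout, loss form, and coverage penalty are illustrative assumptions.

```python
# A minimal sketch, assuming a regression setup; not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GumbelSelectiveNet(nn.Module):
    """Prediction head plus a Gumbel-softmax relaxed predict/abstain head (illustrative)."""
    def __init__(self, in_dim, hidden_dim=64, tau=1.0):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.predictor = nn.Linear(hidden_dim, 1)   # regression output
        self.selector = nn.Linear(hidden_dim, 2)    # logits over {abstain, predict}
        self.tau = tau                              # Gumbel-softmax temperature

    def forward(self, x, hard=False):
        h = self.backbone(x)
        y_hat = self.predictor(h).squeeze(-1)
        # Relaxed one-hot sample over {abstain, predict}; differentiable w.r.t. the selector logits.
        sel = F.gumbel_softmax(self.selector(h), tau=self.tau, hard=hard)
        return y_hat, sel[..., 1]                   # second component = soft "predict" decision

def selective_loss(y_hat, y, select_prob, coverage_weight=0.5):
    # Per-example loss weighted by the (relaxed) selection decision, plus a simple
    # penalty that discourages abstaining on everything. The paper's exact objective
    # may differ; this is only a plausible form.
    selected_risk = (select_prob * (y_hat - y) ** 2).mean()
    coverage = select_prob.mean()
    return selected_risk + coverage_weight * (1.0 - coverage)
```

With hard=True the sampled decision is discrete in the forward pass while gradients still flow through the relaxed sample (straight-through), which mirrors the discrete predict/abstain behaviour used at test time.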
Related papers
- Model Agnostic Explainable Selective Regression via Uncertainty Estimation [15.331332191290727]
This paper presents a novel approach to selective regression that utilizes model-agnostic non-parametric uncertainty estimation.
Our proposed framework showcases superior performance compared to state-of-the-art selective regressors.
We implement our selective regression method in the open-source Python package doubt and release the code used to reproduce our experiments (see the generic uncertainty-thresholding sketch after this list).
arXiv Detail & Related papers (2023-11-15T17:40:48Z)
- A Neural Network Based Choice Model for Assortment Optimization [5.173001988341294]
We investigate whether a single neural network architecture can predict purchase probabilities for datasets from various contexts.
Next, we develop an assortment optimization formulation that is solvable by off-the-shelf integer programming solvers.
arXiv Detail & Related papers (2023-08-10T15:01:52Z)
- Sparse-Input Neural Network using Group Concave Regularization [10.103025766129006]
Simultaneous feature selection and non-linear function estimation are challenging in neural networks.
We propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings.
arXiv Detail & Related papers (2023-07-01T13:47:09Z)
- Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- MILO: Model-Agnostic Subset Selection Framework for Efficient Model Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training.
Our empirical results indicate that MILO can train models $3\times$-$10\times$ faster and tune hyperparameters $20\times$-$75\times$ faster than full-dataset training or tuning, without compromising performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z)
- Neural Networks beyond explainability: Selective inference for sequence motifs [5.620334754517149]
We introduce SEISM, a selective inference procedure to test the association between extracted features and the predicted phenotype.
We adapt existing sampling-based selective inference procedures by quantizing this selection over an infinite set to a large but finite grid.
We show that sampling under a specific choice of parameters is sufficient to characterize the composite null hypothesis typically used for selective inference.
arXiv Detail & Related papers (2022-12-23T10:49:07Z)
- Data-Driven Offline Decision-Making via Invariant Representation Learning [97.49309949598505]
Offline data-driven decision-making involves synthesizing optimized decisions with no active interaction.
A key challenge is distributional shift: when we optimize with respect to the input into a model trained from offline data, it is easy to produce an out-of-distribution (OOD) input that appears erroneously good.
In this paper, we formulate offline data-driven decision-making as domain adaptation, where the goal is to make accurate predictions for the value of optimized decisions.
arXiv Detail & Related papers (2022-11-21T11:01:37Z)
- Correcting Model Bias with Sparse Implicit Processes [0.9187159782788579]
We show that Sparse Implicit Processes (SIP) is capable of correcting model bias when the data generating mechanism differs strongly from the one implied by the model.
We use synthetic datasets to show that SIP is capable of providing predictive distributions that reflect the data better than the exact predictions of the initial, but wrongly assumed model.
arXiv Detail & Related papers (2022-07-21T18:00:01Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
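Several entries above (the model-agnostic selective regressor and ASPEST in particular) abstain by thresholding an uncertainty estimate. The sketch below shows one generic way to do this for one-dimensional regression with a bootstrap ensemble of linear fits; it does not use the doubt package's API, and the ensemble size, threshold, and function names are assumptions made for illustration.

```python
# A minimal sketch of abstention by uncertainty thresholding; all names are illustrative.
import numpy as np

def bootstrap_predict(x_train, y_train, x_test, n_boot=50, seed=0):
    """Fit linear models on bootstrap resamples and return per-model predictions."""
    rng = np.random.default_rng(seed)
    n = len(x_train)
    preds = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)                 # resample with replacement
        coef = np.polyfit(x_train[idx], y_train[idx], deg=1)
        preds.append(np.polyval(coef, x_test))
    return np.stack(preds)                               # shape (n_boot, len(x_test))

def selective_regression(x_train, y_train, x_test, max_std=0.5):
    preds = bootstrap_predict(x_train, y_train, x_test)
    mean, std = preds.mean(axis=0), preds.std(axis=0)
    accept = std <= max_std                              # abstain where the ensemble disagrees
    return mean, accept
```

Predictions are only reported where accept is True; tightening max_std trades coverage for accuracy, which is the same coverage/risk trade-off that selective networks optimize end to end.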
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.