Loss Function Entropy Regularization for Diverse Decision Boundaries
- URL: http://arxiv.org/abs/2205.00224v1
- Date: Sat, 30 Apr 2022 10:16:41 GMT
- Title: Loss Function Entropy Regularization for Diverse Decision Boundaries
- Authors: Chong Sue Sin
- Abstract summary: Loss Function Entropy Regularization (LFER) consists of regularization terms added to the pre-training and contrastive learning objective functions.
We show that LFER can produce an ensemble whose members each achieve accuracy comparable to the state of the art, yet have varied latent decision boundaries.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Is it possible to train several classifiers to perform meaningful
crowd-sourcing and produce a better prediction label set without any
ground-truth annotation? In this paper, we modify the contrastive learning
objectives to automatically train a self-complementing ensemble that produces
state-of-the-art predictions on the CIFAR10 and CIFAR100-20 tasks. We present
a remarkably simple method that modifies a single unsupervised classification
pipeline to automatically generate an ensemble of neural networks with varied
decision boundaries, which together learn a larger feature set of classes.
Loss Function Entropy Regularization (LFER) consists of regularization terms
added to the pre-training and contrastive learning objective functions; it
gives us a gear to modify the entropy state of the output space of
unsupervised learning, thereby diversifying the latent representation of the
networks' decision boundaries. Ensembles trained with LFER achieve higher
prediction accuracy for samples near decision boundaries. LFER is an
effective gear for perturbing decision boundaries and has proven able to
produce classifiers that beat the state of the art at the contrastive
learning stage. Experiments show that LFER can produce an ensemble whose
members each reach accuracy comparable to the state of the art, yet have
varied latent decision boundaries. In essence, it allows us to perform
meaningful verification for samples near decision boundaries, encouraging
their correct classification. By compounding the probability of correct
prediction of a single sample across the trained ensemble, our method
improves upon a single classifier by denoising and affirming correct feature
mappings.
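The exact form of the LFER terms is given in the paper itself; the sketch below is only a hedged, non-authoritative illustration of the general pattern described in the abstract: an output-space entropy term added to an existing pre-training/contrastive objective, followed by compounding the ensemble members' predictions. The function names and the coefficient `alpha` are placeholders, not the paper's notation.

```python
# Illustrative sketch only -- the precise LFER terms are defined in the paper.
import torch
import torch.nn.functional as F


def output_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Mean per-sample Shannon entropy of the softmax output distribution."""
    log_p = F.log_softmax(logits, dim=1)
    return -(log_p.exp() * log_p).sum(dim=1).mean()


def lfer_style_loss(base_loss: torch.Tensor, logits: torch.Tensor, alpha: float) -> torch.Tensor:
    """Hypothetical combined objective: the existing pre-training/contrastive
    loss plus an entropy term on the output space. The sign and magnitude of
    `alpha` act as the 'gear' that sharpens or flattens the output
    distribution, perturbing each member's decision boundaries."""
    return base_loss + alpha * output_entropy(logits)


def ensemble_verify(member_logits: list[torch.Tensor]) -> torch.Tensor:
    """Compound the members' predictions by averaging their softmax outputs,
    then take the argmax as the ensemble's label for each sample."""
    probs = torch.stack([F.softmax(l, dim=1) for l in member_logits]).mean(dim=0)
    return probs.argmax(dim=1)
```

Training each member with a different (or differently signed) entropy coefficient is one plausible way to obtain the varied latent decision boundaries the abstract describes.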
Related papers
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z)
- Multi-Head Multi-Loss Model Calibration [13.841172927454204]
We introduce a form of simplified ensembling that bypasses the costly training and inference of deep ensembles.
Specifically, each head is trained to minimize a weighted Cross-Entropy loss, but the weights are different among the different branches.
We show that the resulting averaged predictions can achieve excellent calibration without sacrificing accuracy in two challenging datasets.
arXiv Detail & Related papers (2023-03-02T09:32:32Z) - Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient
for Out-of-Distribution Generalization [52.7137956951533]
We argue that devising simpler methods for learning predictors on existing features is a promising direction for future research.
We introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift.
Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions.
arXiv Detail & Related papers (2022-02-14T16:42:16Z) - Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose textitPrototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements compared with the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- Self-Supervised Learning by Estimating Twin Class Distributions [26.7828253129684]
We present TWIST, a novel self-supervised representation learning method by classifying large-scale unlabeled datasets in an end-to-end way.
We employ a siamese network terminated by a softmax operation to produce twin class distributions of two augmented images.
Specifically, the entropy of each sample's distribution is minimized to sharpen its class prediction, while the entropy of the mean distribution is maximized so that predictions stay diverse across samples (a minimal sketch of these two entropy terms appears after this list).
arXiv Detail & Related papers (2021-10-14T14:39:39Z)
- Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
- Deep Ordinal Regression with Label Diversity [19.89482062012177]
We propose that using several discrete data representations simultaneously can improve neural network learning.
Our approach is end-to-end differentiable and can be added as a simple extension to conventional learning methods.
arXiv Detail & Related papers (2020-06-29T08:23:43Z)
- Auto-Ensemble: An Adaptive Learning Rate Scheduling based Deep Learning Model Ensembling [11.324407834445422]
This paper proposes Auto-Ensemble (AE) to collect checkpoints of deep learning model and ensemble them automatically.
The advantage of this method is that it makes the model converge to various local optima by scheduling the learning rate within a single training run.
arXiv Detail & Related papers (2020-03-25T08:17:31Z)
- Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
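As a footnote to the TWIST entry above: its summarized objective (minimize each sample's prediction entropy, maximize the entropy of the batch-mean prediction) can be sketched as follows. This is a minimal illustration under assumed names; it covers only the two entropy terms mentioned in the summary and is not the authors' code.

```python
import torch
import torch.nn.functional as F


def twist_style_entropy_terms(logits_a: torch.Tensor, logits_b: torch.Tensor) -> torch.Tensor:
    """Entropy terms for twin class distributions of two augmented views:
    sharpen each sample's prediction, keep the batch-mean prediction diverse."""
    p_a, p_b = F.softmax(logits_a, dim=1), F.softmax(logits_b, dim=1)

    def entropy(p: torch.Tensor) -> torch.Tensor:
        return -(p * p.clamp_min(1e-8).log()).sum(dim=-1)

    per_sample = entropy(p_a).mean() + entropy(p_b).mean()           # to be minimized
    mean_dist = entropy(p_a.mean(dim=0)) + entropy(p_b.mean(dim=0))  # to be maximized
    return per_sample - mean_dist  # lower is better for an optimizer
```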