Loss Function Entropy Regularization for Diverse Decision Boundaries
- URL: http://arxiv.org/abs/2205.00224v1
- Date: Sat, 30 Apr 2022 10:16:41 GMT
- Title: Loss Function Entropy Regularization for Diverse Decision Boundaries
- Authors: Chong Sue Sin
- Abstract summary: Loss Function Entropy Regularization (LFER), are regularization terms to be added upon the pre-training and contrastive learning objective functions.
We show that LFER can produce an ensemble where each have accuracy comparable to the state-of-the-art, yet have varied latent decision boundaries.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Is it possible to train several classifiers to perform meaningful
crowd-sourcing to produce a better prediction label set without any
ground-truth annotation? In this paper, we will attempt to modify the
contrastive learning objectives to automatically train a self-complementing
ensemble to produce a state-of-the-art prediction on the CIFAR10 and
CIFAR100-20 task. This paper will present a remarkably simple method to modify
a single unsupervised classification pipeline to automatically generate an
ensemble of neural networks with varied decision boundaries to learn a larger
feature set of classes. Loss Function Entropy Regularization (LFER), are
regularization terms to be added upon the pre-training and contrastive learning
objective functions, gives us a gear to modify the entropy state of the output
space of unsupervised learning, thereby diversifying the latent representation
of decision boundaries of neural networks. Ensemble trained with LFER have
higher successful prediction accuracy for samples near decision boundaries.
LFER is a effective gear to perturb decision boundaries, and has proven to be
able to produce classifiers that beat state-of-the-art at contrastive learning
stage. Experiments show that LFER can produce an ensemble where each have
accuracy comparable to the state-of-the-art, yet have each have varied latent
decision boundaries. It allows us to essence perform meaningful verification
for samples near decision boundaries, encouraging correct classification of
near-boundary samples. By compounding the probability of correct prediction of
a single sample amongst an ensemble of neural network trained, our method is
able to improve upon a single classifier by denoising and affirming correct
feature mappings.
Related papers
- Online Bayesian Imbalanced Learning with Bregman-Calibrated Deep Networks [0.7106986689736825]
We present textitOnline Bayesian Imbalanced Learning (OBIL), a principled framework that decouples likelihood-ratio estimation from class-prior assumptions.<n>Our approach builds on the established connection between Bregman divergences and proper scoring rules to show that deep networks trained with such losses produce posterior probability estimates.<n>We prove that these likelihood-ratio estimates remain valid under arbitrary changes in class priors and cost structures, requiring only a threshold adjustment for optimal Bayes decisions.
arXiv Detail & Related papers (2026-02-08T21:23:00Z) - Toward Reliable Machine Unlearning: Theory, Algorithms, and Evaluation [1.7767466724342065]
We introduce Adrial Machine UNlearning (AMUN), which surpasses prior state-of-the-art methods for image classification based on SOTA MIA scores.<n>We show that existing methods fail in replicating a retrained model's behavior by introducing a nearest-neighbor membership inference attack (MIA-NN)<n>We then propose a fine-tuning objective that mitigates this leakage by approximating, for forget-class inputs, the distribution over remaining classes that a model retrained from scratch would produce.
arXiv Detail & Related papers (2025-12-07T20:57:25Z) - LEC: Linear Expectation Constraints for False-Discovery Control in Selective Prediction and Routing Systems [95.35293543918762]
Large language models (LLMs) often generate unreliable answers, while uncertainty methods fail to fully distinguish correct from incorrect predictions.<n>We address this issue through the lens of false discovery rate (FDR) control, ensuring that among all accepted predictions, the proportion of errors does not exceed a target risk level.<n>We propose LEC, which reinterprets selective prediction as a constrained decision problem by enforcing a Linear Expectation Constraint.
arXiv Detail & Related papers (2025-12-01T11:27:09Z) - What Does It Take to Build a Performant Selective Classifier? [30.90225954725644]
Bayes noise, approximation error, ranking error, statistical noise, and implementation- or shift-induced slack are studied.<n>We validate our decomposition on synthetic two-moons data and on real-world vision and language benchmarks.<n>Our results confirm that (i) Bayes noise and limited model capacity can account for substantial gaps, (ii) only richer, feature-aware calibrators meaningfully improve score ordering, and (iii) data shift introduces a separate slack that demands distributionally robust training.
arXiv Detail & Related papers (2025-10-23T05:48:40Z) - LoRA-Ensemble: Efficient Uncertainty Modelling for Self-Attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient ensembling method for self-attention networks.<n>The method not only outperforms state-of-the-art implicit techniques like BatchEnsemble, but even matches or exceeds the accuracy of an Explicit Ensemble.
arXiv Detail & Related papers (2024-05-23T11:10:32Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely textbfSelf-textbfReinforcing textbfErrors textbfMitigation (SREM)
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z) - Multi-Head Multi-Loss Model Calibration [13.841172927454204]
We introduce a form of simplified ensembling that bypasses the costly training and inference of deep ensembles.
Specifically, each head is trained to minimize a weighted Cross-Entropy loss, but the weights are different among the different branches.
We show that the resulting averaged predictions can achieve excellent calibration without sacrificing accuracy in two challenging datasets.
arXiv Detail & Related papers (2023-03-02T09:32:32Z) - Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient
for Out-of-Distribution Generalization [52.7137956951533]
We argue that devising simpler methods for learning predictors on existing features is a promising direction for future research.
We introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift.
Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions.
arXiv Detail & Related papers (2022-02-14T16:42:16Z) - Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose textitPrototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substaintial improvements compared with state of the arts.
arXiv Detail & Related papers (2021-10-22T01:55:01Z) - Self-Supervised Learning by Estimating Twin Class Distributions [26.7828253129684]
We present TWIST, a novel self-supervised representation learning method by classifying large-scale unlabeled datasets in an end-to-end way.
We employ a siamese network terminated by a softmax operation to produce twin class distributions of two augmented images.
Specifically, we minimize the entropy of the distribution for each sample to make the class prediction for each sample and maximize the entropy of the mean distribution to make the predictions of different samples diverse.
arXiv Detail & Related papers (2021-10-14T14:39:39Z) - Discriminative Nearest Neighbor Few-Shot Intent Detection by
Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z) - Deep Ordinal Regression with Label Diversity [19.89482062012177]
We propose that using several discrete data representations simultaneously can improve neural network learning.
Our approach is end-to-end differentiable and can be added as a simple extension to conventional learning methods.
arXiv Detail & Related papers (2020-06-29T08:23:43Z) - Auto-Ensemble: An Adaptive Learning Rate Scheduling based Deep Learning
Model Ensembling [11.324407834445422]
This paper proposes Auto-Ensemble (AE) to collect checkpoints of deep learning model and ensemble them automatically.
The advantage of this method is to make the model converge to various local optima by scheduling the learning rate in once training.
arXiv Detail & Related papers (2020-03-25T08:17:31Z) - Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.