LABO: Towards Learning Optimal Label Regularization via Bi-level
Optimization
- URL: http://arxiv.org/abs/2305.04971v1
- Date: Mon, 8 May 2023 18:04:18 GMT
- Title: LABO: Towards Learning Optimal Label Regularization via Bi-level
Optimization
- Authors: Peng Lu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Philippe
Langlais
- Abstract summary: Regularization techniques are crucial to improving the generalization performance and training efficiency of deep neural networks.
We present a general framework for training with label regularization, which includes conventional LS but can also model instance-specific variants.
We propose an efficient way of learning LAbel regularization by devising a Bi-level Optimization (LABO) problem.
- Score: 25.188067240126422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regularization techniques are crucial to improving the generalization
performance and training efficiency of deep neural networks. Many deep learning
algorithms rely on weight decay, dropout, and batch/layer normalization to converge
faster and generalize. Label Smoothing (LS) is another simple, versatile and
efficient regularization which can be applied to various supervised
classification tasks. Conventional LS, however, assumes that each non-target class is
equally likely, regardless of the training instance. In this work, we
present a general framework for training with label regularization, which
includes conventional LS but can also model instance-specific variants. Based
on this formulation, we propose an efficient way of learning LAbel
regularization by devising a Bi-level Optimization (LABO) problem. We derive a
deterministic and interpretable solution of the inner loop as the optimal label
smoothing without the need to store the parameters or the output of a trained
model. Finally, we conduct extensive experiments and demonstrate our LABO
consistently yields improvement over conventional label regularization on
various fields, including seven machine translation and three image
classification tasks across various neural network architectures.
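The contrast between the uniform assumption of conventional LS and the instance-specific regularization the framework generalizes can be made concrete with a short sketch. The snippet below is illustrative only and is not the authors' implementation; it assumes a PyTorch setting, and the function names and the `smoothing_dist` argument are hypothetical.

```python
# Minimal sketch (assumed PyTorch setting, not the paper's code): conventional
# uniform label smoothing vs. a generalized, instance-specific smoothing
# distribution of the kind LABO learns.
import torch
import torch.nn.functional as F

def conventional_ls_loss(logits, target, eps=0.1):
    """Cross-entropy against a smoothed target: the true class gets 1 - eps and
    every non-target class gets the same mass eps / (K - 1), regardless of the
    training instance (one common LS convention)."""
    num_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    smoothed = torch.full_like(log_probs, eps / (num_classes - 1))
    smoothed.scatter_(-1, target.unsqueeze(-1), 1.0 - eps)
    return -(smoothed * log_probs).sum(dim=-1).mean()

def instance_specific_ls_loss(logits, smoothing_dist):
    """Generalized label regularization: `smoothing_dist` is a per-instance
    target distribution (shape [batch, K]) instead of the fixed uniform one
    above."""
    log_probs = F.log_softmax(logits, dim=-1)
    return -(smoothing_dist * log_probs).sum(dim=-1).mean()
```

Per the abstract, LABO obtains such an instance-specific distribution as a deterministic, interpretable solution of the inner loop of a bi-level optimization problem, without storing the parameters or outputs of a separately trained model.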
Related papers
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - L-TUNING: Synchronized Label Tuning for Prompt and Prefix in LLMs [0.0]
This paper introduces L-Tuning, an efficient fine-tuning approach for classification tasks within the Natural Language Inference (NLI) framework.
L-Tuning focuses on fine-tuning label tokens processed through a pre-trained Large Language Model (LLM).
Our experimental results indicate a significant improvement in training efficiency and classification accuracy with L-Tuning compared to traditional approaches.
arXiv Detail & Related papers (2023-12-21T01:47:49Z) - CLIPood: Generalizing CLIP to Out-of-Distributions [73.86353105017076]
Contrastive language-image pre-training (CLIP) models have shown impressive zero-shot ability, but further adaptation of CLIP to downstream tasks undesirably degrades out-of-distribution (OOD) performance.
We propose CLIPood, a fine-tuning method that can adapt CLIP models to OOD situations where both domain shifts and open classes may occur on unseen test data.
Experiments on diverse datasets with different OOD scenarios show that CLIPood consistently outperforms existing generalization techniques.
arXiv Detail & Related papers (2023-02-02T04:27:54Z) - Class Adaptive Network Calibration [19.80805957502909]
We propose Class Adaptive Label Smoothing (CALS) for calibrating deep networks.
Our method builds on a general Augmented Lagrangian approach, a well-established technique in constrained optimization.
arXiv Detail & Related papers (2022-11-28T06:05:31Z) - Evolving Multi-Label Fuzzy Classifier [5.53329677986653]
Multi-label classification, which assigns a single sample to more than one class at the same time, has attracted much attention in the machine learning community.
We propose an evolving multi-label fuzzy classifier (EFC-ML) which is able to self-adapt and self-evolve its structure with new incoming multi-label samples in an incremental, single-pass manner.
arXiv Detail & Related papers (2022-03-29T08:01:03Z) - A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv Detail & Related papers (2022-02-08T19:18:49Z) - Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose Dash, a semi-supervised learning (SSL) approach that selects which unlabeled examples to train on using a dynamically adjusted threshold, making the selection of unlabeled data adaptive over the course of training (a minimal sketch of this thresholding idea appears after this list).
arXiv Detail & Related papers (2021-09-01T23:52:29Z) - PLM: Partial Label Masking for Imbalanced Multi-label Classification [59.68444804243782]
Neural networks trained on real-world datasets with long-tailed label distributions are biased towards frequent classes and perform poorly on infrequent classes.
We propose a method, Partial Label Masking (PLM), which stochastically masks labels during training to balance the ratio of positive to negative labels for each class.
Our method achieves strong performance when compared to existing methods on both multi-label (MultiMNIST and MSCOCO) and single-label (imbalanced CIFAR-10 and CIFAR-100) image classification datasets.
arXiv Detail & Related papers (2021-05-22T18:07:56Z) - Stochastic batch size for adaptive regularization in deep network
optimization [63.68104397173262]
We propose a first-order optimization algorithm that incorporates adaptive regularization and is applicable to machine learning problems in the deep learning framework.
We empirically demonstrate the effectiveness of our algorithm using an image classification task based on conventional network models applied to commonly used benchmark datasets.
arXiv Detail & Related papers (2020-04-14T07:54:53Z) - Exemplar Normalization for Learning Deep Representation [34.42934843556172]
This work investigates a novel dynamic learning-to-normalize (L2N) problem by proposing Exemplar Normalization (EN).
EN is able to learn different normalization methods for different convolutional layers and image samples of a deep network.
arXiv Detail & Related papers (2020-03-19T13:23:40Z)
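As a concrete illustration of the dynamic thresholding mentioned in the Dash entry above, the sketch below keeps only unlabeled examples whose pseudo-label loss falls below a threshold that shrinks as training progresses. It is a simplified sketch in the spirit of the idea, not the Dash authors' algorithm; the decay schedule, default values, and names (`select_unlabeled`, `rho_0`, `gamma`) are assumptions for illustration.

```python
# Hedged sketch of dynamic thresholding for unlabeled-data selection: the
# threshold decays over training, so fewer, more confidently fit unlabeled
# examples are kept as the step count increases.
import torch
import torch.nn.functional as F

def select_unlabeled(model, unlabeled_batch, step, rho_0=2.0, gamma=1.01):
    """Keep unlabeled examples whose pseudo-label loss is below a threshold
    rho_0 * gamma ** (-step) that decreases with the training step."""
    threshold = rho_0 * gamma ** (-step)
    with torch.no_grad():
        logits = model(unlabeled_batch)
        pseudo_labels = logits.argmax(dim=-1)
        losses = F.cross_entropy(logits, pseudo_labels, reduction="none")
    keep = losses < threshold
    return unlabeled_batch[keep], pseudo_labels[keep]
```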