Unsupervised Calibration through Prior Adaptation for Text
Classification using Large Language Models
- URL: http://arxiv.org/abs/2307.06713v3
- Date: Wed, 9 Aug 2023 07:40:43 GMT
- Title: Unsupervised Calibration through Prior Adaptation for Text
Classification using Large Language Models
- Authors: Lautaro Estienne, Luciana Ferrer, Matías Vera, Pablo Piantanida
- Abstract summary: We propose an approach to adapt the prior class distribution to perform text classification tasks without the need for labelled samples.
Results show that these methods outperform the un-adapted model for different numbers of training shots in the prompt.
- Score: 37.39843935632105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A wide variety of natural language tasks are currently being addressed with
large-scale language models (LLMs). These models are usually trained with a
very large amount of unsupervised text data and adapted to perform a downstream
natural language task using methods like fine-tuning, calibration or in-context
learning. In this work, we propose an approach to adapt the prior class
distribution to perform text classification tasks without the need for labelled
samples, using only a few in-domain sample queries. The proposed approach treats the
LLM as a black box, adding a stage where the model posteriors are calibrated to
the task. Results show that these methods outperform the un-adapted model for
different numbers of training shots in the prompt, as well as a previous approach
where calibration is performed without using any adaptation data.
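To make the prior-adaptation idea concrete, the sketch below shows one standard way to re-estimate the class prior from a handful of unlabelled in-domain queries and to re-scale black-box posteriors with it (an EM-style update in the spirit of Saerens et al., 2002). This is an illustrative assumption rather than the paper's exact algorithm; the function name and arguments are hypothetical.

```python
import numpy as np

def adapt_prior_and_calibrate(posteriors, model_prior=None, n_iters=100, tol=1e-6):
    """EM-style prior re-estimation from unlabelled queries (illustrative sketch).

    posteriors:  (N, C) array of class posteriors returned by the LLM for N
                 unlabelled in-domain queries.
    model_prior: prior implicitly assumed by the model (uniform if None).
    Returns the estimated in-domain prior and the re-calibrated posteriors.
    """
    n, c = posteriors.shape
    model_prior = np.full(c, 1.0 / c) if model_prior is None else np.asarray(model_prior)
    new_prior = model_prior.copy()
    for _ in range(n_iters):
        # E-step: re-weight each posterior by the ratio of new to old prior.
        w = posteriors * (new_prior / model_prior)
        w /= w.sum(axis=1, keepdims=True)
        # M-step: the updated prior is the average responsibility per class.
        updated = w.mean(axis=0)
        if np.abs(updated - new_prior).max() < tol:
            new_prior = updated
            break
        new_prior = updated
    calibrated = posteriors * (new_prior / model_prior)
    calibrated /= calibrated.sum(axis=1, keepdims=True)
    return new_prior, calibrated
```

The same re-scaling can then be applied to the posteriors of new test queries, treating the LLM purely as a black box that outputs class probabilities.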
Related papers
- Ensembling Finetuned Language Models for Text Classification [55.15643209328513]
Finetuning is a common practice across different communities to adapt pretrained models to particular tasks.
Ensembles of neural networks are typically used to boost performance and provide reliable uncertainty estimates.
We present a metadataset with predictions from five large finetuned models on six datasets and report results of different ensembling strategies.
arXiv Detail & Related papers (2024-10-25T09:15:54Z)
- Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification [34.37262622415682]
We propose a new adaptation framework called Data Adaptive Traceback.
Specifically, we use a zero-shot-based method to extract the subset of the pre-training data that is most relevant to the downstream task.
We adopt a pseudo-label-based semi-supervised technique to reuse the pre-training images and a vision-language contrastive learning method to address the confirmation bias issue in semi-supervised learning.
arXiv Detail & Related papers (2024-07-11T18:01:58Z)
- Irreducible Curriculum for Language Model Pretraining [46.895234111411426]
We propose irreducible curriculum as a curriculum learning algorithm for language model pretraining.
Our experiments on the RedPajama-1B dataset demonstrate a consistent improvement on validation perplexity across all 7 domains.
arXiv Detail & Related papers (2023-10-23T22:41:33Z)
- Preserving Pre-trained Features Helps Calibrate Fine-tuned Language Models [23.881825575095945]
Large pre-trained language models (PLMs) have demonstrated strong performance on natural language understanding (NLU) tasks through fine-tuning.
However, fine-tuned models still suffer from overconfident predictions, especially in out-of-domain settings.
We demonstrate that the PLMs are well-calibrated on the masked language modeling task with robust predictive confidence under domain shift.
We show that preserving pre-trained features can improve the calibration of fine-tuned language models.
arXiv Detail & Related papers (2023-05-30T17:35:31Z)
- Improving Few-Shot Performance of Language Models via Nearest Neighbor Calibration [12.334422701057674]
We propose a novel nearest-neighbor calibration framework for in-context learning.
It is inspired by the phenomenon that the in-context learning paradigm produces incorrect labels when inferring training instances.
Experiments on various few-shot text classification tasks demonstrate that our method significantly improves in-context learning.
arXiv Detail & Related papers (2022-12-05T12:49:41Z)
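For context, here is a minimal sketch of what a nearest-neighbour calibration step can look like: the model's in-context prediction is interpolated with the empirical label distribution of the closest few-shot training instances in some embedding space. This is a generic illustration under assumed inputs, not the specific framework proposed in the paper above.

```python
import numpy as np

def knn_calibrate(test_probs, test_emb, train_embs, train_labels,
                  n_classes, k=3, alpha=0.5):
    """Blend in-context-learning probabilities with a kNN label distribution.

    test_probs:   (C,) class probabilities from the in-context prediction.
    test_emb:     (D,) embedding of the test instance.
    train_embs:   (K, D) embeddings of the few-shot training instances.
    train_labels: (K,) gold labels of those training instances (ints).
    Generic sketch: nearby training instances vote on the label, and the vote
    is mixed with the model's own prediction.
    """
    dists = np.linalg.norm(train_embs - test_emb, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = np.bincount(train_labels[nearest], minlength=n_classes).astype(float)
    votes /= votes.sum()
    calibrated = alpha * test_probs + (1.0 - alpha) * votes
    return calibrated / calibrated.sum()
```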
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
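As a rough illustration of the noise-stability idea summarised above (inject Gaussian noise and regularize hidden representations), here is a generic consistency-style penalty in PyTorch. The module, shapes, and hyperparameters are assumptions for the sketch, not the exact LNSR objective.

```python
import torch
import torch.nn.functional as F

def noise_stability_penalty(layer_block, hidden, sigma=1e-2):
    """Penalise how much a block's output changes under small input noise.

    layer_block: a module mapping hidden states to hidden states (e.g. some
                 upper Transformer layers of the fine-tuned model).
    hidden:      (batch, seq_len, dim) hidden representations fed to the block.
    """
    clean_out = layer_block(hidden)
    noisy_out = layer_block(hidden + sigma * torch.randn_like(hidden))
    return F.mse_loss(noisy_out, clean_out)

# Toy usage (illustrative shapes and module); the penalty would be added to
# the task loss with some weight during fine-tuning.
toy_block = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.Tanh())
h = torch.randn(4, 8, 16)
reg = noise_stability_penalty(toy_block, h)
```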
- Bayesian Active Learning with Pretrained Language Models [9.161353418331245]
Active Learning (AL) is a method to iteratively select data for annotation from a pool of unlabeled data.
Previous AL approaches have been limited to task-specific models that are trained from scratch at each iteration.
We introduce BALM: Bayesian Active Learning with pretrained language models.
arXiv Detail & Related papers (2021-04-16T19:07:31Z)
- Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates [52.164757178369804]
Recent advances in transfer learning for natural language processing in conjunction with active learning open the possibility to significantly reduce the necessary annotation budget.
We conduct an empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pre-trained models in the active learning framework.
We also demonstrate that to acquire instances during active learning, a full-size Transformer can be substituted with a distilled version, which yields better computational performance.
arXiv Detail & Related papers (2021-01-20T13:59:25Z)
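For reference, the sketch below shows a common Monte Carlo dropout recipe of the kind surveyed in the two active-learning entries above: keep dropout active at inference, average the softmax outputs of several stochastic passes, and rank unlabelled instances by predictive entropy. The names and the toy model are illustrative assumptions, not the exact acquisition functions studied in these papers.

```python
import torch

def mc_dropout_entropy(model, inputs, n_samples=10):
    """Predictive entropy under Monte Carlo dropout (generic sketch).

    model:  classifier returning (batch, n_classes) logits; must contain dropout.
    inputs: a batch of inputs accepted by the model.
    """
    model.train()  # keep dropout layers stochastic at inference time
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(inputs), dim=-1) for _ in range(n_samples)]
        ).mean(dim=0)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)

# Toy usage: the highest-entropy instances would be sent for annotation.
toy_clf = torch.nn.Sequential(torch.nn.Dropout(0.1), torch.nn.Linear(32, 3))
scores = mc_dropout_entropy(toy_clf, torch.randn(5, 32))
```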
- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z)
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [81.99843216550306]
We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks.
A second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains.
Adapting to the task's unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining.
arXiv Detail & Related papers (2020-04-23T04:21:19Z)
- Train No Evil: Selective Masking for Task-Guided Pre-Training [97.03615486457065]
We propose a three-stage framework by adding a task-guided pre-training stage with selective masking between general pre-training and fine-tuning.
We show that our method can achieve comparable or even better performance with less than 50% of the cost.
arXiv Detail & Related papers (2020-04-21T03:14:22Z)