Revisiting k-NN for Fine-tuning Pre-trained Language Models
- URL: http://arxiv.org/abs/2304.09058v2
- Date: Sun, 18 Jun 2023 02:51:29 GMT
- Title: Revisiting k-NN for Fine-tuning Pre-trained Language Models
- Authors: Lei Li, Jing Chen, Bozhong Tian, Ningyu Zhang
- Abstract summary: We revisit k-Nearest-Neighbor (kNN) classifiers for augmenting PLM-based classifiers.
At the heart of our approach is kNN-calibrated training, which treats kNN predictions as indicators of easy versus hard examples.
We conduct extensive experiments on fine-tuning and prompt-tuning paradigms under zero-shot, few-shot, and fully-supervised settings.
- Score: 25.105882538429743
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained Language Models (PLMs), as parametric-based eager learners, have
become the de-facto choice for current paradigms of Natural Language Processing
(NLP). In contrast, k-Nearest-Neighbor (kNN) classifiers, as the lazy learning
paradigm, tend to mitigate over-fitting and isolated noise. In this paper, we
revisit kNN classifiers for augmenting PLM-based classifiers. From the
methodological level, we propose to adopt kNN with textual representations of
PLMs in two steps: (1) Utilize kNN as prior knowledge to calibrate the training
process. (2) Linearly interpolate the probability distribution predicted by kNN
with that of the PLMs' classifier. At the heart of our approach is the
implementation of kNN-calibrated training, which treats predicted results as
indicators for easy versus hard examples during the training process. From the
perspective of the diversity of application scenarios, we conduct extensive
experiments on both fine-tuning and prompt-tuning paradigms, under zero-shot,
few-shot, and fully-supervised settings, across eight diverse end-tasks. We
hope our exploration will encourage the community to revisit the power of
classical methods for efficient NLP. Code and datasets are available at
https://github.com/zjunlp/Revisit-KNN.
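The two methodological steps lend themselves to a compact sketch. The snippet below is a minimal, hypothetical illustration, not the authors' implementation (see the repository above for the official code): it builds a kNN label distribution over cached PLM representations, uses kNN agreement with the gold label as a rough easy/hard indicator for calibrating training, and linearly interpolates the kNN and classifier distributions at inference. All function names, parameters, and defaults (knn_distribution, interpolate, lam, k) are assumptions.
```python
# Hypothetical sketch of the two-step kNN + PLM recipe described above.
# Not the authors' code; see https://github.com/zjunlp/Revisit-KNN.
import numpy as np

def knn_distribution(query_vec, train_vecs, train_labels, num_labels, k=16, temp=1.0):
    """Label distribution from the k nearest cached training representations."""
    # Cosine similarity between the query and every cached training vector.
    sims = train_vecs @ query_vec / (
        np.linalg.norm(train_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-8
    )
    topk = np.argsort(-sims)[:k]
    probs = np.zeros(num_labels)
    # Similarity-weighted vote over the retrieved neighbors' labels.
    for idx, weight in zip(topk, np.exp(sims[topk] / temp)):
        probs[train_labels[idx]] += weight
    return probs / probs.sum()

def is_easy_example(knn_probs, gold_label):
    """Step (1), roughly: kNN agreement with the gold label marks an 'easy'
    example; how this signal reweights the training loss is an assumption."""
    return int(np.argmax(knn_probs)) == gold_label

def interpolate(plm_probs, knn_probs, lam=0.3):
    """Step (2): linear interpolation of the PLM and kNN distributions."""
    return (1.0 - lam) * plm_probs + lam * knn_probs
```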
Related papers
- Learning with Noisy Labels Using Collaborative Sample Selection and Contrastive Semi-Supervised Learning [76.00798972439004]
Collaborative Sample Selection (CSS) removes noisy samples from the identified clean set.
We introduce a co-training mechanism with a contrastive loss in semi-supervised learning.
arXiv Detail & Related papers (2023-10-24T05:37:20Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Nearest Neighbor Zero-Shot Inference [68.56747574377215]
kNN-Prompt is a technique that uses k-nearest-neighbor (kNN) retrieval augmentation for zero-shot inference with language models (LMs).
Fuzzy verbalizers leverage the sparse kNN distribution for downstream tasks by automatically associating each classification label with a set of natural language tokens.
Experiments show that kNN-Prompt is effective for domain adaptation with no further training, and that the benefits of retrieval increase with the size of the model used for kNN retrieval.
arXiv Detail & Related papers (2022-05-27T07:00:59Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between the test image and its top-k neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- KNN-BERT: Fine-Tuning Pre-Trained Models with KNN Classifier [61.063988689601416]
Pre-trained models are widely fine-tuned on downstream tasks with linear classifiers optimized by the cross-entropy loss.
These problems can be mitigated by learning representations that emphasize similarities within the same class and contrasts across different classes when making predictions.
This paper introduces the k-Nearest-Neighbors classifier into pre-trained model fine-tuning.
arXiv Detail & Related papers (2021-10-06T06:17:05Z)
- Adaptive Nearest Neighbor Machine Translation [60.97183408140499]
kNN-MT combines pre-trained neural machine translation with token-level k-nearest-neighbor retrieval.
The traditional kNN algorithm retrieves the same number of nearest neighbors for every target token.
Adaptive kNN-MT instead determines the number of neighbors k dynamically for each target token (a toy sketch of this per-token choice of k appears after this list).
arXiv Detail & Related papers (2021-05-27T09:27:42Z)
- Multi-Sample Online Learning for Probabilistic Spiking Neural Networks [43.8805663900608]
Spiking Neural Networks (SNNs) capture some of the efficiency of biological brains for inference and learning.
This paper introduces an online learning rule based on generalized expectation-maximization (GEM).
Experimental results on structured output memorization and classification on a standard neuromorphic data set demonstrate significant improvements in terms of log-likelihood, accuracy, and calibration.
arXiv Detail & Related papers (2020-07-23T10:03:58Z)
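As a companion to the Adaptive Nearest Neighbor Machine Translation entry above, here is a toy, hypothetical sketch of choosing k per query. The paper itself learns a light-weight network over retrieval features to select k; the distance-gap heuristic below is only an illustrative stand-in, and all names and thresholds are assumptions.
```python
# Toy sketch of per-query choice of k, in the spirit of Adaptive kNN-MT.
# The paper learns a small network over retrieval features to select k;
# this distance-gap heuristic is only an illustrative stand-in.
import numpy as np

def adaptive_k(distances, candidate_ks=(0, 1, 2, 4, 8), gap_threshold=0.5):
    """Pick a k from candidate_ks based on how quickly neighbor distances grow.

    distances: distances of the retrieved neighbors, sorted ascending.
    Returns 0 when no candidate k keeps the neighbors close, i.e. ignore the datastore.
    """
    distances = np.asarray(distances)
    chosen = 0
    for k in candidate_ks:
        if k == 0:
            continue
        if k > len(distances):
            break
        # Keep enlarging k while the k-th neighbor is still close to the best one.
        if distances[k - 1] - distances[0] <= gap_threshold:
            chosen = k
    return chosen

# Example: a tight cluster of three neighbors, then a jump in distance.
print(adaptive_k([0.10, 0.15, 0.20, 1.50, 1.60]))  # -> 2
```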