KNN-BERT: Fine-Tuning Pre-Trained Models with KNN Classifier
- URL: http://arxiv.org/abs/2110.02523v1
- Date: Wed, 6 Oct 2021 06:17:05 GMT
- Title: KNN-BERT: Fine-Tuning Pre-Trained Models with KNN Classifier
- Authors: Linyang Li, Demin Song, Ruotian Ma, Xipeng Qiu, Xuanjing Huang
- Abstract summary: Pre-trained models are widely used in fine-tuning downstream tasks with linear classifiers optimized by the cross-entropy loss, which might face robustness and stability problems.
These problems can be improved by learning representations that focus on similarities in the same class and contradictions in different classes when making predictions.
We introduce the K-Nearest Neighbors classifier into pre-trained model fine-tuning.
- Score: 61.063988689601416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained models are widely used in fine-tuning downstream tasks with
linear classifiers optimized by the cross-entropy loss, which might face
robustness and stability problems. These problems can be improved by learning
representations that focus on similarities in the same class and contradictions
in different classes when making predictions. In this paper, we utilize the
K-Nearest Neighbors Classifier in pre-trained model fine-tuning. For this KNN
classifier, we introduce a supervised momentum contrastive learning framework
to learn the clustered representations of the supervised downstream tasks.
Extensive experiments on text classification tasks and robustness tests show
that by incorporating KNNs with the traditional fine-tuning process, we can
obtain significant improvements on the clean accuracy in both rich-source and
few-shot settings and can improve the robustness against adversarial attacks.
All code is available at https://github.com/LinyangLee/KNN-BERT
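The idea of incorporating a KNN classifier with the usual linear classifier can be illustrated with a small sketch. The abstract does not say how the two predictions are merged, so the weighted average below is an assumption. The function names, the `weight` parameter, and the use of cosine similarity over a bank of training representations are hypothetical as well, and are not taken from the authors' released code.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def knn_probs(query, bank, bank_labels, num_classes, k=3):
    """Class distribution from the k most similar training representations.

    query:       (n_query, dim) representations of test examples
    bank:        (n_bank, dim) representations of training examples
    bank_labels: (n_bank,) integer class labels for the bank
    """
    # Cosine similarity between every query and every bank entry.
    q = query / np.linalg.norm(query, axis=-1, keepdims=True)
    b = bank / np.linalg.norm(bank, axis=-1, keepdims=True)
    sims = q @ b.T
    # Indices of the k nearest neighbours for each query.
    topk = np.argsort(-sims, axis=-1)[:, :k]
    # Majority vote, normalised so each row sums to 1.
    votes = np.stack(
        [np.bincount(bank_labels[idx], minlength=num_classes) for idx in topk]
    )
    return votes / k


def combined_predict(query, logits, bank, bank_labels, num_classes, k=3, weight=0.5):
    """Assumed combination rule: weighted average of KNN and linear probabilities."""
    p_knn = knn_probs(query, bank, bank_labels, num_classes, k)
    p_linear = softmax(logits)
    mixed = weight * p_knn + (1.0 - weight) * p_linear
    return mixed.argmax(axis=-1)
```

With `weight=0.0` the sketch reduces to the plain linear classifier, and with `weight=1.0` it reduces to a pure KNN vote. In practice the representations would come from the fine-tuned encoder, which is where the contrastive objective described in the abstract matters, since KNN voting works best when same-class examples are tightly clustered.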
Related papers
- Revisiting k-NN for Fine-tuning Pre-trained Language Models [25.105882538429743]
We revisit k-Nearest-Neighbor (kNN) classifiers for augmenting the PLMs-based classifiers.
At the heart of our approach is the implementation of kNN-calibrated training, which treats predicted results as indicators for easy versus hard examples.
We conduct extensive experiments on fine-tuning, prompt-tuning paradigms and zero-shot, few-shot and fully-supervised settings.
arXiv Detail & Related papers (2023-04-18T15:28:47Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to mitigate their negative impact through robust training.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Efficient and Robust Classification for Sparse Attacks [34.48667992227529]
We consider perturbations bounded by the $\ell_0$-norm, which have been shown to be effective attacks in the domains of image recognition, natural language processing, and malware detection.
We propose a novel defense method that consists of "truncation" and "adversarial training".
Motivated by the insights we obtain, we extend these components to neural network classifiers.
arXiv Detail & Related papers (2022-01-23T21:18:17Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distance between the test image and top-k neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements compared with the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- An Orthogonal Classifier for Improving the Adversarial Robustness of Neural Networks [21.13588742648554]
Recent efforts have shown that imposing certain modifications on the classification layer can improve the robustness of neural networks.
We explicitly construct a dense orthogonal weight matrix whose entries have the same magnitude, leading to a novel robust classifier.
Our method is efficient and competitive with many state-of-the-art defensive approaches.
arXiv Detail & Related papers (2021-05-19T13:12:14Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)