Adaptive Meta-learner via Gradient Similarity for Few-shot Text Classification
- URL: http://arxiv.org/abs/2209.04702v2
- Date: Fri, 28 Jul 2023 09:30:46 GMT
- Title: Adaptive Meta-learner via Gradient Similarity for Few-shot Text Classification
- Authors: Tianyi Lei, Honghui Hu, Qiaoyang Luo, Dezhong Peng, Xu Wang
- Abstract summary: We propose a novel Adaptive Meta-learner via Gradient Similarity (AMGS) to improve a model's ability to generalize to new tasks.
Experimental results on several benchmarks demonstrate that the proposed AMGS consistently improves few-shot text classification performance.
- Score: 11.035878821365149
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot text classification aims to classify text given only a few labeled examples per class. Most previous methods adopt optimization-based meta-learning to adapt over the task distribution. However, because these methods neglect the mismatch between the small number of samples and complex models, as well as the distinction between useful and useless task features, they suffer from overfitting. To address this issue, we propose a novel Adaptive Meta-learner via Gradient Similarity (AMGS) method to improve the model's generalization to new tasks. Specifically, AMGS alleviates overfitting in two ways: (i) it acquires latent semantic representations of samples and improves generalization through a self-supervised auxiliary task in the inner loop, and (ii) it leverages the adaptive meta-learner via gradient similarity to constrain the gradients obtained by the base-learner in the outer loop. Moreover, we systematically analyze the influence of regularization on the entire framework. Experimental results on several benchmarks demonstrate that the proposed AMGS consistently improves few-shot text classification performance compared with state-of-the-art optimization-based meta-learning approaches.
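To make the outer-loop constraint concrete, here is a minimal PyTorch sketch of one plausible reading of the gradient-similarity gating: the meta-gradient from the base-learner's query loss is weighted by its cosine similarity to the gradient of the self-supervised auxiliary loss, so updates that conflict with the auxiliary signal contribute less. The function names, the clamping rule, and the use of cosine similarity as the gate are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def flat_grad(loss, params):
    # Flatten d(loss)/d(params) into a single vector.
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def gated_meta_gradient(query_loss, aux_loss, params):
    # Gate the base-learner's query gradient by its cosine similarity to
    # the self-supervised auxiliary gradient (gating rule assumed here);
    # conflicting directions (negative similarity) are zeroed out.
    g_query = flat_grad(query_loss, params)
    g_aux = flat_grad(aux_loss, params)
    sim = F.cosine_similarity(g_query, g_aux, dim=0)  # scalar in [-1, 1]
    return torch.clamp(sim, min=0.0) * g_query
```

In a full meta-training step, this gated vector would stand in for the raw query gradient before the meta-optimizer update.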
Related papers
- Meta-tuning Loss Functions and Data Augmentation for Few-shot Object Detection [7.262048441360132]
Few-shot object detection is an emerging topic in the area of few-shot learning and object detection.
We propose a training scheme that allows learning inductive biases that can boost few-shot detection.
The proposed approach yields interpretable loss functions, as opposed to highly parametric and complex few-shot meta-models.
arXiv Detail & Related papers (2023-04-24T15:14:16Z)
- Adaptive Fine-Grained Predicates Learning for Scene Graph Generation [122.4588401267544]
General Scene Graph Generation (SGG) models tend to predict head predicates, while re-balancing strategies prefer tail categories.
We propose an Adaptive Fine-Grained Predicates Learning (FGPL-A) which aims at differentiating hard-to-distinguish predicates for SGG.
Our proposed model-agnostic strategy significantly boosts performance of benchmark models on VG-SGG and GQA-SGG datasets by up to 175% and 76% on Mean Recall@100, achieving new state-of-the-art performance.
arXiv Detail & Related papers (2022-07-11T03:37:57Z)
- Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework built on a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z)
- Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate in Gradient Descent [20.47598828422897]
We propose Meta-Regularization, a novel approach for the adaptive choice of the learning rate in first-order descent methods.
Our approach modifies the objective function by adding a regularization term, and casts parameter updates and learning-rate adaptation as a joint process.
arXiv Detail & Related papers (2021-04-12T13:13:34Z)
- Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations (a hedged sketch of this idea appears after this list).
arXiv Detail & Related papers (2021-03-01T21:14:33Z)
- Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function [106.69643619725652]
We develop a training strategy that allows even a simple BiLSTM model, when trained with cross-entropy loss, to achieve competitive results.
We report state-of-the-art results for text classification task on several benchmark datasets.
arXiv Detail & Related papers (2020-09-08T21:55:22Z)
- Few-shot Classification via Adaptive Attention [93.06105498633492]
We propose a novel few-shot learning method that optimizes and quickly adapts the query-sample representation based on very few reference samples.
As demonstrated experimentally, the proposed model achieves state-of-the-art classification results on various benchmark few-shot classification and fine-grained recognition datasets.
arXiv Detail & Related papers (2020-08-06T05:52:59Z)
- Boosting Few-Shot Learning With Adaptive Margin Loss [109.03665126222619]
This paper proposes an adaptive margin principle to improve the generalization ability of metric-based meta-learning approaches for few-shot learning problems (a hedged sketch of this idea appears after this list).
Extensive experiments demonstrate that the proposed method can boost the performance of current metric-based meta-learning approaches.
arXiv Detail & Related papers (2020-05-28T07:58:41Z)
- PAC-Bayes meta-learning with implicit task-specific posteriors [37.32107678838193]
We introduce a new, rigorously formulated PAC-Bayes meta-learning algorithm that solves few-shot learning.
We show that the models trained with our proposed meta-learning algorithm are well calibrated and accurate.
arXiv Detail & Related papers (2020-03-05T06:56:19Z)
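For the invariant/equivariant entry above, a minimal PyTorch sketch of the general idea: an invariance term pulls the features of a geometrically transformed view toward the original, while an equivariance head must recover which transform was applied, so the backbone retains transform information. The head design, loss form, and equal weighting are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InvEqvHeads(nn.Module):
    # Invariance: features of a transformed view should match the original.
    # Equivariance: a classifier head must identify which transform was used.
    def __init__(self, feat_dim, num_transforms):
        super().__init__()
        self.eqv_head = nn.Linear(feat_dim, num_transforms)

    def forward(self, feats, feats_tf, tf_labels):
        # Pull transformed-view features toward the original embedding.
        inv_loss = 1.0 - F.cosine_similarity(feats, feats_tf, dim=-1).mean()
        # Predict which geometric transform produced the transformed view.
        eqv_loss = F.cross_entropy(self.eqv_head(feats_tf), tf_labels)
        return inv_loss + eqv_loss  # equal weighting assumed
```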
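Likewise, for the adaptive margin loss entry, a minimal PyTorch sketch of the principle on top of a prototypical-network classifier: an additive, class-pair-dependent margin inflates the logits of wrong classes, and deriving that margin from class-semantic similarity makes it adaptive. The margin parameterization shown here is one plausible instantiation, not the paper's exact scheme.

```python
import torch
import torch.nn.functional as F

def adaptive_margin_proto_loss(query_emb, prototypes, labels, margin):
    # Squared-Euclidean prototypical logits: shape (num_query, num_classes).
    logits = -torch.cdist(query_emb, prototypes).pow(2)
    # margin is a (C, C) non-negative matrix with zero diagonal;
    # margin[y, c] inflates the logit of wrong class c for queries of
    # class y, forcing a wider boundary between similar classes.
    logits = logits + margin[labels]
    return F.cross_entropy(logits, labels)

def semantic_margin(class_name_emb, scale=1.0):
    # One plausible instantiation (assumed): larger margins between
    # classes whose name embeddings are more similar.
    sim = F.cosine_similarity(class_name_emb.unsqueeze(0),
                              class_name_emb.unsqueeze(1), dim=-1)
    return (scale * sim.clamp(min=0.0)).fill_diagonal_(0.0)
```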