Dynamic Memory Induction Networks for Few-Shot Text Classification
- URL: http://arxiv.org/abs/2005.05727v1
- Date: Tue, 12 May 2020 12:41:14 GMT
- Title: Dynamic Memory Induction Networks for Few-Shot Text Classification
- Authors: Ruiying Geng, Binhua Li, Yongbin Li, Jian Sun, Xiaodan Zhu
- Abstract summary: This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot text classification.
The proposed model achieves new state-of-the-art results on the miniRCV1 and ODIC datasets, improving the best performance (accuracy) by 2-4%.
- Score: 84.88381813651971
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot
text classification. The model utilizes dynamic routing to provide more
flexibility to memory-based few-shot learning in order to better adapt to the
support sets, which is a critical capacity of few-shot classification models.
Based on that, we further develop induction models with query information,
aiming to enhance the generalization ability of meta-learning. The proposed
model achieves new state-of-the-art results on the miniRCV1 and ODIC datasets,
improving the best performance (accuracy) by 2-4%. Detailed analysis is further
performed to show the effectiveness of each component.
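As a rough illustration of the dynamic-routing idea the abstract relies on (capsule-style routing-by-agreement that aggregates encoded support examples into a class vector), a minimal NumPy sketch follows. The function names, dimensions, and the plain softmax/squash formulation are illustrative assumptions, not the authors' DMIN implementation, which additionally conditions routing on dynamic memory and query information.

```python
import numpy as np

def squash(v, eps=1e-8):
    """Capsule-style squashing: keeps direction, maps the norm into [0, 1)."""
    norm_sq = np.sum(v ** 2)
    return (norm_sq / (1.0 + norm_sq)) * v / (np.sqrt(norm_sq) + eps)

def induce_class_vector(support, iterations=3):
    """Induce one class vector from support-sample vectors via dynamic
    routing-by-agreement.

    support: (k, d) array of k encoded support examples for one class.
    Returns a (d,) class vector.
    """
    k, d = support.shape
    logits = np.zeros(k)                                  # routing logits b_i
    for _ in range(iterations):
        coupling = np.exp(logits) / np.exp(logits).sum()  # softmax -> c_i
        class_vec = squash(coupling @ support)            # weighted sum, squashed
        logits = logits + support @ class_vec             # agreement update
    return class_vec

# Toy usage: one class of a 5-shot episode with 64-dim encodings.
rng = np.random.default_rng(0)
support_vectors = rng.normal(size=(5, 64))
class_vector = induce_class_vector(support_vectors)
print(class_vector.shape)  # (64,)
```

In an induction-style classifier, a class vector induced this way would then be scored against encoded query examples to pick the predicted class.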
Related papers
- Dynamic Feature Learning and Matching for Class-Incremental Learning [20.432575325147894]
Class-incremental learning (CIL) has emerged as a means to learn new classes without catastrophic forgetting of previous classes.
We propose the Dynamic Feature Learning and Matching (DFLM) model in this paper.
Our proposed model achieves significant performance improvements over existing methods.
arXiv Detail & Related papers (2024-05-14T12:17:19Z)
- Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models [37.492637804756164]
We introduce a versatile adaptation approach that can effectively work under all three settings.
We propose the dual memory networks that comprise dynamic and static memory components.
Our approach is tested across 11 datasets under the three task settings.
arXiv Detail & Related papers (2024-03-26T10:54:07Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- Scaling Pre-trained Language Models to Deeper via Parameter-efficient Architecture [68.13678918660872]
We design a more capable parameter-sharing architecture based on the matrix product operator (MPO).
MPO decomposition can reorganize and factorize the information of a parameter matrix into two parts.
Our architecture shares the central tensor across all layers for reducing the model size.
arXiv Detail & Related papers (2023-03-27T02:34:09Z)
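The summary of the MPO parameter-sharing entry above does not spell out the decomposition; as a loose stand-in for the shared-central-tensor idea, the sketch below factors each layer's weight with a truncated SVD and reuses a single core factor across layers. The rank, shapes, and SVD-based factorization are assumptions made for illustration; the paper itself uses a genuine matrix product operator decomposition rather than this two-sided low-rank form.

```python
import numpy as np

def factor_layer(weight, rank):
    """Factor one layer's weight matrix W ~= A @ C @ B via truncated SVD.
    C (the core) plays the role of a shared "central" factor; A and B are
    layer-specific auxiliary factors. A plain low-rank stand-in, not true MPO."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank]            # layer-specific left factor
    c = np.diag(s[:rank])      # candidate central factor
    b = vt[:rank, :]           # layer-specific right factor
    return a, c, b

rng = np.random.default_rng(1)
layers = [rng.normal(size=(128, 128)) for _ in range(4)]
rank = 16

factored = [factor_layer(w, rank) for w in layers]
shared_central = factored[0][1]            # reuse one central factor everywhere
per_layer = [(a, b) for a, _, b in factored]

# Parameter count: one shared (rank x rank) core plus per-layer (128 x rank)
# factors, instead of four full (128 x 128) matrices.
reconstruction = [a @ shared_central @ b for a, b in per_layer]
print(reconstruction[0].shape)  # (128, 128)
```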
- Learning to Augment via Implicit Differentiation for Domain Generalization [107.9666735637355]
Domain generalization (DG) aims to overcome the domain-shift problem by leveraging multiple source domains to learn a domain-generalizable model.
In this paper, we propose a novel augmentation-based DG approach, dubbed AugLearn.
AugLearn shows effectiveness on three standard DG benchmarks, PACS, Office-Home and Digits-DG.
arXiv Detail & Related papers (2022-10-25T18:51:51Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- Learning Intermediate Representations using Graph Neural Networks for NUMA and Prefetchers Optimization [1.3999481573773074]
This paper demonstrates how the static Intermediate Representation (IR) of the code can guide NUMA/prefetcher optimizations without the prohibitive cost of performance profiling.
We show that our static intermediate representation based model achieves 80% of the performance gains provided by expensive dynamic performance profiling based strategies.
arXiv Detail & Related papers (2022-03-01T16:51:30Z)
- Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning [32.3217883750605]
We learn the dynamic kernels of a convolution network as a function of the task at hand, enabling faster generalization.
We empirically show that our model improves performance on few-shot classification and detection tasks.
arXiv Detail & Related papers (2021-12-07T04:52:36Z)
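As a toy illustration of the dynamic-kernels entry above, where kernels are generated as a function of the task at hand, the sketch below pools a support set into a task embedding and maps it to a small 1-D convolution kernel. The generator matrix, kernel size, and mean-pooled task summary are hypothetical choices, not the paper's architecture.

```python
import numpy as np

def generate_task_kernel(support_features, generator):
    """Produce convolution kernel weights conditioned on the task.
    support_features: (k, d) encoded support examples for the episode.
    generator: (d, kernel_size) projection, meta-learned in practice (random here).
    """
    task_embedding = support_features.mean(axis=0)   # crude task summary
    return task_embedding @ generator                # (kernel_size,) kernel

def apply_dynamic_kernel(signal, kernel):
    """Apply the task-conditioned kernel as a 1-D convolution."""
    return np.convolve(signal, kernel, mode="same")

rng = np.random.default_rng(2)
support = rng.normal(size=(5, 32))        # 5-shot support set, 32-dim features
generator = rng.normal(size=(32, 3))      # would be meta-learned
query_signal = rng.normal(size=100)       # a query example as a raw 1-D signal

kernel = generate_task_kernel(support, generator)
response = apply_dynamic_kernel(query_signal, kernel)
print(kernel.shape, response.shape)       # (3,) (100,)
```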
- Few-shot Classification via Adaptive Attention [93.06105498633492]
We propose a novel few-shot learning method that optimizes and quickly adapts the query sample representation based on very few reference samples.
As demonstrated experimentally, the proposed model achieves state-of-the-art classification results on various benchmark few-shot classification and fine-grained recognition datasets.
arXiv Detail & Related papers (2020-08-06T05:52:59Z)
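For the adaptive-attention entry above, a minimal sketch of adapting a query representation by attending over a handful of reference samples follows; the scaled dot-product attention and residual update are assumptions made for illustration, not the paper's exact mechanism.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def adapt_query(query, support, temperature=None):
    """Refine a query embedding by attending over the few reference (support)
    embeddings and mixing the attended summary back in with a residual step.

    query:   (d,) embedding of the query sample.
    support: (k, d) embeddings of the reference samples.
    """
    d = query.shape[0]
    temperature = temperature or np.sqrt(d)
    scores = support @ query / temperature   # (k,) similarity scores
    weights = softmax(scores)                # attention over references
    attended = weights @ support             # (d,) reference summary
    return query + attended                  # residual adaptation

rng = np.random.default_rng(3)
support_emb = rng.normal(size=(5, 64))       # 5 reference samples
query_emb = rng.normal(size=64)

adapted = adapt_query(query_emb, support_emb)
print(adapted.shape)  # (64,)
```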
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.