Dynamic Memory Induction Networks for Few-Shot Text Classification
- URL: http://arxiv.org/abs/2005.05727v1
- Date: Tue, 12 May 2020 12:41:14 GMT
- Title: Dynamic Memory Induction Networks for Few-Shot Text Classification
- Authors: Ruiying Geng, Binhua Li, Yongbin Li, Jian Sun, Xiaodan Zhu
- Abstract summary: This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot text classification.
The proposed model achieves new state-of-the-art results on the miniRCV1 and ODIC datasets, improving the best performance (accuracy) by 2-4%.
- Score: 84.88381813651971
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot
text classification. The model utilizes dynamic routing to provide more
flexibility to memory-based few-shot learning in order to better adapt to the
support sets, which is a critical capacity of few-shot classification models.
Based on that, we further develop induction models with query information,
aiming to enhance the generalization ability of meta-learning. The proposed
model achieves new state-of-the-art results on the miniRCV1 and ODIC datasets,
improving the best performance (accuracy) by 2-4%. Detailed analysis is further
performed to show the effectiveness of each component.
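As a rough illustration of the dynamic-routing idea the abstract relies on (capsule-style routing-by-agreement that aggregates encoded support examples into a class vector), a minimal NumPy sketch follows. The function names, dimensions, and the plain softmax/squash formulation are illustrative assumptions, not the authors' DMIN implementation, which additionally conditions routing on dynamic memory and query information.

```python
import numpy as np

def squash(v, eps=1e-8):
    """Capsule-style squashing: keeps direction, maps the norm into [0, 1)."""
    norm_sq = np.sum(v ** 2)
    return (norm_sq / (1.0 + norm_sq)) * v / (np.sqrt(norm_sq) + eps)

def induce_class_vector(support, iterations=3):
    """Induce one class vector from support-sample vectors via dynamic
    routing-by-agreement.

    support: (k, d) array of k encoded support examples for one class.
    Returns a (d,) class vector.
    """
    k, d = support.shape
    logits = np.zeros(k)                                  # routing logits b_i
    for _ in range(iterations):
        coupling = np.exp(logits) / np.exp(logits).sum()  # softmax -> c_i
        class_vec = squash(coupling @ support)            # weighted sum, squashed
        logits = logits + support @ class_vec             # agreement update
    return class_vec

# Toy usage: one class of a 5-shot episode with 64-dim encodings.
rng = np.random.default_rng(0)
support_vectors = rng.normal(size=(5, 64))
class_vector = induce_class_vector(support_vectors)
print(class_vector.shape)  # (64,)
```

In an induction-style classifier, a class vector induced this way would then be scored against encoded query examples to pick the predicted class.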
Related papers
- Dynamic Feature Learning and Matching for Class-Incremental Learning [20.432575325147894]
Class-incremental learning (CIL) has emerged as a means to learn new classes without catastrophic forgetting of previous classes.
We propose the Dynamic Feature Learning and Matching (DFLM) model in this paper.
Our proposed model achieves significant performance improvements over existing methods.
arXiv Detail & Related papers (2024-05-14T12:17:19Z)
- Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models [37.492637804756164]
We introduce a versatile adaptation approach that can effectively work under all three settings.
We propose the dual memory networks that comprise dynamic and static memory components.
Our approach is tested across 11 datasets under the three task settings.
arXiv Detail & Related papers (2024-03-26T10:54:07Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- Scaling Pre-trained Language Models to Deeper via Parameter-efficient Architecture [68.13678918660872]
We design a more capable parameter-sharing architecture based on the matrix product operator (MPO).
MPO decomposition can reorganize and factorize the information of a parameter matrix into two parts.
Our architecture shares the central tensor across all layers for reducing the model size.
arXiv Detail & Related papers (2023-03-27T02:34:09Z)
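The summary of the MPO parameter-sharing entry above does not spell out the decomposition; as a loose stand-in for the shared-central-tensor idea, the sketch below factors each layer's weight with a truncated SVD and reuses a single core factor across layers. The rank, shapes, and SVD-based factorization are assumptions made for illustration; the paper itself uses a genuine matrix product operator decomposition rather than this two-sided low-rank form.

```python
import numpy as np

def factor_layer(weight, rank):
    """Factor one layer's weight matrix W ~= A @ C @ B via truncated SVD.
    C (the core) plays the role of a shared "central" factor; A and B are
    layer-specific auxiliary factors. A plain low-rank stand-in, not true MPO."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank]            # layer-specific left factor
    c = np.diag(s[:rank])      # candidate central factor
    b = vt[:rank, :]           # layer-specific right factor
    return a, c, b

rng = np.random.default_rng(1)
layers = [rng.normal(size=(128, 128)) for _ in range(4)]
rank = 16

factored = [factor_layer(w, rank) for w in layers]
shared_central = factored[0][1]            # reuse one central factor everywhere
per_layer = [(a, b) for a, _, b in factored]

# Parameter count: one shared (rank x rank) core plus per-layer (128 x rank)
# factors, instead of four full (128 x 128) matrices.
reconstruction = [a @ shared_central @ b for a, b in per_layer]
print(reconstruction[0].shape)  # (128, 128)
```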
- Learning to Augment via Implicit Differentiation for Domain Generalization [107.9666735637355]
Domain generalization (DG) aims to overcome the domain-shift problem by leveraging multiple source domains to learn a domain-generalizable model.
In this paper, we propose a novel augmentation-based DG approach, dubbed AugLearn.
AugLearn shows effectiveness on three standard DG benchmarks, PACS, Office-Home and Digits-DG.
arXiv Detail & Related papers (2022-10-25T18:51:51Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- Learning Intermediate Representations using Graph Neural Networks for NUMA and Prefetchers Optimization [1.3999481573773074]
This paper demonstrates how the static Intermediate Representation (IR) of the code can guide NUMA/prefetcher optimizations without the prohibitive cost of performance profiling.
We show that our static intermediate representation based model achieves 80% of the performance gains provided by expensive dynamic performance profiling based strategies.
arXiv Detail & Related papers (2022-03-01T16:51:30Z)
- Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning [32.3217883750605]
We learn the dynamic kernels of a convolution network as a function of the task at hand, enabling faster generalization.
We empirically show that our model improves performance on few-shot classification and detection tasks.
arXiv Detail & Related papers (2021-12-07T04:52:36Z)
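As a toy illustration of the dynamic-kernels entry above, where kernels are generated as a function of the task at hand, the sketch below pools a support set into a task embedding and maps it to a small 1-D convolution kernel. The generator matrix, kernel size, and mean-pooled task summary are hypothetical choices, not the paper's architecture.

```python
import numpy as np

def generate_task_kernel(support_features, generator):
    """Produce convolution kernel weights conditioned on the task.
    support_features: (k, d) encoded support examples for the episode.
    generator: (d, kernel_size) projection, meta-learned in practice (random here).
    """
    task_embedding = support_features.mean(axis=0)   # crude task summary
    return task_embedding @ generator                # (kernel_size,) kernel

def apply_dynamic_kernel(signal, kernel):
    """Apply the task-conditioned kernel as a 1-D convolution."""
    return np.convolve(signal, kernel, mode="same")

rng = np.random.default_rng(2)
support = rng.normal(size=(5, 32))        # 5-shot support set, 32-dim features
generator = rng.normal(size=(32, 3))      # would be meta-learned
query_signal = rng.normal(size=100)       # a query example as a raw 1-D signal

kernel = generate_task_kernel(support, generator)
response = apply_dynamic_kernel(query_signal, kernel)
print(kernel.shape, response.shape)       # (3,) (100,)
```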
- Few-shot Classification via Adaptive Attention [93.06105498633492]
We propose a novel few-shot learning method that optimizes and quickly adapts the query sample representation based on very few reference samples.
As demonstrated experimentally, the proposed model achieves state-of-the-art classification results on various benchmark few-shot classification and fine-grained recognition datasets.
arXiv Detail & Related papers (2020-08-06T05:52:59Z)
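For the adaptive-attention entry above, a minimal sketch of adapting a query representation by attending over a handful of reference samples follows; the scaled dot-product attention and residual update are assumptions made for illustration, not the paper's exact mechanism.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def adapt_query(query, support, temperature=None):
    """Refine a query embedding by attending over the few reference (support)
    embeddings and mixing the attended summary back in with a residual step.

    query:   (d,) embedding of the query sample.
    support: (k, d) embeddings of the reference samples.
    """
    d = query.shape[0]
    temperature = temperature or np.sqrt(d)
    scores = support @ query / temperature   # (k,) similarity scores
    weights = softmax(scores)                # attention over references
    attended = weights @ support             # (d,) reference summary
    return query + attended                  # residual adaptation

rng = np.random.default_rng(3)
support_emb = rng.normal(size=(5, 64))       # 5 reference samples
query_emb = rng.normal(size=64)

adapted = adapt_query(query_emb, support_emb)
print(adapted.shape)  # (64,)
```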
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.