RPLKG: Robust Prompt Learning with Knowledge Graph
- URL: http://arxiv.org/abs/2304.10805v1
- Date: Fri, 21 Apr 2023 08:22:58 GMT
- Title: RPLKG: Robust Prompt Learning with Knowledge Graph
- Authors: Yewon Kim, YongTaek Lim, Dokyung Yoon and KyungWoo Song
- Abstract summary: We propose a new method, robust prompt learning with knowledge graph (RPLKG).
Based on the knowledge graph, we automatically design diverse interpretable and meaningful prompt sets.
RPLKG shows a significant performance improvement compared to zero-shot learning.
- Score: 11.893917358053004
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale pre-trained models are known to be transferable and to
generalize well to unseen datasets. Recently, multimodal
pre-trained models such as CLIP show significant performance improvement in
diverse experiments. However, when labeled data is limited,
generalization to a new dataset or domain remains challenging. To improve
generalization performance in few-shot learning, there have been diverse
efforts, such as prompt learning and adapters. However, the current few-shot
adaptation methods are not interpretable, and they require a high computation
cost for adaptation. In this study, we propose a new method, robust prompt
learning with knowledge graph (RPLKG). Based on the knowledge graph, we
automatically design diverse interpretable and meaningful prompt sets. Our
model obtains cached embeddings of the prompt sets from a single forward pass
through a large pre-trained model. After that, the model optimizes the prompt
selection process with Gumbel-Softmax. In this way, our model is trained using
relatively little memory and learning time. Also, RPLKG selects the optimal
interpretable prompt automatically, depending on the dataset. In summary, RPLKG
i) is interpretable, ii) requires few computational resources, and iii) makes it
easy to incorporate prior human knowledge. To validate RPLKG, we provide
comprehensive experimental results on few-shot learning, domain generalization
and new class generalization settings. RPLKG shows a significant performance
improvement compared to zero-shot learning and competitive performance against
several prompt learning methods using much lower resources.
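
The prompt selection step described in the abstract (cached prompt embeddings, then selection with Gumbel-Softmax) can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the dot-product scoring, the function names `gumbel_softmax` and `select_prompt_embedding`, and the toy embeddings are all assumptions made for illustration, and the sketch omits training entirely.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Return a relaxed one-hot sample over `logits` (Gumbel-Softmax)."""
    noisy = []
    for logit in logits:
        # Clamp the uniform sample away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_prompt_embedding(image_emb, cached_prompt_embs, tau=1.0, rng=random):
    """Weight cached prompt embeddings by a Gumbel-Softmax sample.

    Assumption: a prompt is scored by its dot product with the image
    embedding. The abstract does not say how the scores are computed.
    """
    logits = [dot(image_emb, p) for p in cached_prompt_embs]
    weights = gumbel_softmax(logits, tau=tau, rng=rng)
    dim = len(cached_prompt_embs[0])
    mixed = [
        sum(w * p[i] for w, p in zip(weights, cached_prompt_embs))
        for i in range(dim)
    ]
    return weights, mixed


if __name__ == "__main__":
    # Toy stand-ins for embeddings that a pre-trained encoder would produce once.
    cached = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
    weights, mixed = select_prompt_embedding(
        [1.0, 0.0], cached, tau=0.5, rng=random.Random(0)
    )
    print(weights, mixed)
```

Because the prompt embeddings are computed once and reused, only the selection weights change from step to step. Lowering `tau` pushes the weights toward a single prompt, which is what makes the chosen prompt readable afterwards.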
Related papers
- Diffusion-based Neural Network Weights Generation [85.6725307453325]
We propose an efficient and adaptive transfer learning scheme through dataset-conditioned pretrained weights sampling.
Specifically, we use a latent diffusion model with a variational autoencoder that can reconstruct the neural network weights.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Towards Efficient Active Learning in NLP via Pretrained Representations [1.90365714903665]
Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications.
We drastically expedite this process by using pretrained representations of LLMs within the active learning loop.
Our strategy yields similar performance to fine-tuning all the way through the active learning loop but is orders of magnitude less computationally expensive.
arXiv Detail & Related papers (2024-02-23T21:28:59Z)
- Exploring Learning Complexity for Downstream Data Pruning [9.526877053855998]
We propose to treat the learning complexity (LC) as the scoring function for classification and regression tasks.
For the instruction fine-tuning of large language models, our method achieves state-of-the-art performance with stable convergence.
arXiv Detail & Related papers (2024-02-08T02:29:33Z)
- Back to Basics: A Simple Recipe for Improving Out-of-Domain Retrieval in Dense Encoders [63.28408887247742]
We study whether training procedures can be improved to yield better generalization capabilities in the resulting models.
We recommend a simple recipe for training dense encoders: Train on MSMARCO with parameter-efficient methods, such as LoRA, and opt for using in-batch negatives unless given well-constructed hard negatives.
arXiv Detail & Related papers (2023-11-16T10:42:58Z)
- GistScore: Learning Better Representations for In-Context Example Selection with Gist Bottlenecks [3.9638110494107095]
In-context Learning (ICL) is the ability of Large Language Models (LLMs) to perform new tasks when conditioned on prompts.
We propose Example Gisting, a novel approach for training example encoders through supervised fine-tuning.
We show that our fine-tuned models get state-of-the-art ICL performance with over 20% absolute gain over off-the-shelf retrievers.
arXiv Detail & Related papers (2023-11-16T06:28:05Z)
- Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners.
We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting.
Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
- On Measuring the Intrinsic Few-Shot Hardness of Datasets [49.37562545777455]
We show that few-shot hardness may be intrinsic to datasets, for a given pre-trained model.
We propose a simple and lightweight metric called "Spread" that captures the intuition that few-shot learning is made possible.
Our metric better accounts for few-shot hardness compared to existing notions of hardness, and is 8-100x faster to compute.
arXiv Detail & Related papers (2022-11-16T18:53:52Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Optimizing Active Learning for Low Annotation Budgets [6.753808772846254]
In deep learning, active learning is usually implemented as an iterative process in which successive deep models are updated via fine tuning.
We tackle this issue by using an approach inspired by transfer learning.
We introduce a novel acquisition function which exploits the iterative nature of AL process to select samples in a more robust fashion.
arXiv Detail & Related papers (2022-01-18T18:53:10Z)
- Few-Shot Incremental Learning with Continually Evolved Classifiers [46.278573301326276]
Few-shot class-incremental learning (FSCIL) aims to design machine learning algorithms that can continually learn new concepts from a few data points.
The difficulty lies in that limited data from new classes not only lead to significant overfitting issues but also exacerbate the notorious catastrophic forgetting problems.
We propose a Continually Evolved Classifier (CEC) that employs a graph model to propagate context information between classifiers for adaptation.
arXiv Detail & Related papers (2021-04-07T10:54:51Z)
- Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing [85.35582118010608]
Task-oriented semantic parsing is a critical component of virtual assistants.
Recent advances in deep learning have enabled several approaches to successfully parse more complex queries.
We propose a novel method that outperforms a supervised neural model at a 10-fold data reduction.
arXiv Detail & Related papers (2020-10-07T17:47:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.