EXnet: Efficient In-context Learning for Data-less Text Classification
- URL: http://arxiv.org/abs/2305.14622v1
- Date: Wed, 24 May 2023 01:40:57 GMT
- Title: EXnet: Efficient In-context Learning for Data-less Text Classification
- Authors: Debaditya Shome, Kuldeep Yadav
- Abstract summary: We present EXnet, a model specifically designed to perform in-context learning without limitations on the number of examples.
We argue that in-context learning is an effective method to increase task accuracy, and providing examples facilitates cross-task generalization.
With extensive experiments, we show that even our smallest model (15M parameters) generalizes to several unseen classification tasks and domains.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large pre-trained language models (PLMs) have made significant progress in
encoding world knowledge and spawned a new set of learning paradigms including
zero-shot, few-shot, and in-context learning. Many language tasks can be
modeled as a set of prompts (for example, is this text about geography?) and
language models can provide binary answers, i.e., Yes or No. There is evidence
to suggest that the next-word prediction used by many PLMs does not align well
with zero-shot paradigms. Therefore, PLMs are fine-tuned as
question-answering systems. In-context learning extends zero-shot learning by
incorporating prompts and examples, resulting in increased task accuracy. Our
paper presents EXnet, a model specifically designed to perform in-context
learning without any limitations on the number of examples. We argue that
in-context learning is an effective method to increase task accuracy, and
providing examples facilitates cross-task generalization, especially when it
comes to text classification tasks. With extensive experiments, we show that
even our smallest model (15M parameters) generalizes to several unseen
classification tasks and domains.
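As a rough illustration of the prompting scheme described above, the following sketch frames a classification task as a yes/no question and prepends in-context examples. The function name and prompt format are illustrative assumptions, not EXnet's actual interface.

def build_prompt(question, examples, text):
    """Frame a classification task as a yes/no question with in-context examples."""
    parts = []
    for ex_text, ex_label in examples:  # in-context demonstrations
        parts.append(f"Text: {ex_text}\nQuestion: {question}\nAnswer: {ex_label}")
    parts.append(f"Text: {text}\nQuestion: {question}\nAnswer:")  # test instance
    return "\n\n".join(parts)

examples = [
    ("The Nile flows through eleven countries.", "Yes"),
    ("The Dow closed 3% higher on Tuesday.", "No"),
]
prompt = build_prompt("Is this text about geography?", examples,
                      "Mount Kilimanjaro is the highest peak in Africa.")
print(prompt)  # feed this to a PLM and compare the scores of "Yes" vs. "No"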
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough?
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- Language models are weak learners
We show that prompt-based large language models can operate effectively as weak learners.
We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting.
Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
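As a rough sketch of that idea, the code below drops a prompted LLM into an AdaBoost-style loop; llm_classify is a stand-in stub, and the paper's actual summarization-based weak learner is not reproduced here.

import math
import random

def llm_classify(text, few_shot):
    # Stub: in practice, prompt an LLM with the few_shot demonstrations
    # and parse its answer. Randomized here so the sketch runs standalone.
    return random.choice([-1, 1])

def boost(data, rounds=5):
    """data: (text, label) pairs with labels in {-1, +1}."""
    n = len(data)
    w = [1.0 / n] * n                     # per-example weights
    ensemble = []                         # (alpha, few_shot) weak learners
    for _ in range(rounds):
        # Sample demonstrations by weight, so hard examples appear more often.
        few_shot = random.choices(data, weights=w, k=4)
        preds = [llm_classify(x, few_shot) for x, _ in data]
        err = sum(wi for wi, p, (_, y) in zip(w, preds, data) if p != y)
        err = min(max(err, 1e-6), 1 - 1e-6)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, few_shot))
        # Re-weight: misclassified examples gain weight for the next round.
        w = [wi * math.exp(-alpha * y * p) for wi, p, (_, y) in zip(w, preds, data)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble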
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
- Pre-Training to Learn in Context
The in-context learning ability of language models is not fully exploited because they are not explicitly trained to learn in context.
We propose PICL (Pre-training for In-Context Learning), a framework to enhance the language models' in-context learning ability.
Our experiments show that PICL is more effective and task-generalizable than a range of baselines, outperforming larger language models with nearly 4x as many parameters.
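A much-simplified sketch of that data construction: paragraphs sharing an "intrinsic task" are concatenated so that ordinary language modeling on the result resembles in-context learning. The explicit task labels below are an assumption made for clarity; PICL itself groups related paragraphs automatically with a retriever.

from collections import defaultdict

def build_picl_sequences(corpus, k=4):
    """corpus: (paragraph, task_id) pairs -> concatenated pre-training sequences."""
    by_task = defaultdict(list)
    for paragraph, task_id in corpus:
        by_task[task_id].append(paragraph)
    sequences = []
    for paragraphs in by_task.values():
        # Chunk same-task paragraphs into k-shot-style pre-training instances.
        for i in range(0, len(paragraphs) - k + 1, k):
            sequences.append("\n\n".join(paragraphs[i:i + k]))
    return sequences

corpus = [("Q: 2+2? A: 4", "arith"), ("Q: 3+5? A: 8", "arith"),
          ("Great film! -> positive", "sentiment"), ("Dull. -> negative", "sentiment")]
print(build_picl_sequences(corpus, k=2))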
arXiv Detail & Related papers (2023-05-16T03:38:06Z)
- Structured Prompting: Scaling In-Context Learning to 1,000 Examples
We introduce structured prompting that breaks the length limit and scales in-context learning to thousands of examples.
Specifically, demonstration examples are separately encoded with well-designed position embeddings, and then they are jointly attended by the test example using a rescaled attention mechanism.
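A toy sketch of that attention pattern, assuming a single query vector and pre-computed keys/values per demonstration group (the test example's own self-attention is omitted). The paper's exact rescaling differs; this only shows the shape of the computation, where each group contributes a bounded share of the attention mass instead of many groups swamping one softmax.

import numpy as np

def rescaled_attention(q, group_kvs, scale):
    """q: (d,) test-token query; group_kvs: list of (K_i, V_i) per demo group."""
    n = len(group_kvs)
    m = max((k @ q / scale).max() for k, _ in group_kvs)  # numerical stability
    mass, out = 0.0, 0.0
    for k, v in group_kvs:
        e = np.exp(k @ q / scale - m)
        mass += e.sum() / n    # each group counts for 1/n of the softmax mass
        out = out + (e / n) @ v
    return out / mass

d = 8
rng = np.random.default_rng(0)
groups = [(rng.normal(size=(5, d)), rng.normal(size=(5, d))) for _ in range(3)]
print(rescaled_attention(rng.normal(size=d), groups, scale=np.sqrt(d)))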
arXiv Detail & Related papers (2022-12-13T16:31:21Z)
- Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation
We propose a novel conditional neural process-based approach for few-shot text classification.
Our key idea is to represent each task using gradient information from a base model.
Our approach outperforms traditional fine-tuning, sequential transfer learning, and state-of-the-art meta-learning approaches.
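A hedged sketch of the core idea: derive a task vector from the base model's gradients on a support set. The toy model, loss, and per-tensor gradient-norm pooling are illustrative choices, not Grad2Task's actual architecture.

import torch
import torch.nn as nn

def task_embedding(model, loss_fn, support_x, support_y):
    """Pool per-tensor gradient norms into a compact task representation."""
    model.zero_grad()
    loss = loss_fn(model(support_x), support_y)
    loss.backward()
    feats = [p.grad.norm() for p in model.parameters() if p.grad is not None]
    return torch.stack(feats)  # task vector that can condition adaptation

# Toy usage with a linear probe standing in for the base model.
model = nn.Linear(16, 3)
x, y = torch.randn(8, 16), torch.randint(0, 3, (8,))
print(task_embedding(model, nn.CrossEntropyLoss(), x, y))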
arXiv Detail & Related papers (2022-01-27T15:29:30Z)
- Prompt-Learning for Fine-Grained Entity Typing
We investigate the application of prompt-learning on fine-grained entity typing in fully supervised, few-shot and zero-shot scenarios.
We propose a self-supervised strategy that carries out distribution-level optimization in prompt-learning to automatically summarize the information of entity types.
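To make the setup concrete, a cloze-style prompt with a verbalizer might look like the sketch below; the template text and label words are assumptions, not the paper's exact choices.

TEMPLATE = "{sentence} In this sentence, {entity} is a {mask}."
VERBALIZER = {  # maps each entity type to label words scored at the mask
    "person": ["person", "man", "woman"],
    "location": ["place", "city", "country"],
    "organization": ["company", "group", "team"],
}

def fill_template(sentence, entity, mask_token="[MASK]"):
    return TEMPLATE.format(sentence=sentence, entity=entity, mask=mask_token)

prompt = fill_template("Steve Jobs co-founded Apple in 1976.", "Apple")
print(prompt)
# Score the PLM's mask distribution against each type's label words and
# predict the type whose words receive the most probability mass.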
arXiv Detail & Related papers (2021-08-24T09:39:35Z)
- Reordering Examples Helps during Priming-based Few-Shot Learning
We show that PERO can learn to generalize efficiently using as few as 10 examples.
We demonstrate the effectiveness of the proposed method on the tasks of sentiment classification, natural language inference and fact retrieval.
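A brute-force sketch of the reordering idea: score each permutation of the priming examples on a small dev set and keep the best. PERO searches with a genetic algorithm rather than enumerating, and accuracy_of here is a randomized stand-in for a real evaluation.

from itertools import permutations
import random

def accuracy_of(ordering, dev_set):
    # Stub: build a prompt from the ordering, run the PLM on dev_set, and
    # return accuracy. Randomized so the sketch runs standalone.
    return random.random()

def best_ordering(examples, dev_set):
    return max(permutations(examples), key=lambda o: accuracy_of(o, dev_set))

examples = [("great movie", "positive"), ("boring plot", "negative"),
            ("loved it", "positive")]
print(best_ordering(examples, dev_set=[]))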
arXiv Detail & Related papers (2021-06-03T11:02:36Z)
- KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation
We propose a knowledge-grounded pre-training (KGPT) to generate knowledge-enriched text.
We adopt three settings, namely fully-supervised, zero-shot, and few-shot, to evaluate its effectiveness.
Under the zero-shot setting, our model achieves over 30 ROUGE-L on WebNLG while all other baselines fail.
arXiv Detail & Related papers (2020-10-05T19:59:05Z)
- Language Models are Few-Shot Learners
We show that scaling up language models greatly improves task-agnostic, few-shot performance.
We train GPT-3, an autoregressive language model with 175 billion parameters, and test its performance in the few-shot setting.
GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks.
arXiv Detail & Related papers (2020-05-28T17:29:03Z)