Paradigm Shift in Natural Language Processing
- URL: http://arxiv.org/abs/2109.12575v1
- Date: Sun, 26 Sep 2021 11:55:23 GMT
- Title: Paradigm Shift in Natural Language Processing
- Authors: Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang
- Abstract summary: In the era of deep learning, modeling for most NLP tasks has converged to several mainstream paradigms.
Recent years have seen a rising trend of paradigm shift: solving one NLP task by reformulating it as another.
Some of these paradigms have shown great potential to unify a large number of NLP tasks, making it possible to build a single model to handle diverse tasks.
- Score: 66.62609175829816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the era of deep learning, modeling for most NLP tasks has converged to several mainstream paradigms. For example, we usually adopt the sequence labeling paradigm to solve a bundle of tasks such as POS tagging, NER, and chunking, and adopt the classification paradigm to solve tasks like sentiment analysis. With the rapid progress of pre-trained language models, recent years have seen a rising trend of paradigm shift, which means solving one NLP task by reformulating it as another one. Paradigm shift has achieved great success on many tasks, becoming a promising way to improve model performance. Moreover, some of these paradigms have shown great potential to unify a large number of NLP tasks, making it possible to build a single model to handle diverse tasks. In this paper, we review this phenomenon of paradigm shift in recent years, highlighting several paradigms that have the potential to solve different NLP tasks.
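To make the notion of paradigm shift concrete, here is a minimal, hypothetical sketch (not taken from the paper): the same sentiment-analysis input handled under the classification paradigm and then reformulated as masked language modeling with a prompt and a verbalizer. The Hugging Face `transformers` pipelines used here exist, but the specific models, prompt wording, and verbalizer are illustrative assumptions.

```python
# Minimal sketch: one task, two paradigms. Assumes `pip install transformers`.
from transformers import pipeline

text = "The movie was a complete waste of two hours."

# Classification paradigm: a sequence-level classifier maps text -> label.
clf = pipeline("sentiment-analysis")  # library's default English sentiment model
print(clf(text))  # e.g. [{'label': 'NEGATIVE', 'score': ...}]

# (M)LM paradigm: wrap the input in a cloze-style prompt and let a masked
# language model fill the blank; a verbalizer maps filled words back to labels.
mlm = pipeline("fill-mask", model="bert-base-uncased")
prompt = f"{text} Overall, it was [MASK]."
verbalizer = {"great": "POSITIVE", "terrible": "NEGATIVE"}  # illustrative choice
preds = mlm(prompt, targets=list(verbalizer))  # score only the verbalizer words
best = max(preds, key=lambda p: p["score"])
print(verbalizer[best["token_str"]])
```

The data and the label set are unchanged; only the task formulation differs, with the verbalizer taking the place of a classification head.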
Related papers
- Unified Generative and Discriminative Training for Multi-modal Large Language Models [88.84491005030316]
Generative training has enabled Vision-Language Models (VLMs) to tackle various complex tasks.
Discriminative training, exemplified by models like CLIP, excels in zero-shot image-text classification and retrieval.
This paper proposes a unified approach that integrates the strengths of both paradigms.
arXiv Detail & Related papers (2024-11-01T01:51:31Z)
- Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning [20.727482935029375]
Foundation models have emerged as a powerful tool for many AI problems.
In this paper, we study the theoretical justification of a multitask finetuning approach.
We present results affirming that our task selection algorithm adeptly chooses related finetuning tasks, improving model performance on target tasks.
arXiv Detail & Related papers (2024-02-22T23:29:42Z)
- Meta-training with Demonstration Retrieval for Efficient Few-shot Learning [11.723856248352007]
Large language models show impressive results on few-shot NLP tasks.
However, these models are memory- and computation-intensive.
We propose meta-training with demonstration retrieval.
arXiv Detail & Related papers (2023-06-30T20:16:22Z)
- Deep Graph Reprogramming [112.34663053130073]
"Deep graph reprogramming" is a model reusing task tailored for graph neural networks (GNNs)
We propose an innovative Data Reprogramming paradigm alongside a Model Reprogramming paradigm.
arXiv Detail & Related papers (2023-04-28T02:04:29Z)
- Voting from Nearest Tasks: Meta-Vote Pruning of Pre-trained Models for Downstream Tasks [55.431048995662714]
We create a small model for a new task from the pruned models of similar tasks.
We show that a few fine-tuning steps on this model suffice to produce a promising pruned model for the new task.
We develop a simple but effective "Meta-Vote Pruning (MVP)" method that significantly reduces the pruning iterations for a new task.
arXiv Detail & Related papers (2023-01-27T06:49:47Z)
- Multi-task Active Learning for Pre-trained Transformer-based Models [22.228551277598804]
Multi-task learning, in which several tasks are jointly learned by a single model, allows NLP models to share information from multiple annotations.
This technique requires annotating the same text with multiple annotation schemes, which may be costly and laborious.
Active learning (AL) has been demonstrated to optimize annotation processes by iteratively selecting unlabeled examples.
arXiv Detail & Related papers (2022-08-10T14:54:13Z)
- SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities [76.97949110580703]
We introduce SUPERB-SG, a new benchmark to evaluate pre-trained models across various speech tasks.
We use a lightweight methodology to test the robustness of representations learned by pre-trained models under shifts in data domain.
We also show that the task diversity of SUPERB-SG coupled with limited task supervision is an effective recipe for evaluating the generalizability of model representation.
arXiv Detail & Related papers (2022-03-14T04:26:40Z)
- On Steering Multi-Annotations per Sample for Multi-Task Learning [79.98259057711044]
The study of multi-task learning has drawn great attention from the community.
Despite the remarkable progress, the challenge of optimally learning different tasks simultaneously remains to be explored.
Previous works attempt to modify the gradients from different tasks, yet these methods rely on a subjective assumption about the relationship between tasks, and the modified gradients may be less accurate.
In this paper, we introduce Task Allocation (STA), a mechanism that addresses this issue by randomly allocating each sample a subset of tasks (see the sketch after this list).
For further progress, we propose Interleaved Task Allocation (ISTA) to iteratively allocate all
arXiv Detail & Related papers (2022-03-06T11:57:18Z)
- Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation [24.488427641442694]
We propose a novel conditional neural process-based approach for few-shot text classification.
Our key idea is to represent each task using gradient information from a base model.
Our approach outperforms traditional fine-tuning, sequential transfer learning, and state-of-the-art meta learning approaches.
arXiv Detail & Related papers (2022-01-27T15:29:30Z)
- Lifelong Learning of Few-shot Learners across NLP Tasks [45.273018249235705]
We study the challenge of lifelong few-shot learning over a sequence of diverse NLP tasks.
We propose a continual meta-learning approach which learns to generate adapter weights from a few examples.
We demonstrate that our approach preserves model performance on earlier training tasks and leads to positive knowledge transfer when future tasks are learned.
arXiv Detail & Related papers (2021-04-18T10:41:56Z)
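The stochastic task allocation idea summarized in the "On Steering Multi-Annotations per Sample" entry above can be made concrete with a minimal, hypothetical sketch: in each training step, every sample contributes the loss of only a randomly allocated subset of its task annotations. The function and parameter names below (including the subset size `k`) are illustrative assumptions, not the authors' implementation.

```python
import random
import torch

def sta_step_loss(per_task_losses, k=1):
    """per_task_losses: list of 1-D tensors, one per task, each holding the
    per-sample loss of every example in the batch for that task."""
    num_tasks = len(per_task_losses)
    batch_size = per_task_losses[0].shape[0]
    total = per_task_losses[0].new_zeros(())
    for i in range(batch_size):
        # Randomly allocate k of the tasks to sample i; only those task
        # losses contribute to this sample's share of the training objective.
        allocated = random.sample(range(num_tasks), k)
        total = total + sum(per_task_losses[t][i] for t in allocated)
    return total / batch_size

# Example: 3 tasks, batch of 4 samples, dummy per-sample losses.
losses = [torch.rand(4, requires_grad=True) for _ in range(3)]
sta_step_loss(losses, k=2).backward()
```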
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.