The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
- URL: http://arxiv.org/abs/2104.10640v3
- Date: Sat, 24 Apr 2021 17:31:46 GMT
- Title: The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
- Authors: Sushant Singh and Ausif Mahmood
- Abstract summary: Natural Language Processing models have achieved phenomenal success in linguistic and semantic tasks.
Recent NLP architectures have utilized concepts of transfer learning, pruning, quantization, and knowledge distillation to achieve moderate model sizes.
Knowledge Retrievers have been built to extract explicit data documents from a large corpus of databases with greater efficiency and accuracy.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In recent years, Natural Language Processing (NLP) models have achieved
phenomenal success in linguistic and semantic tasks like text classification,
machine translation, cognitive dialogue systems, information retrieval via
Natural Language Understanding (NLU), and Natural Language Generation (NLG).
This feat is primarily attributed to the seminal Transformer architecture,
leading to designs such as BERT, GPT (I, II, III), etc. Although these
large-size models have achieved unprecedented performances, they come at high
computational costs. Consequently, some of the recent NLP architectures have
utilized concepts of transfer learning, pruning, quantization, and knowledge
distillation to achieve moderate model sizes while retaining performance
comparable to their predecessors. Additionally, to mitigate the
data size challenge raised by language models from a knowledge extraction
perspective, Knowledge Retrievers have been built to extract explicit data
documents from a large corpus of databases with greater efficiency and
accuracy. Recent research has also focused on superior inference by providing
efficient attention to longer input sequences. In this paper, we summarize and
examine the current state-of-the-art (SOTA) NLP models that have been employed
for numerous NLP tasks for optimal performance and efficiency. We provide a
detailed understanding and functioning of the different architectures, a
taxonomy of NLP designs, comparative evaluations, and future directions in NLP.
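
As one concrete illustration of the knowledge distillation recipe the abstract mentions, below is a minimal PyTorch sketch of the standard distillation loss, which blends soft teacher targets with hard gold labels (per Hinton et al., 2015). The function name, temperature, and alpha values here are illustrative assumptions, not settings taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: the teacher's class distribution at temperature T.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between student and teacher distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    soft_loss = F.kl_div(soft_student, soft_targets,
                         reduction="batchmean") * temperature ** 2
    # Hard loss: ordinary cross-entropy against the gold labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

# Toy usage: a batch of 4 examples over 10 classes.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```

Compact students such as DistilBERT are trained against a blended objective of this kind, which is how they keep most of the teacher's accuracy at a fraction of its size.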
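
On the abstract's point about efficient attention over longer input sequences, one common recipe is to restrict each token to a local window of neighbors, dropping the quadratic cost of full self-attention to linear in sequence length. The sketch below is a simplified, assumed illustration of that idea (single head, no batching), not any specific model's implementation; for clarity it still materializes the full score matrix, whereas production kernels compute only the banded entries.

```python
import torch
import torch.nn.functional as F

def local_attention(q, k, v, window=4):
    # Each query attends only to keys within +/- `window` positions.
    seq_len, dim = q.shape
    scores = q @ k.T / dim ** 0.5                  # (seq_len, seq_len)
    # Band mask: True where |i - j| <= window.
    idx = torch.arange(seq_len)
    band = (idx[None, :] - idx[:, None]).abs() <= window
    scores = scores.masked_fill(~band, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

# Toy usage: 16 tokens with hidden size 8, each attending 2 steps away.
x = torch.randn(16, 8)
out = local_attention(x, x, x, window=2)
print(out.shape)  # torch.Size([16, 8])
```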