A Survey on Model Compression for Natural Language Processing
- URL: http://arxiv.org/abs/2202.07105v1
- Date: Tue, 15 Feb 2022 00:18:47 GMT
- Title: A Survey on Model Compression for Natural Language Processing
- Authors: Canwen Xu and Julian McAuley
- Abstract summary: The high energy cost and long inference delay of Transformer models are preventing NLP from entering broader scenarios including edge and mobile computing.
Efficient NLP research aims to comprehensively consider computation, time and carbon emission for the entire life-cycle of NLP.
- Score: 13.949219077548687
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With recent developments in new architectures like Transformer and
pretraining techniques, significant progress has been made in applications of
natural language processing (NLP). However, the high energy cost and long
inference delay of Transformer models are preventing NLP from entering broader
scenarios including edge and mobile computing. Efficient NLP research aims to
comprehensively consider computation, time and carbon emission for the entire
life-cycle of NLP, including data preparation, model training and inference. In
this survey, we focus on the inference stage and review the current state of
model compression for NLP, including the benchmarks, metrics and methodology.
We outline the current obstacles and future research directions.
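Model compression in this setting typically covers techniques such as quantization, pruning, and knowledge distillation for cheaper inference. As a concrete but hypothetical illustration (not code from the paper), the sketch below applies PyTorch's post-training dynamic quantization to a Hugging Face sequence classifier; the checkpoint name is only an example and any Transformer classifier works the same way.

```python
# Minimal sketch of post-training dynamic quantization for a Transformer.
# Assumption: the checkpoint name below is only an example.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "distilbert-base-uncased-finetuned-sst-2-english"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Replace every nn.Linear with a dynamically quantized int8 version.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

inputs = tokenizer("Model compression makes inference cheaper.", return_tensors="pt")
with torch.no_grad():
    logits = quantized(**inputs).logits
print(logits)
```

Dynamic quantization converts the weights of the selected module types (here, all linear layers) to int8 ahead of time and quantizes activations on the fly, which usually reduces model size and CPU inference latency at a small accuracy cost.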
Related papers
- Natural Language Processing Methods for the Study of Protein-Ligand Interactions [8.165512093198934]
Recent advances in Natural Language Processing have ignited interest in developing effective methods for predicting protein-ligand interactions.
In this review, we explain where and how such approaches have been applied in the recent literature and discuss useful mechanisms such as short-term memory, transformers, and attention.
We conclude with a discussion of the current limitations of NLP methods for the study of PLIs as well as key challenges that need to be addressed in future work.
arXiv Detail & Related papers (2024-09-19T19:14:50Z)
- A Survey on Transformers in NLP with Focus on Efficiency [2.7651063843287718]
This paper presents a commentary on the evolution of NLP and its applications with emphasis on their accuracy as well as efficiency.
The goal of this survey is to determine how current NLP techniques contribute towards a sustainable society.
arXiv Detail & Related papers (2024-05-15T10:32:41Z)
- Refining Joint Text and Source Code Embeddings for Retrieval Task with Parameter-Efficient Fine-Tuning [0.0]
We propose a fine-tuning framework that leverages Parameter-Efficient Fine-Tuning (PEFT) techniques.
We demonstrate that the proposed fine-tuning framework has the potential to improve code-text retrieval performance by tuning at most 0.4% of the parameters (a toy sketch of parameter-efficient tuning appears after this list).
arXiv Detail & Related papers (2024-05-07T08:50:25Z)
- Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models [79.34513906324727]
In this paper, we aim at parameter and computation efficient transfer learning (PCETL) for vision-language pre-trained models.
We propose a novel dynamic architecture skipping (DAS) approach towards effective PCETL.
arXiv Detail & Related papers (2023-09-04T09:34:33Z)
- Latent Bottlenecked Attentive Neural Processes [71.18817592128207]
We present Latent Bottlenecked Attentive Neural Processes (LBANPs).
LBANPs have a querying computational complexity independent of the number of context datapoints.
We show LBANPs achieve results competitive with the state-of-the-art on meta-regression, image completion, and contextual multi-armed bandits.
arXiv Detail & Related papers (2022-11-15T19:21:41Z)
- A Kernel-Based View of Language Model Fine-Tuning [94.75146965041131]
We investigate whether the Neural Tangent Kernel (NTK) describes fine-tuning of pre-trained LMs.
We show that formulating the downstream task as a masked word prediction problem through prompting often induces kernel-based dynamics during fine-tuning.
arXiv Detail & Related papers (2022-10-11T17:34:32Z)
- Efficient Methods for Natural Language Processing: A Survey [76.34572727185896]
This survey synthesizes and relates current methods and findings in efficient NLP.
We aim both to provide guidance for conducting NLP under limited resources and to point towards promising research directions for developing more efficient methods.
arXiv Detail & Related papers (2022-08-31T20:32:35Z)
- Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey [67.82942975834924]
Large, pre-trained language models such as BERT have drastically changed the Natural Language Processing (NLP) field.
We present a survey of recent work that uses these large language models to solve NLP tasks via pre-training then fine-tuning, prompting, or text generation approaches.
arXiv Detail & Related papers (2021-11-01T20:08:05Z)
- An Empirical Survey of Data Augmentation for Limited Data Learning in NLP [88.65488361532158]
The dependence on abundant data prevents NLP models from being applied to low-resource settings or novel tasks.
Data augmentation methods have been explored as a means of improving data efficiency in NLP.
We provide an empirical survey of recent progress on data augmentation for NLP in the limited labeled data setting.
arXiv Detail & Related papers (2021-06-14T15:27:22Z)
- The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures [0.0]
Natural Language Processing models have achieved phenomenal success in linguistic and semantic tasks.
Recent NLP architectures have utilized concepts of transfer learning, pruning, quantization, and knowledge distillation to achieve moderate model sizes.
Knowledge Retrievers have been built to extricate explicit data documents from a large corpus of databases with greater efficiency and accuracy.
arXiv Detail & Related papers (2021-03-23T22:38:20Z)
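Several related papers above reduce training cost by updating only a small fraction of a model's parameters (e.g., the PEFT-based retrieval work). The snippet below is a minimal, hand-rolled sketch of a LoRA-style low-rank adapter under assumed layer sizes and rank; it illustrates the general idea only and is not the implementation from any paper listed here.

```python
# Toy LoRA-style adapter: freeze a pretrained linear layer and train only a
# low-rank update. Layer sizes and rank are arbitrary illustrative choices.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update W + (alpha/r) * B A."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weight and bias
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        # Frozen base projection plus the scaled low-rank correction.
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

# Example: adapt a single 768x768 projection and count trainable parameters.
layer = LoRALinear(nn.Linear(768, 768), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable} / {total} ({100 * trainable / total:.2f}%)")
```

Because the pretrained weight is frozen and only the two low-rank factors are trained, the trainable-parameter count stays at a small percentage of the total, which is what makes such fine-tuning parameter-efficient.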