EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language
Processing
- URL: http://arxiv.org/abs/2205.00258v1
- Date: Sat, 30 Apr 2022 13:03:53 GMT
- Title: EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language
Processing
- Authors: Chengyu Wang, Minghui Qiu, Taolin Zhang, Tingting Liu, Lei Li, Jianing
Wang, Ming Wang, Jun Huang, Wei Lin
- Abstract summary: EasyNLP is designed to make it easy to build NLP applications.
It features knowledge-enhanced pre-training, knowledge distillation and few-shot learning.
EasyNLP has powered over ten business units within Alibaba Group.
- Score: 38.9428437204642
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The success of Pre-Trained Models (PTMs) has reshaped the development of
Natural Language Processing (NLP). Yet, it is not easy to obtain
high-performing models and deploy them online for industrial practitioners. To
bridge this gap, EasyNLP is designed to make it easy to build NLP applications,
which supports a comprehensive suite of NLP algorithms. It further features
knowledge-enhanced pre-training, knowledge distillation and few-shot learning
functionalities for large-scale PTMs, and provides a unified framework of model
training, inference and deployment for real-world applications. Currently,
EasyNLP has powered over ten business units within Alibaba Group and is
seamlessly integrated to the Platform of AI (PAI) products on Alibaba Cloud.
The source code of our EasyNLP toolkit is released at GitHub
(https://github.com/alibaba/EasyNLP).
Related papers
- CMULAB: An Open-Source Framework for Training and Deployment of Natural Language Processing Models [59.91221728187576]
This paper introduces the CMU Linguistic Linguistic Backend, an open-source framework that simplifies model deployment and continuous human-in-the-loop fine-tuning of NLP models.
CMULAB enables users to leverage the power of multilingual models to quickly adapt and extend existing tools for speech recognition, OCR, translation, and syntactic analysis to new languages.
arXiv Detail & Related papers (2024-04-03T02:21:46Z) - VNLP: Turkish NLP Package [0.0]
VNLP is a state-of-the-art Natural Language Processing (NLP) package for the Turkish language.
It contains a wide variety of tools, ranging from the simplest tasks, such as sentence splitting and text normalization, to the more advanced ones, such as text and token classification models.
VNLP has an open-source GitHub repository, ReadtheDocs documentation, PyPi package for convenient installation, Python and command-line API.
arXiv Detail & Related papers (2024-03-02T20:46:56Z) - HugNLP: A Unified and Comprehensive Library for Natural Language
Processing [14.305751154503133]
We introduce HugNLP, a library for natural language processing (NLP) with the prevalent backend of HuggingFace Transformers.
HugNLP consists of a hierarchical structure including models, processors and applications that unifies the learning process of pre-trained language models (PLMs) on different NLP tasks.
arXiv Detail & Related papers (2023-02-28T03:38:26Z) - Number Entity Recognition [65.80137628972312]
Numbers are essential components of text, like any other word tokens, from which natural language processing (NLP) models are built and deployed.
In this work, we attempt to tap this potential of state-of-the-art NLP models and transfer their ability to boost performance in related tasks.
Our proposed classification of numbers into entities helps NLP models perform well on several tasks, including a handcrafted Fill-In-The-Blank (FITB) task and on question answering using joint embeddings.
arXiv Detail & Related papers (2022-05-07T05:22:43Z) - STAMP 4 NLP -- An Agile Framework for Rapid Quality-Driven NLP
Applications Development [3.86574270083089]
We introduce STAMP 4 NLP as an iterative and incremental process model for developing NLP applications.
With STAMP 4 NLP, we merge software engineering principles with best practices from data science.
Due to our iterative-incremental approach, businesses can deploy an enhanced version of the prototype to their software environment after every iteration.
arXiv Detail & Related papers (2021-11-16T12:20:47Z) - FedNLP: A Research Platform for Federated Learning in Natural Language
Processing [55.01246123092445]
We present the FedNLP, a research platform for federated learning in NLP.
FedNLP supports various popular task formulations in NLP such as text classification, sequence tagging, question answering, seq2seq generation, and language modeling.
Preliminary experiments with FedNLP reveal that there exists a large performance gap between learning on decentralized and centralized datasets.
arXiv Detail & Related papers (2021-04-18T11:04:49Z) - A Data-Centric Framework for Composable NLP Workflows [109.51144493023533]
Empirical natural language processing systems in application domains (e.g., healthcare, finance, education) involve interoperation among multiple components.
We establish a unified open-source framework to support fast development of such sophisticated NLP in a composable manner.
arXiv Detail & Related papers (2021-03-02T16:19:44Z) - EasyTransfer -- A Simple and Scalable Deep Transfer Learning Platform
for NLP Applications [65.87067607849757]
EasyTransfer is a platform to develop deep Transfer Learning algorithms for Natural Language Processing (NLP) applications.
EasyTransfer supports various NLP models in the ModelZoo, including mainstream PLMs and multi-modality models.
EasyTransfer is currently deployed at Alibaba to support a variety of business scenarios.
arXiv Detail & Related papers (2020-11-18T18:41:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.