HugNLP: A Unified and Comprehensive Library for Natural Language
Processing
- URL: http://arxiv.org/abs/2302.14286v1
- Date: Tue, 28 Feb 2023 03:38:26 GMT
- Authors: Jianing Wang, Nuo Chen, Qiushi Sun, Wenkang Huang, Chengyu Wang, Ming
Gao
- Abstract summary: We introduce HugNLP, a library for natural language processing (NLP) with the prevalent backend of HuggingFace Transformers.
HugNLP consists of a hierarchical structure including models, processors and applications that unifies the learning process of pre-trained language models (PLMs) on different NLP tasks.
- Score: 14.305751154503133
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we introduce HugNLP, a unified and comprehensive library for
natural language processing (NLP) with the prevalent backend of HuggingFace
Transformers, which is designed for NLP researchers to easily utilize
off-the-shelf algorithms and develop novel methods with user-defined models and
tasks in real-world scenarios. HugNLP consists of a hierarchical structure
including models, processors and applications that unifies the learning process
of pre-trained language models (PLMs) on different NLP tasks. Additionally, we
present several featured NLP applications that demonstrate the effectiveness of HugNLP,
such as knowledge-enhanced PLMs, universal information extraction, low-resource
mining, and code understanding and generation. The source code will be
released on GitHub (https://github.com/wjn1996/HugNLP).
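The hierarchical models-processors-applications design can be illustrated with a minimal, library-free sketch. All class and method names below are hypothetical stand-ins for the idea described in the abstract, not HugNLP's actual API:

```python
# Illustrative sketch of a models / processors / applications hierarchy,
# loosely following the paper's description. All names are hypothetical.

class Processor:
    """Turns raw text into model-ready features (here: naive tokenization)."""
    def process(self, text: str) -> list[str]:
        return text.lower().split()

class Model:
    """Stands in for a pre-trained language model backbone."""
    def predict(self, features: list[str]) -> str:
        # Toy rule in place of real PLM inference.
        return "positive" if "good" in features else "negative"

class Application:
    """Wires a processor and a model into one task-specific pipeline."""
    def __init__(self, processor: Processor, model: Model):
        self.processor = processor
        self.model = model

    def run(self, text: str) -> str:
        return self.model.predict(self.processor.process(text))

app = Application(Processor(), Model())
print(app.run("This library is good"))  # -> positive
```

The point of the layering is that each component can be swapped independently: a new task reuses the same processor and model abstractions and only adds an application on top.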
Related papers
- CMULAB: An Open-Source Framework for Training and Deployment of Natural Language Processing Models [59.91221728187576]
This paper introduces the CMU Linguistic Annotation Backend, an open-source framework that simplifies model deployment and continuous human-in-the-loop fine-tuning of NLP models.
CMULAB enables users to leverage the power of multilingual models to quickly adapt and extend existing tools for speech recognition, OCR, translation, and syntactic analysis to new languages.
arXiv Detail & Related papers (2024-04-03T02:21:46Z)
- VNLP: Turkish NLP Package [0.0]
VNLP is a state-of-the-art Natural Language Processing (NLP) package for the Turkish language.
It contains a wide variety of tools, ranging from the simplest tasks, such as sentence splitting and text normalization, to the more advanced ones, such as text and token classification models.
VNLP has an open-source GitHub repository, ReadtheDocs documentation, PyPi package for convenient installation, Python and command-line API.
arXiv Detail & Related papers (2024-03-02T20:46:56Z)
- XNLP: An Interactive Demonstration System for Universal Structured NLP [90.42606755782786]
We propose an advanced XNLP demonstration platform that leverages LLMs to achieve universal XNLP, with one highly generalizable model for all tasks.
Overall, our system advances on multiple fronts, including universal XNLP modeling, high performance, interpretability, scalability, and interactivity, providing a unified platform for exploring diverse XNLP tasks in the community.
arXiv Detail & Related papers (2023-08-03T16:13:05Z)
- Meta Learning for Natural Language Processing: A Survey [88.58260839196019]
Deep learning has been the mainstream technique in the natural language processing (NLP) area.
Deep learning requires large amounts of labeled data and generalizes poorly across domains.
Meta-learning is an emerging field in machine learning that studies approaches to learning better learning algorithms.
arXiv Detail & Related papers (2022-05-03T13:58:38Z)
- LaoPLM: Pre-trained Language Models for Lao [3.2146309563776416]
Pre-trained language models (PLMs) can capture different levels of concepts in context and hence generate universal language representations.
Although PLMs have been widely used in most NLP applications, they are under-represented in Lao NLP research.
We construct a text classification dataset to alleviate the resource-scarce situation of the Lao language.
We present the first transformer-based PLMs for Lao in four versions: BERT-small, BERT-base, ELECTRA-small and ELECTRA-base, and evaluate them on two downstream tasks: part-of-speech tagging and text classification.
arXiv Detail & Related papers (2021-10-12T11:13:07Z)
- FedNLP: A Research Platform for Federated Learning in Natural Language Processing [55.01246123092445]
We present FedNLP, a research platform for federated learning in NLP.
FedNLP supports various popular task formulations in NLP such as text classification, sequence tagging, question answering, seq2seq generation, and language modeling.
Preliminary experiments with FedNLP reveal that there exists a large performance gap between learning on decentralized and centralized datasets.
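The decentralized-versus-centralized comparison rests on federated aggregation: each client trains locally and only model updates are combined on a server. A minimal, library-free sketch of the federated averaging (FedAvg) aggregation step, with made-up client weights and data sizes:

```python
# Minimal FedAvg aggregation: average client model weights,
# weighted by each client's number of training examples.

def fedavg(client_weights: list[list[float]], client_sizes: list[int]) -> list[float]:
    total = sum(client_sizes)
    dim = len(client_weights[0])
    avg = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            avg[i] += w * size / total
    return avg

# Three hypothetical clients with different amounts of local data:
global_weights = fedavg(
    client_weights=[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    client_sizes=[10, 10, 20],
)
print(global_weights)  # -> [3.5, 4.5]
```

Because no client ever sees the pooled data, the aggregated model can lag a model trained centrally on the union of all data, which is the gap the FedNLP experiments quantify.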
arXiv Detail & Related papers (2021-04-18T11:04:49Z)
- A Data-Centric Framework for Composable NLP Workflows [109.51144493023533]
Empirical natural language processing systems in application domains (e.g., healthcare, finance, education) involve interoperation among multiple components.
We establish a unified open-source framework to support the fast development of such sophisticated NLP systems in a composable manner.
arXiv Detail & Related papers (2021-03-02T16:19:44Z)
- Low-Resource Adaptation of Neural NLP Models [0.30458514384586405]
This thesis investigates methods for dealing with low-resource scenarios in information extraction and natural language understanding.
We develop and adapt neural NLP models to explore a number of research questions concerning NLP tasks with minimal or no training data.
arXiv Detail & Related papers (2020-11-09T12:13:55Z)
- MPLP: Learning a Message Passing Learning Protocol [63.948465205530916]
We present a novel method for learning the weights of an artificial neural network: a Message Passing Learning Protocol (MPLP).
We abstract every operation occurring in ANNs as an independent agent.
Each agent is responsible for ingesting incoming multidimensional messages from other agents, updating its internal state, and generating multidimensional messages to be passed on to neighbouring agents.
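The agent abstraction can be sketched as follows. This is a toy illustration of the message-passing idea described in the summary, not the paper's actual protocol, and all names and update rules are made up:

```python
# Toy message-passing agents: each agent keeps an internal state,
# ingests incoming messages, and emits a message for its neighbours.

class Agent:
    def __init__(self, name: str):
        self.name = name
        self.state = 0.0

    def step(self, inbox: list[float]) -> float:
        # Update internal state from incoming messages...
        self.state += sum(inbox)
        # ...and emit an outgoing message derived from that state.
        return self.state / 2.0

a, b = Agent("a"), Agent("b")
msg_a = a.step([1.0, 3.0])  # a ingests 1.0 and 3.0 -> state 4.0, emits 2.0
msg_b = b.step([msg_a])     # b ingests a's message -> state 2.0, emits 1.0
print(msg_a, msg_b)  # -> 2.0 1.0
```

In the actual method, each agent's update rule is itself learned, so the network's weight updates emerge from the agents' local interactions rather than a hand-designed optimizer.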
arXiv Detail & Related papers (2020-07-02T09:03:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.