HugNLP: A Unified and Comprehensive Library for Natural Language
Processing
- URL: http://arxiv.org/abs/2302.14286v1
- Date: Tue, 28 Feb 2023 03:38:26 GMT
- Authors: Jianing Wang, Nuo Chen, Qiushi Sun, Wenkang Huang, Chengyu Wang, Ming
Gao
- Abstract summary: We introduce HugNLP, a library for natural language processing (NLP) with the prevalent backend of HuggingFace Transformers.
HugNLP consists of a hierarchical structure including models, processors and applications that unifies the learning process of pre-trained language models (PLMs) on different NLP tasks.
- Score: 14.305751154503133
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we introduce HugNLP, a unified and comprehensive library for
natural language processing (NLP) with the prevalent backend of HuggingFace
Transformers, which is designed for NLP researchers to easily utilize
off-the-shelf algorithms and develop novel methods with user-defined models and
tasks in real-world scenarios. HugNLP consists of a hierarchical structure
including models, processors and applications that unifies the learning process
of pre-trained language models (PLMs) on different NLP tasks. Additionally, we
present several featured NLP applications that demonstrate the effectiveness of HugNLP,
such as knowledge-enhanced PLMs, universal information extraction, low-resource
mining, and code understanding and generation. The source code will be
released on GitHub (https://github.com/wjn1996/HugNLP).
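The hierarchical models-processors-applications design can be illustrated with a minimal, library-free sketch. All class and method names below are hypothetical stand-ins for the idea described in the abstract, not HugNLP's actual API:

```python
# Illustrative sketch of a models / processors / applications hierarchy,
# loosely following the paper's description. All names are hypothetical.

class Processor:
    """Turns raw text into model-ready features (here: naive tokenization)."""
    def process(self, text: str) -> list[str]:
        return text.lower().split()

class Model:
    """Stands in for a pre-trained language model backbone."""
    def predict(self, features: list[str]) -> str:
        # Toy rule in place of real PLM inference.
        return "positive" if "good" in features else "negative"

class Application:
    """Wires a processor and a model into one task-specific pipeline."""
    def __init__(self, processor: Processor, model: Model):
        self.processor = processor
        self.model = model

    def run(self, text: str) -> str:
        return self.model.predict(self.processor.process(text))

app = Application(Processor(), Model())
print(app.run("This library is good"))  # -> positive
```

The point of the layering is that each component can be swapped independently: a new task reuses the same processor and model abstractions and only adds an application on top.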
Related papers
- CMULAB: An Open-Source Framework for Training and Deployment of Natural Language Processing Models [59.91221728187576]
This paper introduces the CMU Linguistic Annotation Backend, an open-source framework that simplifies model deployment and continuous human-in-the-loop fine-tuning of NLP models.
CMULAB enables users to leverage the power of multilingual models to quickly adapt and extend existing tools for speech recognition, OCR, translation, and syntactic analysis to new languages.
arXiv Detail & Related papers (2024-04-03T02:21:46Z)
- VNLP: Turkish NLP Package [0.0]
VNLP is a state-of-the-art Natural Language Processing (NLP) package for the Turkish language.
It contains a wide variety of tools, ranging from the simplest tasks, such as sentence splitting and text normalization, to the more advanced ones, such as text and token classification models.
VNLP has an open-source GitHub repository, ReadtheDocs documentation, PyPi package for convenient installation, Python and command-line API.
arXiv Detail & Related papers (2024-03-02T20:46:56Z)
- XNLP: An Interactive Demonstration System for Universal Structured NLP [90.42606755782786]
We propose an advanced XNLP demonstration platform that leverages LLMs to achieve universal XNLP, with one highly generalizable model for all tasks.
Overall, our system advances on multiple fronts, including universal XNLP modeling, high performance, interpretability, scalability, and interactivity, providing a unified platform for exploring diverse XNLP tasks in the community.
arXiv Detail & Related papers (2023-08-03T16:13:05Z)
- Meta Learning for Natural Language Processing: A Survey [88.58260839196019]
Deep learning has been the mainstream technique in the natural language processing (NLP) area.
Deep learning requires large amounts of labeled data and generalizes poorly across domains.
Meta-learning is an emerging field in machine learning that studies approaches to learning better learning algorithms.
arXiv Detail & Related papers (2022-05-03T13:58:38Z)
- LaoPLM: Pre-trained Language Models for Lao [3.2146309563776416]
Pre-trained language models (PLMs) can capture different levels of concepts in context and hence generate universal language representations.
Although PLMs have been widely used in most NLP applications, they are under-represented in Lao NLP research.
We construct a text classification dataset to alleviate the resource-scarce situation of the Lao language.
We present the first transformer-based PLMs for Lao in four versions: BERT-small, BERT-base, ELECTRA-small and ELECTRA-base, and evaluate them on two downstream tasks: part-of-speech tagging and text classification.
arXiv Detail & Related papers (2021-10-12T11:13:07Z)
- FedNLP: A Research Platform for Federated Learning in Natural Language Processing [55.01246123092445]
We present FedNLP, a research platform for federated learning in NLP.
FedNLP supports various popular task formulations in NLP such as text classification, sequence tagging, question answering, seq2seq generation, and language modeling.
Preliminary experiments with FedNLP reveal that there exists a large performance gap between learning on decentralized and centralized datasets.
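The decentralized-versus-centralized comparison rests on federated aggregation: each client trains locally and only model updates are combined on a server. A minimal, library-free sketch of the federated averaging (FedAvg) aggregation step, with made-up client weights and data sizes:

```python
# Minimal FedAvg aggregation: average client model weights,
# weighted by each client's number of training examples.

def fedavg(client_weights: list[list[float]], client_sizes: list[int]) -> list[float]:
    total = sum(client_sizes)
    dim = len(client_weights[0])
    avg = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            avg[i] += w * size / total
    return avg

# Three hypothetical clients with different amounts of local data:
global_weights = fedavg(
    client_weights=[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    client_sizes=[10, 10, 20],
)
print(global_weights)  # -> [3.5, 4.5]
```

Because no client ever sees the pooled data, the aggregated model can lag a model trained centrally on the union of all data, which is the gap the FedNLP experiments quantify.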
arXiv Detail & Related papers (2021-04-18T11:04:49Z)
- A Data-Centric Framework for Composable NLP Workflows [109.51144493023533]
Empirical natural language processing systems in application domains (e.g., healthcare, finance, education) involve interoperation among multiple components.
We establish a unified open-source framework to support the fast development of such sophisticated NLP systems in a composable manner.
arXiv Detail & Related papers (2021-03-02T16:19:44Z)
- Low-Resource Adaptation of Neural NLP Models [0.30458514384586405]
This thesis investigates methods for dealing with low-resource scenarios in information extraction and natural language understanding.
We develop and adapt neural NLP models to explore a number of research questions concerning NLP tasks with minimal or no training data.
arXiv Detail & Related papers (2020-11-09T12:13:55Z)
- MPLP: Learning a Message Passing Learning Protocol [63.948465205530916]
We present a novel method for learning the weights of an artificial neural network: a Message Passing Learning Protocol (MPLP).
We abstract every operation occurring in ANNs as an independent agent.
Each agent is responsible for ingesting incoming multidimensional messages from other agents, updating its internal state, and generating multidimensional messages to be passed on to neighbouring agents.
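The agent abstraction can be sketched as follows. This is a toy illustration of the message-passing idea described in the summary, not the paper's actual protocol, and all names and update rules are made up:

```python
# Toy message-passing agents: each agent keeps an internal state,
# ingests incoming messages, and emits a message for its neighbours.

class Agent:
    def __init__(self, name: str):
        self.name = name
        self.state = 0.0

    def step(self, inbox: list[float]) -> float:
        # Update internal state from incoming messages...
        self.state += sum(inbox)
        # ...and emit an outgoing message derived from that state.
        return self.state / 2.0

a, b = Agent("a"), Agent("b")
msg_a = a.step([1.0, 3.0])  # a ingests 1.0 and 3.0 -> state 4.0, emits 2.0
msg_b = b.step([msg_a])     # b ingests a's message -> state 2.0, emits 1.0
print(msg_a, msg_b)  # -> 2.0 1.0
```

In the actual method, each agent's update rule is itself learned, so the network's weight updates emerge from the agents' local interactions rather than a hand-designed optimizer.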
arXiv Detail & Related papers (2020-07-02T09:03:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.