Mitigating Data Scarcity for Large Language Models
- URL: http://arxiv.org/abs/2302.01806v1
- Date: Fri, 3 Feb 2023 15:17:53 GMT
- Title: Mitigating Data Scarcity for Large Language Models
- Authors: Hoang Van
- Abstract summary: In recent years, pretrained neural language models (PNLMs) have taken the field of natural language processing by storm.
Data scarcity is common in specialized domains, such as medicine, and in low-resource languages that are underexplored by AI research.
In this dissertation, we focus on mitigating data scarcity using data augmentation and neural ensemble learning techniques.
- Score: 7.259279261659759
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, pretrained neural language models (PNLMs) have taken the
field of natural language processing by storm, achieving new benchmarks and
state-of-the-art performances. These models often rely heavily on annotated
data, which may not always be available. Data scarcity is common in
specialized domains, such as medicine, and in low-resource languages that are
underexplored by AI research. In this dissertation, we focus on mitigating data
scarcity using data augmentation and neural ensemble learning techniques for
neural language models. In both research directions, we implement neural
network algorithms and evaluate their impact on assisting neural language
models in downstream NLP tasks. Specifically, for data augmentation, we explore
two techniques: 1) creating positive training data by moving an answer span
around its original context and 2) using text simplification techniques to
introduce a variety of writing styles to the original training data. Our
results indicate that these simple and effective solutions improve the
performance of neural language models considerably in low-resource NLP domains
and tasks. For neural ensemble learning, we use a multilabel neural classifier
to select the best prediction outcome from a variety of individual pretrained
neural language models trained for a low-resource medical text simplification
task.
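The span-moving augmentation is described only at a high level above; the following is a minimal sketch of one plausible reading, assuming SQuAD-style QA examples in which the sentence containing the answer is relocated to other positions in its context and the answer's character offset is recomputed. All function and variable names are illustrative, not taken from the dissertation.

```python
# Minimal sketch (assumption, not the dissertation's released code):
# augment a SQuAD-style QA example by relocating the sentence that contains
# the answer span and recomputing the answer's character offset.
from typing import Dict, List


def split_sentences(text: str) -> List[str]:
    # Naive splitter; a real pipeline would use spaCy or NLTK instead.
    return [s.strip() + "." for s in text.split(".") if s.strip()]


def relocate_answer_span(context: str, answer: str, answer_start: int) -> List[Dict]:
    """Create extra positive examples by moving the answer-bearing sentence
    to each alternative position in the context."""
    sents = split_sentences(context)

    # Character offset of each sentence in the original context.
    offsets, cursor = [], 0
    for s in sents:
        start = context.find(s, cursor)
        offsets.append(start)
        cursor = start + len(s)

    # Index of the sentence holding the answer, and the answer's offset inside it.
    ans_idx = max(i for i, off in enumerate(offsets) if off <= answer_start)
    in_sent_offset = answer_start - offsets[ans_idx]
    ans_sent = sents[ans_idx]
    rest = sents[:ans_idx] + sents[ans_idx + 1:]

    augmented = []
    for pos in range(len(rest) + 1):
        if pos == ans_idx:
            continue  # skip the original ordering
        new_sents = rest[:pos] + [ans_sent] + rest[pos:]
        new_context = " ".join(new_sents)
        new_start = new_context.find(ans_sent) + in_sent_offset
        augmented.append(
            {"context": new_context, "answer": answer, "answer_start": new_start}
        )
    return augmented
```

For the ensemble-learning direction, a similarly hedged sketch of a multilabel selector: given an encoding of the input sentence, it scores which of several pretrained simplification models is likely to produce the best output, and the candidate from the highest-scoring model is kept. The architecture and dimensions below are assumptions, not the dissertation's implementation.

```python
# Minimal sketch (assumption): a multilabel classifier that scores each of
# num_models ensembled systems for a given input encoding.
import torch
import torch.nn as nn


class EnsembleSelector(nn.Module):
    def __init__(self, encoder_dim: int, num_models: int):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(encoder_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_models),  # one logit per ensembled model
        )

    def forward(self, sentence_encoding: torch.Tensor) -> torch.Tensor:
        # Sigmoid over independent logits -> multilabel: several models may be
        # judged acceptable; at inference we keep the output of the argmax model.
        return torch.sigmoid(self.head(sentence_encoding))


# Illustrative usage (encoder_dim and num_models are assumed values):
# selector = EnsembleSelector(encoder_dim=768, num_models=4)
# scores = selector(encoding)        # encoding: (batch, 768) from any PNLM
# best = scores.argmax(dim=-1)       # index of the model whose output to keep
```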
Related papers
- In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z)
- Deep Learning Models to Study Sentence Comprehension in the Human Brain [0.1503974529275767]
Recent artificial neural networks that process natural language achieve unprecedented performance in tasks requiring sentence-level understanding.
We review works that compare these artificial language models with human brain activity and we assess the extent to which this approach has improved our understanding of the neural processes involved in natural language comprehension.
arXiv Detail & Related papers (2023-01-16T10:31:25Z)
- Leveraging Graph-based Cross-modal Information Fusion for Neural Sign Language Translation [46.825957917649795]
Sign Language (SL), as the mother tongue of the deaf community, is a special visual language that most hearing people cannot understand.
We propose a novel neural SLT model with multi-modal feature fusion based on the dynamic graph.
We are the first to introduce graph neural networks for fusing multi-modal information into neural sign language translation models.
arXiv Detail & Related papers (2022-11-01T15:26:22Z)
- Neural Language Models are not Born Equal to Fit Brain Data, but Training Helps [75.84770193489639]
We examine the impact of test loss, training corpus and model architecture on the prediction of functional Magnetic Resonance Imaging timecourses of participants listening to an audiobook.
We find that untrained versions of each model already explain a significant amount of signal in the brain by capturing similarity in brain responses across identical words.
We suggest good practices for future studies aiming at explaining the human language system using neural language models.
arXiv Detail & Related papers (2022-07-07T15:37:17Z)
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
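A hedged paraphrase of that mixing step, with notation assumed rather than taken from the paper: the next-token distribution is a convex combination of dependency-conditioned distributions weighted by self-attention scores,

```latex
p(x_t \mid x_{<t}) \;=\; \sum_{j<t} \alpha_{t,j}\, p_{\mathrm{dep}}(x_t \mid x_j),
\qquad \sum_{j<t} \alpha_{t,j} = 1,
```

where the \alpha_{t,j} are self-attention weights over earlier positions and p_{\mathrm{dep}} is the distribution learned with the dependency modeling objective.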
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
- Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable, resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- Low-Resource Language Modelling of South African Languages [6.805575417034369]
We evaluate the performance of open-vocabulary language models on low-resource South African languages.
We evaluate different variants of n-gram models, feedforward neural networks, recurrent neural networks (RNNs) and Transformers on small-scale datasets.
Overall, well-regularized RNNs give the best performance across two isiZulu datasets and one Sepedi dataset.
arXiv Detail & Related papers (2021-04-01T21:27:27Z)
- A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios [30.391291221959545]
Deep neural networks and huge language models are becoming omnipresent in natural language applications.
As they are known for requiring large amounts of training data, there is a growing body of work to improve the performance in low-resource settings.
Motivated by the recent fundamental changes towards neural models and the popular pre-train and fine-tune paradigm, we survey promising approaches for low-resource natural language processing.
arXiv Detail & Related papers (2020-10-23T11:22:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.