Investigating the Effectiveness of Representations Based on Pretrained
Transformer-based Language Models in Active Learning for Labelling Text
Datasets
- URL: http://arxiv.org/abs/2004.13138v1
- Date: Tue, 21 Apr 2020 02:37:44 GMT
- Title: Investigating the Effectiveness of Representations Based on Pretrained
Transformer-based Language Models in Active Learning for Labelling Text
Datasets
- Authors: Jinghui Lu and Brian MacNamee
- Abstract summary: The representation mechanism used to represent text documents when performing active learning has a significant influence on how effective the process will be.
This paper describes a comprehensive evaluation of the effectiveness of representations based on pre-trained neural network models for active learning.
Our experiments show that the limited label information acquired in active learning can not only be used to train a classifier but can also adaptively improve the embeddings generated by BERT-like language models.
- Score: 4.7718339202518685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active learning has been shown to be an effective way to alleviate some of
the effort required in utilising large collections of unlabelled data for
machine learning tasks without needing to fully label them. The representation
mechanism used to represent text documents when performing active learning,
however, has a significant influence on how effective the process will be.
While simple vector representations such as bag-of-words and embedding-based
representations based on techniques such as word2vec have been shown to be an
effective way to represent documents during active learning, the emergence of
representation mechanisms based on the pre-trained transformer-based neural
network models popular in natural language processing research (e.g. BERT)
offers a promising, and as yet not fully explored, alternative. This paper
describes a comprehensive evaluation of the effectiveness of representations
based on pre-trained transformer-based language models for active learning.
This evaluation shows that transformer-based models, especially BERT-like
models that have not yet been widely used in active learning, achieve a
significant improvement over more commonly used vector representations such
as bag-of-words and classical word embeddings such as word2vec. This paper
also investigates the effectiveness of representations based on variants of
BERT such as RoBERTa and ALBERT, and compares the effectiveness of the [CLS]
token representation with that of the aggregated representation that can be
generated using BERT-like models. Finally, we propose an approach, Adaptive
Tuning Active Learning. Our experiments show that the limited label
information acquired in active learning can not only be used to train a
classifier but can also adaptively improve the embeddings generated by
BERT-like language models.
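To make the comparison above concrete, the following is a minimal sketch (not the paper's code) of the two representation choices discussed, the [CLS] token vector and a mean-pooled "aggregated" vector from a BERT-like model, feeding a simple uncertainty-sampling query step over the frozen embeddings. It assumes the Hugging Face transformers library and scikit-learn; the model name and hyperparameters are placeholder choices.

```python
# Sketch only: [CLS] vs. mean-pooled ("aggregated") embeddings from a BERT-like
# model, and one uncertainty-sampling query step over the frozen embeddings.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder model
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(texts, pooling="cls"):
    """Return [CLS] or mean-pooled sentence embeddings for a list of texts."""
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state            # (batch, seq_len, dim)
    if pooling == "cls":
        return hidden[:, 0, :].numpy()                      # vector of the [CLS] token
    mask = enc["attention_mask"].unsqueeze(-1)               # zero out padding positions
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()    # mean over real tokens

def query_most_uncertain(X, labelled_idx, y_labelled, batch_size=10):
    """One active-learning step: pick the pool documents with the smallest margin."""
    clf = LogisticRegression(max_iter=1000).fit(X[labelled_idx], y_labelled)
    probs = np.sort(clf.predict_proba(X), axis=1)
    margin = probs[:, -1] - probs[:, -2]                     # top-1 minus top-2 probability
    margin[labelled_idx] = np.inf                            # never re-query labelled docs
    return np.argsort(margin)[:batch_size]
```

The Adaptive Tuning Active Learning approach proposed above would additionally fine-tune the BERT-like encoder on the labels collected so far before re-embedding the pool; this sketch keeps the encoder frozen for brevity.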
Related papers
- Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models [3.546617486894182]
We introduce HAST, a new and effective self-training strategy, which is evaluated on four text classification benchmarks.
Results show that it outperforms the reproduced self-training approaches and reaches classification results comparable to previous experiments for three out of four datasets.
arXiv Detail & Related papers (2024-06-13T15:06:11Z)
- Synergizing Unsupervised and Supervised Learning: A Hybrid Approach for Accurate Natural Language Task Modeling [0.0]
This paper presents a novel hybrid approach that synergizes unsupervised and supervised learning to improve the accuracy of NLP task modeling.
Our methodology integrates an unsupervised module that learns representations from unlabeled corpora and a supervised module that leverages these representations to enhance task-specific models.
By synergizing techniques, our hybrid approach achieves SOTA results on benchmark datasets, paving the way for more data-efficient and robust NLP systems.
arXiv Detail & Related papers (2024-06-03T08:31:35Z)
- DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning [75.68193159293425]
In-context learning (ICL) allows transformer-based language models to learn a specific task with a few "task demonstrations" without updating their parameters.
We propose an influence function-based attribution technique, DETAIL, that addresses the specific characteristics of ICL.
We experimentally demonstrate the wide applicability of DETAIL by showing that attribution scores obtained on white-box models transfer to black-box models and improve model performance.
arXiv Detail & Related papers (2024-05-22T15:52:52Z)
- Towards Efficient Active Learning in NLP via Pretrained Representations [1.90365714903665]
Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications.
We drastically expedite this process by using pretrained representations of LLMs within the active learning loop.
Our strategy yields similar performance to fine-tuning all the way through the active learning loop but is orders of magnitude less computationally expensive.
arXiv Detail & Related papers (2024-02-23T21:28:59Z)
- Iterative Mask Filling: An Effective Text Augmentation Method Using Masked Language Modeling [0.0]
We propose a novel text augmentation method that leverages the Fill-Mask feature of the transformer-based BERT model.
Our method involves iteratively masking words in a sentence and replacing them with language model predictions.
Experimental results show that our proposed method significantly improves performance, especially on topic classification datasets.
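A hedged illustration of the mask-and-refill idea described in this summary, built on the Hugging Face fill-mask pipeline rather than the paper's own implementation; the model name and number of replacements are placeholder choices.

```python
# Sketch only: iteratively mask random words and refill them with a
# BERT-style fill-mask model to generate augmented variants of a sentence.
import random
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")  # placeholder model
MASK = fill_mask.tokenizer.mask_token                         # "[MASK]" for BERT

def augment(sentence, n_replacements=2, seed=0):
    """Mask random words one at a time and replace them with the model's top prediction."""
    rng = random.Random(seed)
    words = sentence.split()
    for _ in range(n_replacements):
        i = rng.randrange(len(words))
        original, words[i] = words[i], MASK
        top = fill_mask(" ".join(words), top_k=1)[0]          # best fill for the masked slot
        words[i] = top["token_str"].strip() or original       # fall back to the original word
    return " ".join(words)

print(augment("active learning reduces the labelling effort for text datasets"))
```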
arXiv Detail & Related papers (2024-01-03T16:47:13Z)
- Fuzzy Fingerprinting Transformer Language-Models for Emotion Recognition in Conversations [0.7874708385247353]
We propose to combine the two approaches (fuzzy fingerprints and pre-trained language models) to perform Emotion Recognition in Conversations (ERC).
We feed utterances and their previous conversational turns to a pre-trained RoBERTa, obtaining contextual embedding utterance representations.
We validate our approach on the widely used DailyDialog ERC benchmark dataset.
arXiv Detail & Related papers (2023-09-08T12:26:01Z)
- Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners.
We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting.
Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
- Detecting Text Formality: A Study of Text Classification Approaches [78.11745751651708]
This work presents what is, to our knowledge, the first systematic study of formality detection methods based on statistical, neural-based, and Transformer-based machine learning methods.
We conducted three types of experiments -- monolingual, multilingual, and cross-lingual.
The study shows that the Char BiLSTM model outperforms Transformer-based models on the monolingual and multilingual formality classification tasks.
arXiv Detail & Related papers (2022-04-19T16:23:07Z)
- Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z)
- Visual Transformer for Task-aware Active Learning [49.903358393660724]
We present a novel pipeline for pool-based Active Learning.
Our method exploits accessible unlabelled examples during training to estimate their correlation with the labelled examples.
A Visual Transformer models non-local visual concept dependencies between labelled and unlabelled examples.
arXiv Detail & Related papers (2021-06-07T17:13:59Z)
- SLADE: A Self-Training Framework For Distance Metric Learning [75.54078592084217]
We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional unlabeled data.
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
We then train a student model on both labels and pseudo labels to generate final feature embeddings.
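The teacher/student pseudo-labelling pattern described in this summary, reduced to a minimal classification sketch; SLADE itself applies the idea to distance metric learning and feature embeddings, so this is only the generic pattern, with assumed array inputs and scikit-learn as the stand-in learner.

```python
# Sketch only: generic teacher/student pseudo-labelling over pre-computed features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, confidence=0.9):
    """Train a teacher, pseudo-label confident unlabelled points, train a student."""
    teacher = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
    conf = teacher.predict_proba(X_unlab).max(axis=1)         # teacher's confidence
    pseudo = teacher.predict(X_unlab)                         # teacher's pseudo labels
    keep = conf >= confidence                                 # keep only confident ones
    X_all = np.vstack([X_lab, X_unlab[keep]])
    y_all = np.concatenate([y_lab, pseudo[keep]])
    return LogisticRegression(max_iter=1000).fit(X_all, y_all)  # the student model
```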
arXiv Detail & Related papers (2020-11-20T08:26:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.