Smooth Sailing: Improving Active Learning for Pre-trained Language
Models with Representation Smoothness Analysis
- URL: http://arxiv.org/abs/2212.11680v2
- Date: Mon, 23 Oct 2023 14:05:49 GMT
- Title: Smooth Sailing: Improving Active Learning for Pre-trained Language
Models with Representation Smoothness Analysis
- Authors: Josip Jukić and Jan Šnajder
- Abstract summary: Active learning (AL) methods aim to reduce label complexity in supervised learning.
We propose an early stopping technique that does not require a validation set.
We find that task adaptation improves AL, whereas standard short fine-tuning in AL does not provide improvements over random sampling.
- Score: 3.490038106567192
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Developed to alleviate prohibitive labeling costs, active learning (AL)
methods aim to reduce label complexity in supervised learning. While recent
work has demonstrated the benefit of using AL in combination with large
pre-trained language models (PLMs), it has often overlooked the practical
challenges that hinder the effectiveness of AL. We address these challenges by
leveraging representation smoothness analysis to ensure AL is feasible, that
is, both effective and practicable. Firstly, we propose an early stopping
technique that does not require a validation set -- often unavailable in
realistic AL conditions -- and observe significant improvements over random
sampling across multiple datasets and AL methods. Further, we find that task
adaptation improves AL, whereas standard short fine-tuning in AL does not
provide improvements over random sampling. Our work demonstrates the usefulness
of representation smoothness analysis for AL and introduces an AL stopping
criterion that reduces label complexity.
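The abstract compares AL methods against random sampling but does not say which acquisition functions are used. As a rough illustration of what an AL query step looks like, here is a minimal sketch that assumes entropy-based uncertainty sampling, a common choice. It is not the authors' method and does not implement the representation smoothness analysis. The names `entropy`, `select_queries`, and the example probabilities in `pool_probs` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(pool_probs, batch_size):
    """Return indices of the `batch_size` pool items with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical model predictions for four unlabeled examples (binary task).
pool_probs = [
    [0.98, 0.02],  # confident prediction, low entropy
    [0.55, 0.45],  # uncertain prediction
    [0.80, 0.20],
    [0.50, 0.50],  # maximally uncertain prediction
]

queries = select_queries(pool_probs, batch_size=2)
print(queries)  # prints [3, 1]
```

In a full AL loop, the selected examples would be sent for annotation, added to the labeled set, and the model retrained before the next query round. A random-sampling baseline would replace `select_queries` with a uniform draw from the pool.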
Related papers
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve model alignment across different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv (2024-06-07T11:37:45Z)
- Navigating the Pitfalls of Active Learning Evaluation: A Systematic
Framework for Meaningful Performance Assessment [3.3064235071867856]
Active Learning (AL) aims to reduce the labeling burden by interactively selecting the most informative samples from a pool of unlabeled data.
Some studies have questioned the effectiveness of AL compared to emerging paradigms such as semi-supervised (Semi-SL) and self-supervised learning (Self-SL).
arXiv (2023-01-25T15:07:44Z)
- Pareto Optimization for Active Learning under Out-of-Distribution Data
Scenarios [79.02009938011447]
We propose a sampling scheme, which selects optimal subsets of unlabeled samples with fixed batch size from the unlabeled data pool.
Experimental results show its effectiveness on both classical Machine Learning (ML) and Deep Learning (DL) tasks.
arXiv (2022-07-04T04:11:44Z)
- Collaborative Intelligence Orchestration: Inconsistency-Based Fusion of
Semi-Supervised Learning and Active Learning [60.26659373318915]
Active learning (AL) and semi-supervised learning (SSL) are two effective, but often isolated, means to alleviate the data-hungry problem.
We propose an innovative Inconsistency-based virtual aDvErsarial algorithm to further investigate SSL-AL's potential superiority.
Two real-world case studies visualize the practical industrial value of applying and deploying the proposed data sampling algorithm.
arXiv (2022-06-07T13:28:43Z)
- Smoothing Advantage Learning [20.760987175553645]
We propose a simple variant of advantage learning (AL) named smoothing advantage learning (SAL).
The proposed value smoothing technique not only stabilizes the training procedure of AL by controlling the trade-off between the convergence rate and the upper bound of the approximation errors, but also helps increase the action gap between the optimal and sub-optimal action values.
arXiv (2022-03-20T03:52:32Z)
- A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv (2022-02-08T19:18:49Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, is adaptive in its selection of unlabeled data.
arXiv (2021-09-01T23:52:29Z)
- Effective Evaluation of Deep Active Learning on Image Classification
Tasks [10.27095298129151]
We present a unified re-implementation of state-of-the-art active learning algorithms in the context of image classification.
On the positive side, we show that AL techniques are 2x to 4x more label-efficient than random sampling (RS) when data augmentation is used.
arXiv (2021-06-16T23:29:39Z)
- Relieving the Plateau: Active Semi-Supervised Learning for a Better
Landscape [2.3046646540823916]
Semi-supervised learning (SSL) leverages unlabeled data that are more accessible than their labeled counterparts.
Active learning (AL) selects unlabeled instances to be annotated by a human-in-the-loop in hopes of better performance with less labeled data.
We propose convergence rate control (CRC), an AL algorithm that selects unlabeled data to improve the problem conditioning upon inclusion to the labeled set.
arXiv (2021-04-08T06:03:59Z)
- Reducing Confusion in Active Learning for Part-Of-Speech Tagging [100.08742107682264]
Active learning (AL) uses a data selection algorithm to select useful training samples to minimize annotation cost.
We study the problem of selecting instances which maximally reduce the confusion between particular pairs of output tags.
Our proposed AL strategy outperforms other AL strategies by a significant margin.
arXiv (2020-11-02T06:24:58Z)