Bayesian Active Summarization
- URL: http://arxiv.org/abs/2110.04480v1
- Date: Sat, 9 Oct 2021 06:51:16 GMT
- Title: Bayesian Active Summarization
- Authors: Alexios Gidiotis and Grigorios Tsoumakas
- Abstract summary: We introduce Bayesian Active Summarization (BAS) as a method of combining active learning methods with state-of-the-art summarization models.
Our findings suggest that BAS achieves better and more robust performance, compared to random selection.
- Score: 3.1423034006764965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian Active Learning has had a significant impact on various NLP
problems, but its application to text summarization has been explored very
little. We introduce Bayesian Active Summarization (BAS), a method that
combines active learning with state-of-the-art summarization models. Our
findings suggest that BAS achieves better and more robust performance than
random selection, particularly for small and very small data annotation
budgets. Using BAS, we show that it is possible to leverage large
summarization models to effectively solve real-world problems with very
limited annotated data.
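The selection strategy the abstract describes can be pictured as a simple uncertainty-driven annotation loop. The following is a minimal sketch, not the paper's implementation: `stochastic_score` is a hypothetical stand-in for one dropout-enabled forward pass of a summarization model, and the variance across passes plays the role of the Bayesian uncertainty estimate.

```python
import random
import statistics

def stochastic_score(document, seed):
    # Stand-in for one stochastic (dropout-enabled) forward pass of a
    # summarization model; returns a scalar summary-quality estimate.
    rng = random.Random(hash(document) ^ seed)
    return rng.random()

def uncertainty(document, n_passes=10):
    # BAS-style uncertainty: disagreement (variance) across stochastic passes.
    scores = [stochastic_score(document, seed) for seed in range(n_passes)]
    return statistics.pvariance(scores)

def select_batch(pool, budget):
    # Send the `budget` most uncertain documents to the annotators.
    return sorted(pool, key=uncertainty, reverse=True)[:budget]

pool = [f"document-{i}" for i in range(100)]
batch = select_batch(pool, budget=5)
```

After each annotation round, the model would be fine-tuned on the newly labeled batch and the uncertainty scores recomputed, repeating until the annotation budget is exhausted.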
Related papers
- Table Detection with Active Learning [1.9881456274482427]
Active learning is a promising solution to minimize annotation costs by selecting the most informative samples.
Our approach ensures the selection of representative examples that improve model generalization.
Our results demonstrate that AL-based example selection significantly outperforms random sampling.
arXiv Detail & Related papers (2025-09-24T11:22:30Z)
- Optimizing Active Learning in Vision-Language Models via Parameter-Efficient Uncertainty Calibration [6.7181844004432385]
We introduce a novel parameter-efficient learning methodology that incorporates an uncertainty calibration loss within the Active Learning framework.
We demonstrate that our solution can match and exceed the performance of complex feature-based sampling techniques.
arXiv Detail & Related papers (2025-07-29T06:08:28Z)
- Making Better Use of Unlabelled Data in Bayesian Active Learning [19.050266270699368]
We propose a framework for semi-supervised Bayesian active learning.
We find it produces better-performing models than either conventional Bayesian active learning or semi-supervised learning with randomly acquired data.
arXiv Detail & Related papers (2024-04-26T08:41:55Z)
- An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models [55.01592097059969]
Supervised finetuning on instruction datasets has played a crucial role in achieving the remarkable zero-shot generalization capabilities of large language models.
Active learning is effective in identifying useful subsets of samples to annotate from an unlabeled pool.
We propose using experimental design to circumvent the computational bottlenecks of active learning.
arXiv Detail & Related papers (2024-01-12T16:56:54Z)
- Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning [22.410220040736235]
We present a theoretically optimal solution for addressing both coreset selection and active learning.
Our proposed method, COPS, is designed to minimize the expected loss of a model trained on subsampled data.
arXiv Detail & Related papers (2023-09-05T14:06:33Z)
- Active Learning Principles for In-Context Learning with Large Language Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z)
- Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for data annotation when an unlabeled sample is believed to incur a high loss.
Our approach achieves superior performance over state-of-the-art active learning methods on image classification and semantic segmentation tasks.
arXiv Detail & Related papers (2022-12-20T19:29:37Z)
- Active Pointly-Supervised Instance Segmentation [106.38955769817747]
We present an economic active learning setting, named active pointly-supervised instance segmentation (APIS).
APIS starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object.
The model developed with these strategies yields consistent performance gain on the challenging MS-COCO dataset.
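The APIS query protocol described above (sample a point inside a box-level annotation, ask whether it lies on the object) can be sketched roughly as follows. This is an illustrative reconstruction under assumed names, not the paper's API: `apis_point_queries` and the toy oracle are hypothetical.

```python
import random

def apis_point_queries(box, oracle, n_queries=5, rng=None):
    # Sketch of the APIS loop: repeatedly sample a point inside a
    # box-level annotation and ask the oracle whether the point falls
    # on the object, collecting cheap point-level labels.
    rng = rng or random.Random(0)
    x0, y0, x1, y1 = box
    labelled = []
    for _ in range(n_queries):
        point = (rng.uniform(x0, x1), rng.uniform(y0, y1))
        labelled.append((point, oracle(point)))
    return labelled

# Toy oracle: the object occupies the left half of the box.
box = (0.0, 0.0, 10.0, 10.0)
on_object = lambda p: p[0] < 5.0
queries = apis_point_queries(box, on_object)
```

Each point query costs far less annotator time than drawing a mask, which is what makes the setting economical.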
arXiv Detail & Related papers (2022-07-23T11:25:24Z)
- Mitigating Sampling Bias and Improving Robustness in Active Learning [13.994967246046008]
We introduce supervised contrastive active learning by leveraging the contrastive loss for active learning under a supervised setting.
We propose an unbiased query strategy that selects informative data samples of diverse feature representations.
We empirically demonstrate that our proposed methods reduce sampling bias and achieve state-of-the-art accuracy and model calibration in an active learning setup.
arXiv Detail & Related papers (2021-09-13T20:58:40Z)
- Deep Bayesian Active Learning, A Brief Survey on Recent Advances [6.345523830122166]
Active learning starts training the model with a small amount of labeled data.
Standard deep learning methods are not capable of either representing or manipulating model uncertainty.
Deep Bayesian active learning frameworks address this by incorporating practical uncertainty estimates into the model.
arXiv Detail & Related papers (2020-12-15T02:06:07Z)
- Semi-supervised Batch Active Learning via Bilevel Optimization [89.37476066973336]
We formulate our approach as a data summarization problem via bilevel optimization.
We show that our method is highly effective on keyword detection tasks in the regime where only a few labeled samples are available.
arXiv Detail & Related papers (2020-10-19T16:53:24Z)
- Bayesian active learning for production, a systematic study and a reusable library [85.32971950095742]
In this paper, we analyse the main drawbacks of current active learning techniques.
We do a systematic study on the effects of the most common issues of real-world datasets on the deep active learning process.
We derive two techniques that speed up the active learning loop: partial uncertainty sampling and a larger query size.
arXiv Detail & Related papers (2020-06-17T14:51:11Z)
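Partial uncertainty sampling, mentioned in the last entry, can be sketched as scoring only a random fraction of the unlabeled pool before picking a (possibly larger) query batch. The function names and the stand-in uncertainty score below are assumptions for illustration, not the library's actual API.

```python
import random
import statistics

def mc_uncertainty(sample, n_passes=10):
    # Stand-in uncertainty score (e.g. variance over stochastic passes);
    # a real system would run the model here.
    rng = random.Random(hash(sample))
    return statistics.pvariance([rng.random() for _ in range(n_passes)])

def partial_uncertainty_sampling(pool, query_size, subset_frac=0.1, rng=None):
    # Score only a random fraction of the pool, then take the most
    # uncertain `query_size` samples: trading a little selection quality
    # for a much faster active learning loop.
    rng = rng or random.Random(0)
    k = max(query_size, int(len(pool) * subset_frac))
    subset = rng.sample(pool, k)
    return sorted(subset, key=mc_uncertainty, reverse=True)[:query_size]

pool = [f"img-{i}" for i in range(200)]
batch = partial_uncertainty_sampling(pool, query_size=8)
```

Combining the pool subsample with a larger query size amortizes the cost of each model retraining step over more labels, which is the speedup the survey entry describes.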
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.