ActiveGLAE: A Benchmark for Deep Active Learning with Transformers
- URL: http://arxiv.org/abs/2306.10087v1
- Date: Fri, 16 Jun 2023 13:07:29 GMT
- Title: ActiveGLAE: A Benchmark for Deep Active Learning with Transformers
- Authors: Lukas Rauch, Matthias Aßenmacher, Denis Huseljic, Moritz Wirth,
Bernd Bischl, Bernhard Sick
- Abstract summary: Deep active learning (DAL) seeks to reduce annotation costs by enabling the model to actively query instance annotations from which it expects to learn the most.
There is currently no standardized evaluation protocol for transformer-based language models in the field of DAL.
We propose the ActiveGLAE benchmark, a comprehensive collection of data sets and evaluation guidelines for assessing DAL.
- Score: 5.326702806697265
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep active learning (DAL) seeks to reduce annotation costs by enabling the
model to actively query instance annotations from which it expects to learn the
most. Despite extensive research, there is currently no standardized evaluation
protocol for transformer-based language models in the field of DAL. Diverse
experimental settings lead to difficulties in comparing research and deriving
recommendations for practitioners. To tackle this challenge, we propose the
ActiveGLAE benchmark, a comprehensive collection of data sets and evaluation
guidelines for assessing DAL. Our benchmark aims to facilitate and streamline
the evaluation process of novel DAL strategies. Additionally, we provide an
extensive overview of current practice in DAL with transformer-based language
models. We identify three key challenges - data set selection, model training,
and DAL settings - that pose difficulties in comparing query strategies. We
establish baseline results through an extensive set of experiments as a
reference point for evaluating future work. Based on our findings, we provide
guidelines for researchers and practitioners.
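The querying idea described above can be illustrated with a minimal pool-based sketch. This is not the paper's implementation; it assumes a model has already produced class probabilities for an unlabeled pool and uses predictive entropy, one common uncertainty-based query strategy, to pick the instances to annotate next.

```python
import numpy as np

def query_most_uncertain(probs: np.ndarray, batch_size: int) -> np.ndarray:
    """Return indices of the `batch_size` pool instances with the highest
    predictive entropy (a common uncertainty-based query strategy)."""
    eps = 1e-12  # avoid log(0)
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    return np.argsort(entropy)[-batch_size:]

# Toy pool of 4 instances with class probabilities from some model
# (hypothetical values for illustration).
pool_probs = np.array([
    [0.98, 0.02],  # confident
    [0.55, 0.45],  # uncertain
    [0.90, 0.10],
    [0.50, 0.50],  # most uncertain
])
queried = query_most_uncertain(pool_probs, batch_size=2)
print(sorted(queried.tolist()))  # → [1, 3]
```

In a full DAL loop, the queried instances would be sent to annotators, added to the labeled set, and the model retrained before the next query round; the benchmark's query strategies differ mainly in how this selection step is scored.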
Related papers
- ACTRESS: Active Retraining for Semi-supervised Visual Grounding [52.08834188447851]
A previous study, RefTeacher, makes the first attempt to tackle this task by adopting the teacher-student framework to provide pseudo confidence supervision and attention-based supervision.
This approach is incompatible with current state-of-the-art visual grounding models, which follow the Transformer-based pipeline.
Our paper proposes the ACTive REtraining approach for Semi-Supervised Visual Grounding, abbreviated as ACTRESS.
arXiv Detail & Related papers (2024-07-03T16:33:31Z) - Position: Quo Vadis, Unsupervised Time Series Anomaly Detection? [11.269007806012931]
The current state of machine learning scholarship in Timeseries Anomaly Detection (TAD) is plagued by the persistent use of flawed evaluation metrics.
Our paper presents a critical analysis of the status quo in TAD, revealing the misleading track of current research.
arXiv Detail & Related papers (2024-05-04T14:43:31Z) - A Survey on Deep Active Learning: Recent Advances and New Frontiers [27.07154361976248]
Deep learning-based active learning (DAL) has gained increasing popularity due to its broad applicability, yet survey papers on the topic remain scarce.
This work aims to serve as a useful and quick guide for researchers in overcoming difficulties in DAL.
arXiv Detail & Related papers (2024-05-01T05:54:33Z) - Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that human annotators prefer SQC-Score over the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z) - Your Vision-Language Model Itself Is a Strong Filter: Towards
High-Quality Instruction Tuning with Data Selection [59.11430077029321]
We introduce a novel dataset selection method, Self-Filter, for vision-language models (VLMs).
In the first stage, we devise a scoring network to evaluate the difficulty of training instructions, which is co-trained with the VLM.
In the second stage, we use the trained score net to measure the difficulty of each instruction, select the most challenging samples, and penalize similar samples to encourage diversity.
arXiv Detail & Related papers (2024-02-19T20:08:48Z) - Benchmarking of Query Strategies: Towards Future Deep Active Learning [0.0]
We benchmark query strategies for deep active learning (DAL).
DAL reduces annotation costs by annotating only high-quality samples selected by query strategies.
arXiv Detail & Related papers (2023-12-10T04:17:16Z) - Learning Objective-Specific Active Learning Strategies with Attentive
Neural Processes [72.75421975804132]
Learning Active Learning (LAL) proposes learning the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z) - ALE: A Simulation-Based Active Learning Evaluation Framework for the
Parameter-Driven Comparison of Query Strategies for NLP [3.024761040393842]
Active Learning (AL) proposes promising data points for annotators to annotate next, instead of a sequential or random sample.
This method is supposed to save annotation effort while maintaining model performance.
We introduce a reproducible active learning evaluation framework for the comparative evaluation of AL strategies in NLP.
arXiv Detail & Related papers (2023-08-01T10:42:11Z) - ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z) - Re-Benchmarking Pool-Based Active Learning for Binary Classification [27.034593234956713]
Active learning is a paradigm that significantly enhances the performance of machine learning models when acquiring labeled data.
While several benchmarks exist for evaluating active learning strategies, their findings exhibit some misalignment.
This discrepancy motivates us to develop a transparent and reproducible benchmark for the community.
arXiv Detail & Related papers (2023-06-15T08:47:50Z) - Latent Opinions Transfer Network for Target-Oriented Opinion Words
Extraction [63.70885228396077]
We propose a novel model to transfer opinion knowledge from resource-rich review sentiment classification datasets to the low-resource task of target-oriented opinion words extraction (TOWE).
Our model achieves better performance than other state-of-the-art methods and significantly outperforms the base model without opinion-knowledge transfer.
arXiv Detail & Related papers (2020-01-07T11:50:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.