Towards Comparable Active Learning
- URL: http://arxiv.org/abs/2311.18356v1
- Date: Thu, 30 Nov 2023 08:54:32 GMT
- Title: Towards Comparable Active Learning
- Authors: Thorben Werner, Johannes Burchert, Lars Schmidt-Thieme
- Abstract summary: We show that the reported lifts in recent literature generalize poorly to other domains, leading to an inconclusive landscape in Active Learning research.
This paper addresses these issues by providing an Active Learning framework for a fair comparison of algorithms across different tasks and domains, as well as a fast and performant oracle algorithm for evaluation.
- Score: 6.579888565581481
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Active Learning has received significant attention in the field of machine
learning for its potential in selecting the most informative samples for
labeling, thereby reducing data annotation costs. However, we show that the
reported lifts in recent literature generalize poorly to other domains, leading
to an inconclusive landscape in Active Learning research. Furthermore, we
highlight overlooked problems for reproducing AL experiments that can lead to
unfair comparisons and increased variance in the results. This paper addresses
these issues by providing an Active Learning framework for a fair comparison of
algorithms across different tasks and domains, as well as a fast and performant
oracle algorithm for evaluation. To the best of our knowledge, we propose the
first AL benchmark that tests algorithms in 3 major domains: Tabular, Image,
and Text. We report empirical results for 6 widely used algorithms on 7
real-world and 2 synthetic datasets and aggregate them into a domain-specific
ranking of AL algorithms.
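To make the setting concrete, below is a minimal sketch of a pool-based AL loop with least-confidence uncertainty sampling, one of the widely used query strategies such a benchmark evaluates. The dataset, model, and budget are placeholder assumptions; this is a generic illustration, not the paper's framework or oracle algorithm.

```python
# Generic pool-based active learning loop with least-confidence sampling.
# Dataset, model, and budget are placeholders, not the paper's setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
labeled = list(range(20))            # small seed set of labeled indices
unlabeled = list(range(20, len(X)))  # pool of unlabeled indices

for _ in range(10):                  # budget: 10 rounds of 10 queries each
    model = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    probs = model.predict_proba(X[unlabeled])
    uncertainty = 1.0 - probs.max(axis=1)        # least-confidence score
    picked = [unlabeled[i] for i in np.argsort(uncertainty)[-10:]]
    labeled.extend(picked)                       # send to the "oracle" for labels
    unlabeled = [i for i in unlabeled if i not in picked]

print("accuracy on the remaining pool:", model.score(X[unlabeled], y[unlabeled]))
```

Benchmarking AL then amounts to swapping the uncertainty line for other query strategies and comparing the resulting learning curves across tasks and domains.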
Related papers
- RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation [54.707460684650584]
Large Language Models (LLMs) demonstrate human-level capabilities in dialogue, reasoning, and knowledge retention, but also exhibit limitations such as hallucinations and outdated knowledge.
Current research addresses this bottleneck by equipping LLMs with external knowledge, a technique known as Retrieval-Augmented Generation (RAG).
RAGLAB is a modular and research-oriented open-source library that reproduces 6 existing algorithms and provides a comprehensive ecosystem for investigating RAG algorithms.
arXiv Detail & Related papers (2024-08-21T07:20:48Z)
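As a rough sketch of the RAG pattern described in that entry (retrieve external knowledge, then condition generation on it): the toy corpus, TF-IDF retriever, and `generate` placeholder below are illustrative assumptions, not RAGLAB's API.

```python
# Minimal sketch of the RAG pattern: retrieve relevant documents, then
# condition the LLM on them. Corpus and `generate` are hypothetical stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = ["Active learning reduces annotation cost.",
          "RAG equips LLMs with external knowledge.",
          "Transformers dominate NLP benchmarks."]

def retrieve(query: str, k: int = 2) -> list[str]:
    vec = TfidfVectorizer().fit(corpus + [query])
    docs, q = vec.transform(corpus), vec.transform([query])
    scores = cosine_similarity(q, docs)[0]
    return [corpus[i] for i in scores.argsort()[::-1][:k]]

def generate(prompt: str) -> str:
    return f"<LLM answer conditioned on: {prompt!r}>"  # placeholder for a real LLM call

query = "How does RAG help LLMs?"
context = "\n".join(retrieve(query))
print(generate(f"Context:\n{context}\n\nQuestion: {query}"))
```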
- A Cross-Domain Benchmark for Active Learning [5.359176539960004]
Active Learning deals with identifying the most informative samples for labeling to reduce data annotation costs.
We propose CDALBench, the first active learning benchmark that includes tasks in computer vision and natural language processing.
We show that both the cross-domain character and a large number of repetitions are crucial for a sophisticated evaluation of AL research.
arXiv Detail & Related papers (2024-08-01T09:57:48Z)
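The emphasis on many repetitions can be made concrete with a small harness that reruns one AL experiment across seeds and reports aggregate statistics; `run_al_experiment` is a hypothetical stand-in for a full AL run.

```python
# Repetition-heavy AL evaluation sketch: rerun an experiment over many seeds
# and report mean and standard deviation rather than a single run.
import random
import statistics

def run_al_experiment(seed: int) -> float:
    rng = random.Random(seed)
    return 0.80 + rng.gauss(0, 0.02)  # placeholder for "train with AL, return accuracy"

scores = [run_al_experiment(seed) for seed in range(50)]
print(f"accuracy over 50 seeds: {statistics.mean(scores):.3f} "
      f"+/- {statistics.stdev(scores):.3f}")
```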
- From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models [63.188607839223046]
This survey focuses on the benefits of scaling compute during inference.
We explore three areas under a unified mathematical formalism: token-level generation algorithms, meta-generation algorithms, and efficient generation.
arXiv Detail & Related papers (2024-06-24T17:45:59Z)
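As one example of the token-level generation algorithms such a survey covers, here is a minimal temperature-plus-top-k sampling step; the vocabulary and logits are made up for illustration.

```python
# Minimal token-level decoding step: temperature scaling followed by top-k
# sampling over a toy logits vector. Vocabulary and logits are illustrative.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]
logits = np.array([2.0, 1.0, 0.5, 0.2, -1.0])

def sample_top_k(logits, k=3, temperature=0.8):
    scaled = logits / temperature
    top = np.argsort(scaled)[-k:]                  # keep the k highest logits
    probs = np.exp(scaled[top] - scaled[top].max())
    probs /= probs.sum()                           # softmax over the top-k
    return top[rng.choice(len(top), p=probs)]

print(vocab[sample_top_k(logits)])
```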
- DIRECT: Deep Active Learning under Imbalance and Label Noise [15.571923343398657]
We conduct the first study of active learning under both class imbalance and label noise.
We propose a novel algorithm that robustly identifies the class separation threshold and annotates the most uncertain examples.
Our results demonstrate that DIRECT can save more than 60% of the annotation budget compared to state-of-the-art active learning algorithms.
arXiv Detail & Related papers (2023-12-14T18:18:34Z)
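A heavily simplified, one-dimensional illustration of the idea of locating a class-separation threshold and annotating the most uncertain examples; this sketches the principle only and is not DIRECT's actual algorithm.

```python
# Simplified 1-D illustration: uncertainty is distance to an (assumed)
# class-separation threshold on the score axis. Not DIRECT's algorithm.
import numpy as np

rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(-1, 1, 900), rng.normal(2, 1, 100)])  # imbalanced
threshold = 0.5   # assume an estimated separation point between the classes
budget = 20
query = np.argsort(np.abs(scores - threshold))[:budget]  # closest to the threshold
print("indices to annotate:", query[:5], "...")
```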
- Regularization-Based Methods for Ordinal Quantification [49.606912965922504]
We study the ordinal case, i.e., the case in which a total order is defined on the set of n > 2 classes.
We propose a novel class of regularized OQ algorithms, which outperforms existing algorithms in our experiments.
arXiv Detail & Related papers (2023-10-13T16:04:06Z)
"Similarity score" is a numerical estimate of the expertise of a reviewer in reviewing a paper.
Our dataset consists of 477 self-reported expertise scores provided by 58 researchers.
For the task of ordering two papers in terms of their relevance for a reviewer, the error rates range from 12%-30% in easy cases to 36%-43% in hard cases.
arXiv Detail & Related papers (2023-03-23T16:15:03Z)
- Effective Evaluation of Deep Active Learning on Image Classification Tasks [10.27095298129151]
We present a unified re-implementation of state-of-the-art active learning algorithms in the context of image classification.
On the positive side, we show that AL techniques are 2x to 4x more label-efficient than random sampling (RS) when data augmentation is used.
arXiv Detail & Related papers (2021-06-16T23:29:39Z)
- TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning [53.32740707197856]
We present a new state-of-the-art unsupervised method based on pre-trained Transformers and a Sequential Denoising Auto-Encoder (TSDAE).
It can achieve up to 93.1% of the performance of in-domain supervised approaches.
arXiv Detail & Related papers (2021-04-14T17:02:18Z)
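The denoising objective behind TSDAE can be sketched in a few lines: corrupt the input by deleting tokens, encode it to a fixed-size vector, and reconstruct the original sentence. A tiny GRU stands in for the pre-trained Transformer; this is a toy illustration, not the paper's model.

```python
# Toy denoising auto-encoder for sentence embeddings in the spirit of TSDAE:
# delete tokens, encode, reconstruct the original. A GRU replaces the
# pre-trained Transformer for brevity; purely illustrative.
import random
import torch
import torch.nn as nn

vocab = {w: i for i, w in enumerate(["<pad>", "the", "cat", "sat", "on", "mat"])}

def corrupt(tokens, p=0.4):
    kept = [t for t in tokens if random.random() > p]
    return kept or tokens[:1]   # never delete everything

class DenoisingAE(nn.Module):
    def __init__(self, v=len(vocab), d=32):
        super().__init__()
        self.emb = nn.Embedding(v, d)
        self.enc = nn.GRU(d, d, batch_first=True)
        self.dec = nn.GRU(d, d, batch_first=True)
        self.out = nn.Linear(d, v)

    def forward(self, noisy, target):
        _, h = self.enc(self.emb(noisy))            # h: fixed-size sentence embedding
        dec_out, _ = self.dec(self.emb(target), h)  # teacher forcing on the clean text
        return self.out(dec_out)

sent = ["the", "cat", "sat", "on", "the", "mat"]
ids = torch.tensor([[vocab[w] for w in sent]])
noisy = torch.tensor([[vocab[w] for w in corrupt(sent)]])

model = DenoisingAE()
logits = model(noisy, ids)
loss = nn.functional.cross_entropy(logits.view(-1, len(vocab)), ids.view(-1))
loss.backward()  # the reconstruction loss trains the encoder's sentence embedding
print(float(loss))
```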
- Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based AL are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
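The gradient reversal (GRAD) mechanism mentioned in that entry can be illustrated with a minimal layer that acts as the identity in the forward pass and negates (and scales) gradients in the backward pass; this is a generic sketch, not the paper's exact setup.

```python
# Minimal gradient reversal layer: identity forward, negated and scaled
# gradient backward. A generic sketch of GRAD, not the paper's exact setup.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd=1.0):
        ctx.lambd = lambd
        return x.view_as(x)                        # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None      # reversed gradient; None for lambd

features = torch.randn(4, 8, requires_grad=True)
reversed_features = GradReverse.apply(features, 0.5)
reversed_features.sum().backward()
print(features.grad[0, :3])  # gradients are negated and scaled by lambd
```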
- Towards Understanding the Behaviors of Optimal Deep Active Learning Algorithms [19.65665942630067]
Active learning (AL) algorithms may achieve better performance with less data because the model guides the data selection process.
There is little study of what an optimal AL algorithm looks like, which would help researchers understand where their models fall short.
We present a simulated annealing algorithm to search for this optimal oracle and analyze it for several tasks.
arXiv Detail & Related papers (2020-12-29T22:56:42Z)
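The oracle search in that last entry can be illustrated with a generic simulated annealing loop over candidate labeled subsets; the `evaluate` function below is a placeholder for training a model on the subset and measuring accuracy, not the paper's implementation.

```python
# Generic simulated annealing over labeled subsets, illustrating the search
# for an approximately optimal AL selection. `evaluate` is a placeholder for
# "train on the subset, return accuracy"; not the paper's implementation.
import math
import random

rng = random.Random(0)
pool = list(range(1000))   # indices of the unlabeled pool
budget = 50

def evaluate(subset):
    # Placeholder objective standing in for model training and evaluation.
    return -abs(sum(subset) / len(subset) - 500) / 500

current = rng.sample(pool, budget)
best, best_score = current, evaluate(current)
for step in range(2000):
    temp = 1.0 * (0.995 ** step)                         # cooling schedule
    candidate = current.copy()
    candidate[rng.randrange(budget)] = rng.choice(pool)  # swap one sample
    delta = evaluate(candidate) - evaluate(current)
    if delta > 0 or rng.random() < math.exp(delta / max(temp, 1e-9)):
        current = candidate
        if evaluate(current) > best_score:
            best, best_score = current, evaluate(current)
print("best objective:", round(best_score, 4))
```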