Cleaning the Pool: Progressive Filtering of Unlabeled Pools in Deep Active Learning
- URL: http://arxiv.org/abs/2511.22344v1
- Date: Thu, 27 Nov 2025 11:28:18 GMT
- Title: Cleaning the Pool: Progressive Filtering of Unlabeled Pools in Deep Active Learning
- Authors: Denis Huseljic, Marek Herde, Lukas Rauch, Paul Hahn, Bernhard Sick
- Abstract summary: We introduce REFINE, an ensemble active learning (AL) method that combines multiple strategies without knowing in advance which will perform best. In each AL cycle, REFINE operates in two stages: Progressive filtering iteratively refines the unlabeled pool by considering an ensemble of AL strategies. Coverage-based selection then chooses a final batch from this refined pool, ensuring all previously identified notions of value are accounted for.
- Score: 8.368114553774065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing active learning (AL) strategies capture fundamentally different notions of data value, e.g., uncertainty or representativeness. Consequently, the effectiveness of strategies can vary substantially across datasets, models, and even AL cycles. Committing to a single strategy risks suboptimal performance, as no single strategy dominates throughout the entire AL process. We introduce REFINE, an ensemble AL method that combines multiple strategies without knowing in advance which will perform best. In each AL cycle, REFINE operates in two stages: (1) Progressive filtering iteratively refines the unlabeled pool by considering an ensemble of AL strategies, retaining promising candidates capturing different notions of value. (2) Coverage-based selection then chooses a final batch from this refined pool, ensuring all previously identified notions of value are accounted for. Extensive experiments across 6 classification datasets and 3 foundation models show that REFINE consistently outperforms individual strategies and existing ensemble methods. Notably, progressive filtering serves as a powerful preprocessing step that improves the performance of any individual AL strategy applied to the refined pool, which we demonstrate on an audio spectrogram classification use case. Finally, the ensemble of REFINE can be easily extended with upcoming state-of-the-art AL strategies.
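The two-stage cycle described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the filtering rule or the coverage criterion, so everything below is an assumption made for illustration. The function names (`entropy_scores`, `margin_scores`, `progressive_filter`, `coverage_select`), the parameters `keep_frac` and `rounds`, the use of only two uncertainty-style strategies, and the greedy k-center step standing in for coverage-based selection are all hypothetical.

```python
import numpy as np

def entropy_scores(probs):
    """Uncertainty score: predictive entropy (higher means more uncertain)."""
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

def margin_scores(probs):
    """Uncertainty score: negated gap between the two most likely classes."""
    top2 = np.sort(probs, axis=1)[:, -2:]
    return -(top2[:, 1] - top2[:, 0])

def progressive_filter(probs, strategies, keep_frac=0.5, rounds=2):
    """Assumed filtering rule: in each round, keep the union of every
    strategy's top-scoring candidates, so the pool shrinks step by step."""
    pool = np.arange(len(probs))
    for _ in range(rounds):
        # Split the per-round budget evenly across strategies.
        k = max(1, int(keep_frac * len(pool) / len(strategies)))
        kept = set()
        for score in strategies:
            s = score(probs[pool])
            kept.update(pool[np.argsort(-s)[:k]].tolist())
        pool = np.array(sorted(kept))
    return pool

def coverage_select(features, pool, batch_size):
    """Stand-in for coverage-based selection: greedy k-center
    (farthest-point) sampling over the refined pool."""
    chosen = [int(pool[0])]
    dist = np.linalg.norm(features[pool] - features[chosen[0]], axis=1)
    while len(chosen) < min(batch_size, len(pool)):
        nxt = int(np.argmax(dist))  # candidate farthest from the current batch
        chosen.append(int(pool[nxt]))
        new_dist = np.linalg.norm(features[pool] - features[pool[nxt]], axis=1)
        dist = np.minimum(dist, new_dist)
    return chosen

# Synthetic stand-ins for model embeddings and class probabilities.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))
logits = rng.normal(size=(200, 4))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

pool = progressive_filter(probs, [entropy_scores, margin_scores])
batch = coverage_select(features, pool, batch_size=10)
```

Because the refined pool is just an index array, any single AL strategy could be applied to it in place of `coverage_select`, which mirrors the abstract's remark that progressive filtering can serve as a preprocessing step.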
Related papers
- Game-Theoretic Co-Evolution for LLM-Based Heuristic Discovery [37.96481049421407]
Large language models (LLMs) have enabled rapid progress in automatic discovery. We propose a game-theoretic framework that reframes discovery as a program-level co-evolution between solver and instance generator.
arXiv Detail & Related papers (2026-01-30T12:14:52Z)
- Integrating Diverse Assignment Strategies into DETRs [61.61489761918158]
Label assignment is a critical component in object detectors, particularly within DETR-style frameworks. We propose LoRA-DETR, a flexible and lightweight framework that seamlessly integrates diverse assignment strategies into any DETR-style detector.
arXiv Detail & Related papers (2026-01-14T07:28:54Z)
- Coverage Improvement and Fast Convergence of On-policy Preference Learning [67.36750525893514]
Online on-policy preference learning algorithms for language model alignment can significantly outperform their offline counterparts. We analyze how the sampling policy's coverage evolves throughout on-policy training. We develop principled on-policy schemes for reward distillation in the general function class setting.
arXiv Detail & Related papers (2026-01-13T10:46:06Z)
- MAESTRO: Meta-learning Adaptive Estimation of Scalarization Trade-offs for Reward Optimization [56.074760766965085]
Group-Relative Policy Optimization has emerged as an efficient paradigm for aligning Large Language Models (LLMs). We propose MAESTRO, which treats reward scalarization as a dynamic latent policy, leveraging the model's terminal hidden states as a semantic bottleneck. We formulate this as a contextual bandit problem within a bi-level optimization framework, where a lightweight Conductor network co-evolves with the policy by utilizing group-relative advantages as a meta-reward signal.
arXiv Detail & Related papers (2026-01-12T05:02:48Z)
- WaveFuse-AL: Cyclical and Performance-Adaptive Multi-Strategy Active Learning for Medical Images [0.7933039558471408]
We propose Cyclical and Performance-Adaptive Multi-Strategy Active Learning (WaveFuse-AL). WaveFuse-AL fuses multiple established acquisition strategies (BALD, BADGE, Entropy, and CoreSet) throughout the learning process. Experimental results demonstrate that WaveFuse-AL consistently outperforms both single-strategy and alternating-strategy baselines.
arXiv Detail & Related papers (2025-11-19T05:23:23Z)
- Auto-Stega: An Agent-Driven System for Lifelong Strategy Evolution in LLM-Based Text Steganography [21.817549738509346]
Auto-Stega is a framework for self-evolving steganographic strategies. It generates, evaluates, summarizes, and updates strategies at inference time. To handle high embedding rates, we introduce PC-DNTE, a plug-and-play algorithm.
arXiv Detail & Related papers (2025-10-08T01:32:59Z)
- No Free Lunch in Active Learning: LLM Embedding Quality Dictates Query Strategy Success [1.950171084881346]
Large language models (LLMs) capable of producing general-purpose representations let us revisit the practicality of deep active learning (AL). This study establishes a benchmark and systematically investigates the influence of LLM embedding quality on query strategies in deep AL.
arXiv Detail & Related papers (2025-05-18T10:38:26Z)
- Collab: Controlled Decoding using Mixture of Agents for LLM Alignment [90.6117569025754]
Reinforcement learning from human feedback has emerged as an effective technique to align large language models. Controlled Decoding provides a mechanism for aligning a model at inference time without retraining. We propose a mixture of agent-based decoding strategies leveraging the existing off-the-shelf aligned LLM policies.
arXiv Detail & Related papers (2025-03-27T17:34:25Z)
- Combining X-Vectors and Bayesian Batch Active Learning: Two-Stage Active Learning Pipeline for Speech Recognition [0.0]
This paper introduces a novel two-stage active learning (AL) pipeline for automatic speech recognition (ASR). The first stage utilizes unsupervised AL by using x-vector clustering for diverse sample selection from unlabeled speech data. The second stage incorporates a supervised AL strategy, with a batch AL method specifically developed for ASR.
arXiv Detail & Related papers (2024-05-03T19:24:41Z)
- REBEL: Reinforcement Learning via Regressing Relative Rewards [59.68420022466047]
We propose REBEL, a minimalist RL algorithm for the era of generative models. In theory, we prove that fundamental RL algorithms like Natural Policy Gradient can be seen as variants of REBEL. We find that REBEL provides a unified approach to language modeling and image generation, with performance stronger than or similar to PPO and DPO.
arXiv Detail & Related papers (2024-04-25T17:20:45Z)
- SemiReward: A General Reward Model for Semi-supervised Learning [58.47299780978101]
Semi-supervised learning (SSL) has witnessed great progress with various improvements in the self-training framework with pseudo labeling.
The main challenge is how to distinguish high-quality pseudo labels against the confirmation bias.
We propose a Semi-supervised Reward framework (SemiReward) that predicts reward scores to evaluate and filter out high-quality pseudo labels.
arXiv Detail & Related papers (2023-10-04T17:56:41Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n^2) to O(kn) with k ≪ n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- ImitAL: Learned Active Learning Strategy on Synthetic Data [30.595138995552748]
We propose ImitAL, a domain-independent novel query strategy, which encodes AL as a learning-to-rank problem.
We train ImitAL on large-scale simulated AL runs on purely synthetic datasets.
To show that ImitAL was successfully trained, we perform an extensive evaluation comparing our strategy on 13 different datasets.
arXiv Detail & Related papers (2022-08-24T16:17:53Z)