Related papers: Self Iterative Label Refinement via Robust Unlabeled Learning

Self Iterative Label Refinement via Robust Unlabeled Learning

URL: http://arxiv.org/abs/2502.12565v1
Date: Tue, 18 Feb 2025 06:04:18 GMT
Title: Self Iterative Label Refinement via Robust Unlabeled Learning
Authors: Hikaru Asano, Tadashi Kozuno, Yukino Baba,
Abstract summary: Self-refinement methods for large language models (LLMs) often suffer from inherent biases and overconfidence.<n>We introduce an iterative refinement pipeline that employs the Unlabeled-Unlabeled learning framework to improve LLM-generated pseudo-labels for classification tasks.<n>Our approach iteratively denoises and refines the initial pseudo-labels, thereby mitigating the adverse effects of internal biases with minimal human supervision.
Score: 9.936885864460896
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in large language models (LLMs) have yielded impressive performance on various tasks, yet they often depend on high-quality feedback that can be costly. Self-refinement methods attempt to leverage LLMs' internal evaluation mechanisms with minimal human supervision; however, these approaches frequently suffer from inherent biases and overconfidence, especially in domains where the models lack sufficient internal knowledge, resulting in performance degradation. As an initial step toward enhancing self-refinement for broader applications, we introduce an iterative refinement pipeline that employs the Unlabeled-Unlabeled learning framework to improve LLM-generated pseudo-labels for classification tasks. By exploiting two unlabeled datasets with differing positive class ratios, our approach iteratively denoises and refines the initial pseudo-labels, thereby mitigating the adverse effects of internal biases with minimal human supervision. Evaluations on diverse datasets, including low-resource language corpora, patent classifications, and protein structure categorizations, demonstrate that our method consistently outperforms both initial LLM's classification performance and the self-refinement approaches by cutting-edge models (e.g., GPT-4o and DeepSeek-R1).

Related papers

T2I-Eval-R1: Reinforcement Learning-Driven Reasoning for Interpretable Text-to-Image Evaluation [60.620408007636016]
We propose T2I-Eval-R1, a novel reinforcement learning framework that trains open-source MLLMs using only coarse-grained quality scores.<n>Our approach integrates Group Relative Policy Optimization into the instruction-tuning process, enabling models to generate both scalar scores and interpretable reasoning chains.
arXiv Detail & Related papers (2025-05-23T13:44:59Z)
Teach2Eval: An Indirect Evaluation Method for LLM by Judging How It Teaches [46.0474342507327]
We introduce Teach2Eval, an indirect evaluation framework inspired by the Feynman Technique.<n>Our method evaluates a model's multiple abilities to teach weaker student models to perform tasks effectively.
arXiv Detail & Related papers (2025-05-18T06:51:10Z)
Feeding LLM Annotations to BERT Classifiers at Your Own Risk [14.533304890042361]
Using LLM-generated labels to fine-tune smaller encoder-only models for text classification has gained popularity in various settings. We demonstrate how the perennial curse of training on synthetic data manifests itself in this specific setup. Compared to models trained on gold labels, we observe not only the expected performance degradation in accuracy and F1 score, but also increased instability across training runs and premature performance plateaus.
arXiv Detail & Related papers (2025-04-21T20:54:55Z)
Revitalizing Reconstruction Models for Multi-class Anomaly Detection via Class-Aware Contrastive Learning [19.114941437668705]
We propose a plug-and-play modification by incorporating class-aware contrastive learning (CL)<n> Experiments across four datasets verify the effectiveness of our approach, yielding significant improvements and superior performance compared to advanced methods.
arXiv Detail & Related papers (2024-12-06T04:31:09Z)
LPLgrad: Optimizing Active Learning Through Gradient Norm Sample Selection and Auxiliary Model Training [2.762397703396293]
Loss Prediction Loss with Gradient Norm (LPLgrad) is designed to quantify model uncertainty effectively and improve the accuracy of image classification tasks. LPLgrad operates in two distinct phases: (i) em Training Phase aims to predict the loss for input features by jointly training a main model and an auxiliary model. This dual-model approach enhances the ability to extract complex input features and learn intrinsic patterns from the data effectively.
arXiv Detail & Related papers (2024-11-20T18:12:59Z)
Dynamic Correlation Learning and Regularization for Multi-Label Confidence Calibration [60.95748658638956]
This paper introduces the Multi-Label Confidence task, aiming to provide well-calibrated confidence scores in multi-label scenarios. Existing single-label calibration methods fail to account for category correlations, which are crucial for addressing semantic confusion. We propose the Dynamic Correlation Learning and Regularization algorithm, which leverages multi-grained semantic correlations to better model semantic confusion.
arXiv Detail & Related papers (2024-07-09T13:26:21Z)
Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights based on the training dynamics of the classifiers to the distantly supervised labels. By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data. The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z)
Self-training Large Language Models through Knowledge Detection [26.831873737733737]
Large language models (LLMs) often necessitate extensive labeled datasets and training compute to achieve impressive performance across downstream tasks. This paper explores a self-training paradigm, where the LLM autonomously curates its own labels and selectively trains on unknown data samples. Empirical evaluations demonstrate significant improvements in reducing hallucination in generation across multiple subjects.
arXiv Detail & Related papers (2024-06-17T07:25:09Z)
Towards Effective Evaluations and Comparisons for LLM Unlearning Methods [97.2995389188179]
This paper seeks to refine the evaluation of machine unlearning for large language models. It addresses two key challenges -- the robustness of evaluation metrics and the trade-offs between competing goals.
arXiv Detail & Related papers (2024-06-13T14:41:00Z)
Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
We propose a new evaluation method, SQC-Score. Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score. Results on three information extraction tasks show that SQC-Score is more preferred by human annotators than the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z)
Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class. Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
Method for unsupervised meta-learning, CACTUs, is a clustering-based approach with pseudo-labeling. This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data. We prove that the core reason for this is lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
Effective Evaluation of Deep Active Learning on Image Classification Tasks [10.27095298129151]
We present a unified re-implementation of state-of-the-art active learning algorithms in the context of image classification. On the positive side, we show that AL techniques are 2x to 4x more label-efficient compared to RS with the use of data augmentation.
arXiv Detail & Related papers (2021-06-16T23:29:39Z)
Two-phase Pseudo Label Densification for Self-training based Domain Adaptation [93.03265290594278]
We propose a novel Two-phase Pseudo Label Densification framework, referred to as TPLD. In the first phase, we use sliding window voting to propagate the confident predictions, utilizing intrinsic spatial-correlations in the images. In the second phase, we perform a confidence-based easy-hard classification. To ease the training process and avoid noisy predictions, we introduce the bootstrapping mechanism to the original self-training loss.
arXiv Detail & Related papers (2020-12-09T02:35:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.