ActiveDC: Distribution Calibration for Active Finetuning
- URL: http://arxiv.org/abs/2311.07634v3
- Date: Tue, 27 Feb 2024 07:52:16 GMT
- Title: ActiveDC: Distribution Calibration for Active Finetuning
- Authors: Wenshuai Xu, Zhenghui Hu, Yu Lu, Jinzhou Meng, Qingjie Liu, Yunhong
Wang
- Abstract summary: We propose a new method called ActiveDC for the active finetuning tasks.
We calibrate the distribution of the selected samples by exploiting implicit category information in the unlabeled pool.
The results indicate that ActiveDC consistently outperforms the baseline performance in all image classification tasks.
- Score: 36.64444238742072
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The pretraining-finetuning paradigm has gained popularity in various computer
vision tasks. In this paradigm, the emergence of active finetuning arises due
to the abundance of large-scale data and costly annotation requirements. Active
finetuning involves selecting a subset of data from an unlabeled pool for
annotation, facilitating subsequent finetuning. However, the use of a limited
number of training samples can lead to a biased distribution, potentially
resulting in model overfitting. In this paper, we propose a new method called
ActiveDC for the active finetuning tasks. Firstly, we select samples for
annotation by optimizing the distribution similarity between the subset to be
selected and the entire unlabeled pool in continuous space. Secondly, we
calibrate the distribution of the selected samples by exploiting implicit
category information in the unlabeled pool. The feature visualization provides
an intuitive sense of the effectiveness of our approach to distribution
calibration. We conducted extensive experiments on three image classification
datasets with different sampling ratios. The results indicate that ActiveDC
consistently outperforms the baseline performance in all image classification
tasks. The improvement is particularly significant when the sampling ratio is
low, with performance gains of up to 10%. Our code will be released.
Related papers
- Dataset Quantization with Active Learning based Adaptive Sampling [11.157462442942775]
We show that maintaining performance is feasible even with uneven sample distributions.
We propose a novel active learning based adaptive sampling strategy to optimize the sample selection.
Our approach outperforms the state-of-the-art dataset compression methods.
arXiv Detail & Related papers (2024-07-09T23:09:18Z) - Discover Your Neighbors: Advanced Stable Test-Time Adaptation in Dynamic World [8.332531696256666]
Discover Your Neighbours (DYN) is the first backward-free approach specialized for dynamic test-time adaptation (TTA)
Our DYN consists of layer-wise instance statistics clustering (LISC) and cluster-aware batch normalization (CABN)
Experimental results validate DYN's robustness and effectiveness, demonstrating maintained performance under dynamic data stream patterns.
arXiv Detail & Related papers (2024-06-08T09:22:32Z) - Training-Free Unsupervised Prompt for Vision-Language Models [27.13778811871694]
We propose Training-Free Unsupervised Prompts (TFUP) to preserve inherent representation capabilities and enhance them with a residual connection to similarity-based prediction probabilities.
TFUP achieves surprising performance, even surpassing the training-base method on multiple classification datasets.
Our TFUP-T achieves new state-of-the-art classification performance compared to unsupervised and few-shot adaptation approaches on multiple benchmarks.
arXiv Detail & Related papers (2024-04-25T05:07:50Z) - Boundary Matters: A Bi-Level Active Finetuning Framework [100.45000039215495]
The concept of active finetuning has emerged, aiming to select the most appropriate samples for model finetuning within a limited budget.
Traditional active learning methods often struggle in this setting due to their inherent bias in batch selection.
We propose a Bi-Level Active Finetuning framework to select the samples for annotation in one shot, which includes two stages: core sample selection for diversity, and boundary sample selection for uncertainty.
arXiv Detail & Related papers (2024-03-15T07:19:15Z) - VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness [56.87603097348203]
VeCAF uses labels and natural language annotations to perform parametric data selection for PVM finetuning.
VeCAF incorporates the finetuning objective to select significant data points that effectively guide the PVM towards faster convergence.
On ImageNet, VeCAF uses up to 3.3x less training batches to reach the target performance compared to full finetuning.
arXiv Detail & Related papers (2024-01-15T17:28:37Z) - On the Trade-off of Intra-/Inter-class Diversity for Supervised
Pre-training [72.8087629914444]
We study the impact of the trade-off between the intra-class diversity (the number of samples per class) and the inter-class diversity (the number of classes) of a supervised pre-training dataset.
With the size of the pre-training dataset fixed, the best downstream performance comes with a balance on the intra-/inter-class diversity.
arXiv Detail & Related papers (2023-05-20T16:23:50Z) - Active Finetuning: Exploiting Annotation Budget in the
Pretraining-Finetuning Paradigm [132.9949120482274]
This paper focuses on the selection of samples for annotation in the pretraining-finetuning paradigm.
We propose a novel method called ActiveFT for active finetuning task to select a subset of data distributing similarly with the entire unlabeled pool.
Extensive experiments show the leading performance and high efficiency of ActiveFT superior to baselines on both image classification and semantic segmentation.
arXiv Detail & Related papers (2023-03-25T07:17:03Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking
Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.