Vision-language Assisted Attribute Learning
- URL: http://arxiv.org/abs/2312.07009v2
- Date: Fri, 15 Dec 2023 02:40:29 GMT
- Title: Vision-language Assisted Attribute Learning
- Authors: Kongming Liang, Xinran Wang, Rui Wang, Donghui Gao, Ling Jin, Weidong Liu, Xiatian Zhu, Zhanyu Ma, Jun Guo
- Abstract summary: Attribute labeling at large scale is typically incomplete and partial.
Existing attribute learning methods often treat the missing labels as negative or simply ignore them all during training.
We leverage the available vision-language knowledge to explicitly disclose the missing labels and enhance model learning.
- Score: 53.60196963381315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Attribute labeling at large scale is typically incomplete and partial, posing significant challenges to model optimization. Existing attribute learning methods often treat the missing labels as negative or simply ignore them all during training, either of which can substantially hamper model performance. To overcome these limitations, in this paper we leverage the available vision-language knowledge to explicitly disclose the missing labels and enhance model learning. Given an image, we predict the likelihood of each missing attribute label with an off-the-shelf vision-language model, and we randomly select a subset of the labels with high scores to ignore during training. Our strategy strikes a good balance between fully ignoring the missing labels and negating them all, as these high scores prove informative in revealing label ambiguity. Extensive experiments show that the proposed vision-language assisted loss achieves state-of-the-art performance on the newly cleaned VAW dataset. Qualitative evaluation further demonstrates that the method predicts more complete attributes.
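For concreteness, here is a minimal PyTorch sketch of the selection strategy the abstract describes. The function name, the score threshold `tau`, and the drop probability `ignore_prob` are illustrative assumptions rather than the paper's implementation, and the VLM scores are assumed to come from an off-the-shelf model such as CLIP.

```python
import torch
import torch.nn.functional as F

def vl_assisted_bce(logits, labels, vlm_scores, tau=0.7, ignore_prob=0.5):
    """Binary cross-entropy over attributes with VL-assisted handling
    of missing labels (illustrative sketch, not the authors' code).

    logits:     (B, A) model outputs, one logit per attribute.
    labels:     (B, A) with 1 = positive, 0 = negative, -1 = missing.
    vlm_scores: (B, A) likelihoods from an off-the-shelf vision-language
                model, consulted only where labels == -1.
    """
    targets = labels.clamp(min=0).float()   # missing -> provisional negative
    weights = torch.ones_like(targets)

    # Missing labels that the VLM scores highly are likely ambiguous:
    # randomly drop a fraction of them from the loss instead of
    # negating them all or ignoring them all.
    missing = labels == -1
    ambiguous = missing & (vlm_scores > tau)
    drop = ambiguous & (torch.rand_like(vlm_scores) < ignore_prob)
    weights[drop] = 0.0

    loss = F.binary_cross_entropy_with_logits(
        logits, targets, weight=weights, reduction="sum"
    )
    return loss / weights.sum().clamp(min=1.0)
```

Because only a random subset of the high-scoring missing labels is dropped, the rest still act as negatives, which is what balances full ignoring against full negation.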
Related papers
- Leveraging vision-language models for fair facial attribute classification [19.93324644519412]
A general-purpose vision-language model (VLM) is a rich source of knowledge about common sensitive attributes.
We analyze the correspondence between the VLM-predicted and the human-defined sensitive attribute distributions.
Experiments on multiple benchmark facial attribute classification datasets show fairness gains over existing unsupervised baselines.
arXiv Detail & Related papers (2024-03-15T18:37:15Z)
- A Self Supervised StyleGAN for Image Annotation and Classification with Extremely Limited Labels [35.43549147657739]
We propose SS-StyleGAN, a self-supervised approach for image annotation and classification suitable for extremely small annotated datasets.
We show that the proposed method attains strong classification results using labeled datasets as small as 50, and even 10, samples.
arXiv Detail & Related papers (2023-12-26T09:46:50Z)
- ERASE: Error-Resilient Representation Learning on Graphs for Label Noise Tolerance [53.73316938815873]
We propose a method called ERASE (Error-Resilient representation learning on graphs for lAbel noiSe tolerancE) to learn representations with error tolerance.
ERASE combines prototype pseudo-labels with propagated denoised labels and updates representations with error resilience.
Our method outperforms multiple baselines by clear margins across a broad range of noise levels and scales well.
arXiv Detail & Related papers (2023-12-13T17:59:07Z)
- Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample such that it can safely contribute to model optimization.
Our findings highlight the promise of VC learning for dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z)
- Robust Feature Learning Against Noisy Labels [0.2082426271304908]
Mislabeled samples can significantly degrade the generalization of models.
Progressive self-bootstrapping is introduced to minimize the negative impact of supervision from noisy labels.
Experimental results show that our proposed method can efficiently and effectively enhance model robustness under severely noisy labels.
arXiv Detail & Related papers (2023-07-10T02:55:35Z)
- Exploiting Semantic Attributes for Transductive Zero-Shot Learning [97.61371730534258]
Zero-shot learning aims to recognize unseen classes by generalizing the relation between visual features and semantic attributes learned from the seen classes.
We present a novel transductive ZSL method that produces semantic attributes of the unseen data and imposes them on the generative process.
Experiments on five standard benchmarks show that our method yields state-of-the-art results for zero-shot learning.
arXiv Detail & Related papers (2023-03-17T09:09:48Z)
- SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training; a generic sketch of this soft weighting idea appears after this list.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z)
- Semi-FairVAE: Semi-supervised Fair Representation Learning with Adversarial Variational Autoencoder [92.67156911466397]
We propose a semi-supervised fair representation learning approach based on an adversarial variational autoencoder.
We use a bias-aware model to capture inherent bias information on sensitive attributes.
We also use a bias-free model that learns debiased fair representations, applying adversarial learning to remove bias information from them.
arXiv Detail & Related papers (2022-04-01T15:57:47Z)
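Several of the entries above hinge on weighting or discarding uncertain pseudo-labels rather than hard-thresholding them. As referenced in the SoftMatch entry, here is a generic sketch of soft confidence weighting in that spirit; the Gaussian form and the batch statistics are illustrative assumptions, not the authors' released code.

```python
import torch

def soft_pseudo_label_weights(probs, mu=None, var=None):
    """Soft confidence weights for pseudo-labels (illustrative sketch).

    probs: (B, C) predicted class probabilities on unlabeled data.
    Instead of a hard confidence threshold (the quantity-quality
    trade-off), every sample keeps a weight in (0, 1]: confident
    samples count fully, uncertain ones are down-weighted rather
    than discarded.
    """
    conf, pseudo = probs.max(dim=1)  # per-sample confidence and pseudo-label
    if mu is None:
        mu = conf.mean()                                # batch mean confidence
    if var is None:
        var = conf.var(unbiased=False).clamp(min=1e-6)  # batch variance
    # Truncated-Gaussian weighting: full weight at or above the mean
    # confidence, smoothly decaying below it.
    weights = torch.where(
        conf >= mu,
        torch.ones_like(conf),
        torch.exp(-(conf - mu) ** 2 / (2 * var)),
    )
    return weights, pseudo
```

The returned weights can multiply a per-sample loss on the pseudo-labels, so low-confidence predictions still contribute, just less.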
This list is automatically generated from the titles and abstracts of the papers on this site.