Compositional Attribute Imbalance in Vision Datasets
- URL: http://arxiv.org/abs/2506.14418v1
- Date: Tue, 17 Jun 2025 11:28:07 GMT
- Title: Compositional Attribute Imbalance in Vision Datasets
- Authors: Jiayi Chen, Yanbiao Ma, Andi Zhang, Weidong Tang, Wei Dai, Bowei Liu
- Abstract summary: We introduce a CLIP-based framework to construct a visual attribute dictionary, enabling automatic evaluation of image attributes. By analyzing both single-attribute imbalance and compositional attribute imbalance, we reveal how the rarity of attributes affects model performance. Our research highlights the importance of modeling visual attribute distributions and provides a scalable solution for long-tail image classification tasks.
- Score: 7.018788111043557
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual attribute imbalance is a common yet underexplored issue in image classification, significantly impacting model performance and generalization. In this work, we first define the first-level and second-level attributes of images and then introduce a CLIP-based framework to construct a visual attribute dictionary, enabling automatic evaluation of image attributes. By systematically analyzing both single-attribute imbalance and compositional attribute imbalance, we reveal how the rarity of attributes affects model performance. To tackle these challenges, we propose adjusting the sampling probability of samples based on the rarity of their compositional attributes. This strategy is further integrated with various data augmentation techniques (such as CutMix, Fmix, and SaliencyMix) to enhance the model's ability to represent rare attributes. Extensive experiments on benchmark datasets demonstrate that our method effectively mitigates attribute imbalance, thereby improving the robustness and fairness of deep neural networks. Our research highlights the importance of modeling visual attribute distributions and provides a scalable solution for long-tail image classification tasks.
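The abstract proposes adjusting each sample's sampling probability according to the rarity of its compositional attributes. The paper's exact formulation is not given here, so the following is a minimal sketch of one plausible reading: count how often each attribute combination occurs, weight each sample by the inverse frequency of its combination (raised to a tunable exponent `alpha`), and normalize. The function name `rarity_weights` and the `alpha` parameter are hypothetical, not from the paper.

```python
import numpy as np

def rarity_weights(attribute_combos, alpha=1.0):
    """Sketch of rarity-based sampling weights (hypothetical formulation).

    attribute_combos: (N, K) array, one row of K attribute codes per sample.
    alpha: exponent controlling how strongly rarity is up-weighted.
    Returns an (N,) array of probabilities summing to 1, where samples
    whose attribute combination is rarer receive larger weight.
    """
    combos = np.asarray(attribute_combos)
    # Count occurrences of each distinct attribute combination.
    unique_combos, counts = np.unique(combos, axis=0, return_counts=True)
    freq = {tuple(c): n for c, n in zip(unique_combos, counts)}
    # Inverse-frequency weighting: rare combinations get larger raw weight.
    raw = np.array([1.0 / freq[tuple(row)] ** alpha for row in combos])
    return raw / raw.sum()
```

In a PyTorch pipeline these weights could feed `torch.utils.data.WeightedRandomSampler`, and each drawn sample could then pass through CutMix, FMix, or SaliencyMix as the abstract describes; that integration is omitted here.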
Related papers
- Predicting Scores of Various Aesthetic Attribute Sets by Learning from Overall Score Labels [54.63611854474985]
In this paper, we propose to replace image attribute labels with feature extractors.
We use networks from different tasks to provide attribute features to our F2S model.
Our method makes it feasible to learn meaningful attribute scores for various aesthetic attribute sets in different types of images with only overall aesthetic scores.
arXiv Detail & Related papers (2023-12-06T01:41:49Z)
- Attribute-Aware Deep Hashing with Self-Consistency for Large-Scale Fine-Grained Image Retrieval [65.43522019468976]
We propose attribute-aware hashing networks with self-consistency for generating attribute-aware hash codes.
We develop an encoder-decoder structure network of a reconstruction task to unsupervisedly distill high-level attribute-specific vectors.
Our models are equipped with a feature decorrelation constraint upon these attribute vectors to strengthen their representative abilities.
arXiv Detail & Related papers (2023-11-21T08:20:38Z)
- Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning [52.506434446439776]
Compositional zero-shot learning (CZSL) aims to recognize compositions using prior knowledge of known primitives (attributes and objects).
We propose a simple and scalable framework called Composition Transformer (CoT) to address these issues.
Our method achieves SoTA performance on several benchmarks, including MIT-States, C-GQA, and VAW-CZSL.
arXiv Detail & Related papers (2023-08-08T03:24:21Z)
- Attribute-Centric Compositional Text-to-Image Generation [45.12516226662346]
ACTIG is an attribute-centric compositional text-to-image generation framework.
We present an attribute-centric feature augmentation and a novel image-free training scheme.
We validate our framework on the CelebA-HQ and CUB datasets.
arXiv Detail & Related papers (2023-01-04T03:03:08Z)
- CAT: Controllable Attribute Translation for Fair Facial Attribute Classification [14.191129493685212]
In facial attribute classification, dataset bias stems from both protected attribute level and facial attribute level.
We propose an effective pipeline to generate high-quality and sufficient facial images with desired facial attributes.
Our method outperforms both resampling and balanced dataset construction in addressing dataset bias.
arXiv Detail & Related papers (2022-09-14T18:04:20Z)
- Attribute Prototype Network for Any-Shot Learning [113.50220968583353]
We argue that an image representation with integrated attribute localization ability would be beneficial for any-shot, i.e. zero-shot and few-shot, image classification tasks.
We propose a novel representation learning framework that jointly learns global and local features using only class-level attributes.
arXiv Detail & Related papers (2022-04-04T02:25:40Z)
- Learning to Infer Unseen Attribute-Object Compositions [55.58107964602103]
A graph-based model is proposed that can flexibly recognize both single- and multi-attribute-object compositions.
We build a large-scale Multi-Attribute dataset with 116,099 images and 8,030 composition categories.
arXiv Detail & Related papers (2020-10-27T14:57:35Z)
- Attribute Mix: Semantic Data Augmentation for Fine Grained Recognition [102.45926816660665]
We propose Attribute Mix, a data augmentation strategy at attribute level to expand the fine-grained samples.
The principle lies in that attribute features are shared among fine-grained sub-categories, and can be seamlessly transferred among images.
arXiv Detail & Related papers (2020-04-06T14:06:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.