Concept Drift and Long-Tailed Distribution in Fine-Grained Visual Categorization: Benchmark and Method
- URL: http://arxiv.org/abs/2306.02346v2
- Date: Mon, 11 Nov 2024 12:54:35 GMT
- Title: Concept Drift and Long-Tailed Distribution in Fine-Grained Visual Categorization: Benchmark and Method
- Authors: Shuo Ye, Shiming Chen, Ruxin Wang, Tianxu Wu, Jiamiao Xu, Salman Khan, Fahad Shahbaz Khan, Ling Shao
- Abstract summary: We present the Concept Drift and Long-Tailed Distribution (CDLT) dataset.
The characteristics of instances tend to vary over time and exhibit a long-tailed distribution.
We propose a feature recombination framework to address the learning challenges associated with CDLT.
- Score: 84.68818879525568
- License:
- Abstract: Data is the foundation for the development of computer vision, and the establishment of datasets plays an important role in advancing the techniques of fine-grained visual categorization (FGVC). Existing FGVC datasets in computer vision generally assume that each collected instance has fixed characteristics and that the distribution of different categories is relatively balanced. In contrast, real-world scenarios reveal that the characteristics of instances tend to vary over time and exhibit a long-tailed distribution. Hence, the collected datasets may mislead the optimization of fine-grained classifiers, resulting in poor performance in real applications. Starting from real-world conditions and to promote the practical progress of fine-grained visual categorization, we present the Concept Drift and Long-Tailed Distribution (CDLT) dataset. Specifically, the dataset was collected by gathering 11,195 images of 250 instances from different species over 47 consecutive months in their natural contexts. The collection process involved dozens of crowd workers for photographing and domain experts for labeling. Meanwhile, we propose a feature recombination framework to address the learning challenges associated with CDLT. Experimental results validate the efficacy of our method while also highlighting the limitations of popular large vision-language models (e.g., CLIP) in the context of long-tailed distributions. This emphasizes the significance of CDLT as a benchmark for investigating these challenges.
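To make the long-tailed setting in the abstract concrete, the following sketch simulates a power-law-style decay of per-class sample counts from head to tail. This is an illustration of the general phenomenon, not the authors' code; the head-class count and imbalance ratio are arbitrary assumptions (only the class count of 250 matches the dataset).

```python
def long_tailed_counts(num_classes=250, max_count=200, imbalance_ratio=100):
    """Per-class sample counts decaying exponentially from head to tail.

    max_count and imbalance_ratio are illustrative assumptions, not
    values from the CDLT paper.
    """
    counts = []
    for i in range(num_classes):
        # Exponentially interpolate between max_count and
        # max_count / imbalance_ratio across the class index.
        frac = i / (num_classes - 1)
        counts.append(int(max_count * (1.0 / imbalance_ratio) ** frac))
    return counts

counts = long_tailed_counts()
print(counts[0], counts[-1])  # head class gets 200 samples, tail class only 2
```

A classifier trained on such data sees two orders of magnitude more examples of head classes than tail classes, which is why a balanced test set can expose the tail-class weakness the abstract attributes to conventional FGVC datasets.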
Related papers
- Dataset Awareness is not Enough: Implementing Sample-level Tail Encouragement in Long-tailed Self-supervised Learning [16.110763554788445]
We introduce pseudo-labels into self-supervised long-tailed learning, utilizing pseudo-label information to drive a dynamic temperature and re-weighting strategy.
We analyze the lack of quantity awareness in the temperature parameter and use re-weighting to compensate for this deficiency, thereby achieving optimal training patterns at the sample level.
arXiv Detail & Related papers (2024-10-30T10:25:22Z) - Semi-Supervised Fine-Tuning of Vision Foundation Models with Content-Style Decomposition [4.192370959537781]
We present a semi-supervised fine-tuning approach designed to improve the performance of pre-trained foundation models on downstream tasks with limited labeled data.
We evaluate our approach on multiple datasets, including MNIST, its augmented variations, CIFAR-10, SVHN, and GalaxyMNIST.
arXiv Detail & Related papers (2024-10-02T22:36:12Z) - Visual Data Diagnosis and Debiasing with Concept Graphs [50.84781894621378]
We present ConBias, a framework for diagnosing and mitigating Concept co-occurrence Biases in visual datasets.
We show that by employing a novel clique-based concept balancing strategy, we can mitigate these imbalances, leading to enhanced performance on downstream tasks.
arXiv Detail & Related papers (2024-09-26T16:59:01Z) - Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z) - Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z) - Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Generalized Representations Learning for Time Series Classification [28.230863650758447]
We argue that the temporal complexity of time series classification stems from unknown latent distributions within the data.
We present experiments on gesture recognition, speech commands recognition, wearable stress and affect detection, and sensor-based human activity recognition.
arXiv Detail & Related papers (2022-09-15T03:36:31Z) - Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization [89.73665256847858]
We show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts.
Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet.
We also investigate cases where the correlation is weaker, for instance, some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS.
arXiv Detail & Related papers (2021-07-09T19:48:23Z) - Input-Output Balanced Framework for Long-tailed LiDAR Semantic Segmentation [12.639524717464509]
We propose an input-output balanced framework to handle the issue of long-tailed distribution.
For the input space, we synthesize tailed instances from mesh models and faithfully simulate the position and density distributions of LiDAR scans.
For the output space, a multi-head block is proposed to group different categories based on their shapes and instance amounts.
arXiv Detail & Related papers (2021-03-26T05:42:11Z) - Domain Adaptive Transfer Learning on Visual Attention Aware Data Augmentation for Fine-grained Visual Categorization [3.5788754401889014]
We perform domain adaptive knowledge transfer via fine-tuning on our base network model.
We show competitive accuracy improvements by using attention-aware data augmentation techniques.
Our method achieves state-of-the-art results in multiple fine-grained classification datasets.
arXiv Detail & Related papers (2020-10-06T22:47:57Z)
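Several entries above (the sample-level tail-encouragement paper and the input-output balanced framework) revolve around compensating for class imbalance. As a hypothetical illustration of that general idea, and not any listed paper's actual scheme, inverse-frequency class weighting can be sketched as follows:

```python
def inverse_frequency_weights(counts):
    """Weight each class inversely to its sample count, normalized so the
    weights average to 1. A common baseline for long-tailed losses; the
    papers above use more elaborate, quantity-aware variants.
    """
    inv = [1.0 / c for c in counts]
    mean_inv = sum(inv) / len(inv)
    return [w / mean_inv for w in inv]

# A rare class (1 sample) receives a far larger loss weight than a
# frequent one (100 samples), pushing the model to attend to the tail.
weights = inverse_frequency_weights([100, 10, 1])
print(weights)
```

Such weights are typically plugged into a per-class weighted cross-entropy loss; the re-weighting papers above refine this baseline, e.g. by making the weighting dynamic over training.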
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.