A Square Peg in a Square Hole: Meta-Expert for Long-Tailed Semi-Supervised Learning
- URL: http://arxiv.org/abs/2505.16341v2
- Date: Thu, 03 Jul 2025 11:12:29 GMT
- Title: A Square Peg in a Square Hole: Meta-Expert for Long-Tailed Semi-Supervised Learning
- Authors: Yaxin Hou, Yuheng Jia
- Abstract summary: We study long-tailed semi-supervised learning (LTSSL) with distribution mismatch, where the class distribution of the labeled training data follows a long-tailed distribution. We propose a dynamic expert assignment module that can estimate the class membership of samples. We show that integrating different experts' strengths leads to a smaller generalization error bound.
- Score: 18.911712371699263
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies long-tailed semi-supervised learning (LTSSL) with distribution mismatch, where the class distribution of the labeled training data follows a long-tailed distribution and mismatches that of the unlabeled training data. Most existing methods introduce auxiliary classifiers (experts) to model various unlabeled data distributions and produce pseudo-labels, but the expertise of each expert is not fully utilized. We observe that different experts are good at predicting different intervals of samples; e.g., the long-tailed expert is skilled at samples in the head interval, while the uniform expert excels at samples in the medium interval. Therefore, we propose a dynamic expert assignment module that estimates the class membership (i.e., head, medium, or tail class) of each sample and dynamically assigns a suitable expert to it based on the estimated membership, producing high-quality pseudo-labels in the training phase and predictions in the testing phase. We also theoretically show that integrating different experts' strengths leads to a smaller generalization error bound. Moreover, we find that deeper features are more biased toward the head class but more discriminative, while shallower features are less biased but also less discriminative. We therefore propose a multi-depth feature fusion module that exploits features at different depths to mitigate model bias. Our method demonstrates its effectiveness through comprehensive experiments on the CIFAR-10-LT, STL-10-LT, and SVHN-LT datasets across various settings. The code is available at https://github.com/yaxinhou/Meta-Expert.
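As a concrete illustration of the dynamic expert assignment idea, the sketch below routes each sample to one of three expert heads according to an estimated head/medium/tail membership. This is a minimal PyTorch sketch, not the authors' implementation: the backbone, the interval boundaries, and the membership estimator (averaged expert softmax) are all assumptions, and the multi-depth feature fusion module is omitted.

```python
# Hedged sketch of dynamic expert assignment: three expert heads share a
# backbone, and each sample is routed to the expert matching its estimated
# class interval. Boundaries and the membership estimate are illustrative.
import torch
import torch.nn as nn

class MetaExpertSketch(nn.Module):
    def __init__(self, feat_dim=64, num_classes=10, head_end=3, medium_end=7):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim), nn.ReLU())
        # One head per expert; in the paper each expert targets a different
        # (long-tailed / uniform / inverse) class distribution.
        self.experts = nn.ModuleList(nn.Linear(feat_dim, num_classes) for _ in range(3))
        self.head_end, self.medium_end = head_end, medium_end  # interval bounds (assumed)

    def forward(self, x):
        z = self.backbone(x)
        logits = torch.stack([e(z) for e in self.experts])   # (3, B, C)
        # Crude membership estimate: average the experts' softmax outputs and
        # check which interval the argmax class falls into.
        pred_class = logits.softmax(-1).mean(0).argmax(-1)   # (B,)
        membership = torch.full_like(pred_class, 2)          # default: tail expert
        membership[pred_class < self.medium_end] = 1         # medium -> uniform expert
        membership[pred_class < self.head_end] = 0           # head -> long-tailed expert
        return logits[membership, torch.arange(x.size(0))]   # one expert per sample

model = MetaExpertSketch()
print(model(torch.randn(8, 3, 32, 32)).shape)  # torch.Size([8, 10])
```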
Related papers
- Multimodal Distillation-Driven Ensemble Learning for Long-Tailed Histopathology Whole Slide Images Analysis [16.01677300903562]
Multiple Instance Learning (MIL) plays a significant role in computational pathology, enabling weakly supervised analysis of Whole Slide Image (WSI) datasets. We propose an ensemble learning method based on MIL, which employs expert decoders with shared aggregators to learn diverse distributions. We introduce a multimodal distillation framework that leverages text encoders pre-trained on pathology-text pairs to distill knowledge. Our method, MDE-MIL, integrates multiple expert branches focusing on specific data distributions to address long-tailed issues.
arXiv Detail & Related papers (2025-03-02T14:31:45Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
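A minimal sketch of the per-class distribution-estimation idea: fit a simple diagonal Gaussian to each class's normalized features and score samples by log-likelihood. ProCo's actual estimator and contrastive loss differ from this; all names are illustrative, and each class is assumed to have at least one sample.

```python
# Hedged sketch: per-class diagonal Gaussians in feature space, scored by
# log-density. ProCo's real formulation is different; this shows the idea.
import torch

def fit_class_gaussians(features, labels, num_classes, eps=1e-5):
    """Per-class mean/variance of L2-normalized features (each class non-empty)."""
    means, variances = [], []
    for c in range(num_classes):
        fc = features[labels == c]
        means.append(fc.mean(0))
        variances.append(fc.var(0) + eps)
    return torch.stack(means), torch.stack(variances)

def log_likelihood(features, means, variances):
    # (B, 1, D) vs (1, C, D) -> diagonal-Gaussian log-density per class, (B, C)
    diff = features[:, None, :] - means[None]
    return -0.5 * ((diff ** 2 / variances[None]) + variances[None].log()).sum(-1)

feats = torch.nn.functional.normalize(torch.randn(256, 32), dim=-1)
labels = torch.randint(0, 10, (256,))
mu, var = fit_class_gaussians(feats, labels, 10)
print(log_likelihood(feats[:4], mu, var).shape)  # torch.Size([4, 10])
```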
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Three Heads Are Better Than One: Complementary Experts for Long-Tailed Semi-supervised Learning [74.44500692632778]
We propose a novel method named ComPlementary Experts (CPE) to model various class distributions.
CPE achieves state-of-the-art performance on the CIFAR-10-LT, CIFAR-100-LT, and STL-10-LT benchmarks.
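The complementary-experts idea can be sketched with logit-adjusted cross-entropy: each head is trained with a different adjustment strength tau, which steers its effective class prior toward long-tailed, uniform, or inverted. The taus and toy prior below are illustrative, not CPE's exact recipe.

```python
# Hedged sketch: three heads trained with different logit-adjustment
# strengths so each one models a different class distribution.
import torch
import torch.nn.functional as F

def adjusted_ce(logits, targets, class_prior, tau):
    # Adding tau * log(prior) during training shifts the head's implicit
    # target distribution; varying tau yields complementary experts.
    return F.cross_entropy(logits + tau * class_prior.log(), targets)

counts = torch.tensor([500., 200., 80., 30., 10.])  # toy long-tailed counts
prior = counts / counts.sum()
heads = [torch.nn.Linear(32, 5) for _ in range(3)]
feats = torch.randn(16, 32)
targets = torch.randint(0, 5, (16,))
loss = sum(adjusted_ce(h(feats), targets, prior, tau)
           for h, tau in zip(heads, (1.0, 0.0, -1.0)))
loss.backward()
print(loss.item())
```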
arXiv Detail & Related papers (2023-12-25T11:54:07Z)
- Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation [53.94265240561697]
We propose a novel MoE-based method called Self-Heterogeneous Integration with Knowledge Excavation (SHIKE).
SHIKE achieves the state-of-the-art performance of 56.3%, 60.3%, 75.4%, and 41.9% on CIFAR100-LT (IF100), ImageNet-LT, iNaturalist 2018, and Places-LT, respectively.
arXiv Detail & Related papers (2023-04-03T18:13:39Z)
- Balanced Product of Calibrated Experts for Long-Tailed Recognition [13.194151879344487]
Many real-world recognition problems are characterized by long-tailed label distributions.
In this work, we take an analytical approach and extend the notion of logit adjustment to ensembles to form a Balanced Product of Experts (BalPoE).
We show how to properly define these distributions and combine the experts in order to achieve unbiased predictions.
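A hedged sketch of the product-of-experts combination: averaging the experts' log-probabilities takes the geometric mean of their predictive distributions, and a final renormalization yields the fused prediction. BalPoE's per-expert calibration and target priors are omitted here.

```python
# Hedged sketch of a product-of-experts fusion; the per-expert logit
# adjustments that make it "balanced" in BalPoE are left out.
import torch

def product_of_experts(expert_logits):
    # Mean of per-expert log-probabilities = log geometric mean of their
    # distributions; the final log_softmax renormalizes the result.
    return expert_logits.log_softmax(-1).mean(0).log_softmax(-1)

logits = torch.randn(3, 8, 10)        # 3 experts, batch of 8, 10 classes
fused = product_of_experts(logits)
print(fused.exp().sum(-1))            # each row sums to 1
```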
arXiv Detail & Related papers (2022-06-10T17:59:02Z)
- Test-Agnostic Long-Tailed Recognition by Test-Time Aggregating Diverse Experts with Self-Supervision [85.07855130048951]
We study a more practical task setting, called test-agnostic long-tailed recognition, where the training class distribution is long-tailed.
We propose a new method, called Test-time Aggregating Diverse Experts (TADE), that trains diverse experts to excel at handling different test distributions.
We theoretically show that our method has provable ability to simulate unknown test class distributions.
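A minimal sketch of the test-time aggregation idea: with frozen experts, learn aggregation weights that maximize agreement of the weighted prediction across two augmented views of the same unlabeled test data. The random logits and hyperparameters below are placeholders, not TADE's exact objective.

```python
# Hedged sketch: tune expert-aggregation weights at test time using a
# self-supervised stability signal (agreement across augmented views).
import torch

expert_logits_v1 = torch.randn(3, 64, 10)  # 3 frozen experts, view 1
expert_logits_v2 = torch.randn(3, 64, 10)  # same images, view 2
w = torch.zeros(3, requires_grad=True)     # aggregation weights to learn
opt = torch.optim.SGD([w], lr=0.1)

for _ in range(50):
    opt.zero_grad()
    weights = w.softmax(0)[:, None, None]
    p1 = (weights * expert_logits_v1.softmax(-1)).sum(0)
    p2 = (weights * expert_logits_v2.softmax(-1)).sum(0)
    # Maximize prediction agreement between the two views.
    loss = -torch.nn.functional.cosine_similarity(p1, p2, dim=-1).mean()
    loss.backward()
    opt.step()

print(w.softmax(0))  # learned expert weights
```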
arXiv Detail & Related papers (2021-07-20T04:10:31Z)
- Multi-Class Data Description for Out-of-distribution Detection [25.853322158250435]
Deep-MCDD is effective to detect out-of-distribution (OOD) samples as well as classify in-distribution (ID) samples.
By integrating the concept of Gaussian discriminant analysis into deep neural networks, we propose a deep learning objective to learn class-conditional distributions.
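The scoring step can be sketched as nearest-class-center distance in feature space: samples far from every class center are flagged as OOD. Deep-MCDD learns the class-conditional distributions jointly with the network; this sketch assumes the centers are given and uses Euclidean rather than full Mahalanobis distance.

```python
# Hedged sketch of distance-based OOD scoring against per-class centers;
# Deep-MCDD's learned class-conditional distributions are assumed given.
import torch

def ood_score(features, centers):
    # Distance to the nearest class center; larger = more likely OOD.
    return torch.cdist(features, centers).min(dim=-1).values

centers = torch.randn(10, 32)                    # per-class centers (assumed)
in_dist = centers[0] + 0.1 * torch.randn(5, 32)  # near a known class
out_dist = 5.0 * torch.randn(5, 32)              # far from all classes
print(ood_score(in_dist, centers))   # small distances -> in-distribution
print(ood_score(out_dist, centers))  # large distances -> OOD
```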
arXiv Detail & Related papers (2021-04-02T08:41:51Z)
- Long-tailed Recognition by Routing Diverse Distribution-Aware Experts [64.71102030006422]
We propose a new long-tailed classifier called RoutIng Diverse Experts (RIDE).
It reduces model variance with multiple experts, reduces model bias with a distribution-aware diversity loss, and reduces computational cost with a dynamic expert routing module.
RIDE outperforms the state-of-the-art by 5% to 7% on CIFAR100-LT, ImageNet-LT and iNaturalist 2018 benchmarks.
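A hedged sketch of dynamic routing: experts are run one at a time and a sample exits early once the running average prediction is confident enough. RIDE trains a small router network for this decision; the fixed confidence threshold below is an illustrative stand-in.

```python
# Hedged sketch of early-exit expert routing: easy samples consume fewer
# experts. The threshold replaces RIDE's learned router.
import torch

def route(expert_logits, threshold=0.9):
    # expert_logits: (E, B, C); returns fused logits and experts used per sample
    B = expert_logits.size(1)
    fused = torch.zeros_like(expert_logits[0])
    used = torch.zeros(B, dtype=torch.long)
    done = torch.zeros(B, dtype=torch.bool)
    for e, logits in enumerate(expert_logits, start=1):
        fused[~done] = fused[~done] + logits[~done]   # add next expert if not exited
        used[~done] = e
        conf = (fused / used[:, None]).softmax(-1).max(-1).values
        done = done | (conf > threshold)              # confident samples exit
        if done.all():
            break
    return fused / used[:, None], used

fused, used = route(torch.randn(3, 8, 10) * 3)
print(used)  # number of experts consumed per sample
```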
arXiv Detail & Related papers (2020-10-05T06:53:44Z)
- Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification [106.08067870620218]
We propose a self-paced knowledge distillation framework, termed Learning From Multiple Experts (LFME).
We refer to these models as 'Experts', and the proposed LFME framework aggregates the knowledge from multiple 'Experts' to learn a unified student model.
We conduct extensive experiments and demonstrate that our method is able to achieve superior performance compared to state-of-the-art methods.
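The aggregation step can be sketched as multi-teacher knowledge distillation: the student matches a weighted mixture of softened teacher predictions via KL divergence. LFME's self-paced weighting schedules are elided; the fixed weights and temperature below are illustrative.

```python
# Hedged sketch of distilling several expert teachers into one student
# with a temperature-scaled KL objective; LFME's self-paced scheduling
# of instance/expert weights is omitted.
import torch
import torch.nn.functional as F

def multi_teacher_kd(student_logits, teacher_logits, weights, T=2.0):
    # teacher_logits: (E, B, C); weights: (E,) importance per expert
    log_p_s = F.log_softmax(student_logits / T, dim=-1)
    loss = 0.0
    for w, t in zip(weights, teacher_logits):
        p_t = F.softmax(t / T, dim=-1)
        loss = loss + w * F.kl_div(log_p_s, p_t, reduction="batchmean") * T * T
    return loss

student = torch.randn(16, 10, requires_grad=True)
teachers = torch.randn(3, 16, 10)  # 3 frozen expert teachers
loss = multi_teacher_kd(student, teachers, weights=torch.tensor([0.5, 0.3, 0.2]))
loss.backward()
print(loss.item())
```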
arXiv Detail & Related papers (2020-01-06T12:57:36Z)