Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition
- URL: http://arxiv.org/abs/2405.07780v1
- Date: Mon, 13 May 2024 14:24:56 GMT
- Title: Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition
- Authors: Zhiyong Yang, Qianqian Xu, Zitai Wang, Sicong Li, Boyu Han, Shilong Bao, Xiaochun Cao, Qingming Huang
- Abstract summary: We argue that the variation in test label distributions can be broken down hierarchically into global and local levels.
We propose a new MoE strategy, $\mathsf{DirMixE}$, which assigns experts to different Dirichlet meta-distributions of the label distribution.
We show that our proposed objective benefits from enhanced generalization by virtue of the variance-based regularization.
- Score: 114.96385572118042
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper explores test-agnostic long-tail recognition, a challenging long-tail task where the test label distributions are unknown and arbitrarily imbalanced. We argue that the variation in these distributions can be broken down hierarchically into global and local levels. The global ones reflect a broad range of diversity, while the local ones typically arise from milder changes, often focused on a particular neighbor. Traditional methods predominantly use a Mixture-of-Expert (MoE) approach, targeting a few fixed test label distributions that exhibit substantial global variations. However, the local variations are left unconsidered. To address this issue, we propose a new MoE strategy, $\mathsf{DirMixE}$, which assigns experts to different Dirichlet meta-distributions of the label distribution, each targeting a specific aspect of local variations. Additionally, the diversity among these Dirichlet meta-distributions inherently captures global variations. This dual-level approach also leads to a more stable objective function, allowing us to sample different test distributions better to quantify the mean and variance of performance outcomes. Theoretically, we show that our proposed objective benefits from enhanced generalization by virtue of the variance-based regularization. Comprehensive experiments across multiple benchmarks confirm the effectiveness of $\mathsf{DirMixE}$. The code is available at \url{https://github.com/scongl/DirMixE}.
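As a rough illustration of the dual-level idea, the sketch below ties each expert to its own Dirichlet meta-distribution over the label simplex, samples test label distributions from it, and combines the mean and variance of a performance surrogate into one objective. All names (`alphas`, `performance`, the weight `lam`) are illustrative assumptions, not the authors' implementation; see the linked repository for the real code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 experts, each tied to its own Dirichlet
# meta-distribution over a C-class label simplex. Distinct alphas
# capture global variation; sampling within each Dirichlet captures
# local variation around that expert's target region.
C = 10
alphas = [
    np.ones(C) * 5.0,              # near-uniform test distributions
    np.linspace(5.0, 0.5, C),      # forward long-tailed
    np.linspace(0.5, 5.0, C),      # backward long-tailed
]

def performance(label_dist):
    """Stand-in for a per-distribution metric.

    A real implementation would re-weight per-class accuracies of the
    matching expert by the sampled test label distribution.
    """
    per_class_acc = np.linspace(0.9, 0.4, C)  # toy: head classes easier
    return float(label_dist @ per_class_acc)

# Sample test label distributions from each meta-distribution and
# estimate the mean and variance of the metric; the variance term
# plays the role of the regularizer credited in the abstract.
lam = 0.1  # illustrative weight on the variance penalty
for k, alpha in enumerate(alphas):
    samples = rng.dirichlet(alpha, size=64)
    scores = np.array([performance(p) for p in samples])
    objective = scores.mean() - lam * scores.var()
    print(f"expert {k}: mean={scores.mean():.3f} "
          f"var={scores.var():.4f} objective={objective:.3f}")
```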
Related papers
- Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety.
Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
arXiv Detail & Related papers (2024-11-06T11:03:02Z)
- Theory-inspired Label Shift Adaptation via Aligned Distribution Mixture [21.494268411607766]
We propose an innovative label shift framework named Aligned Distribution Mixture (ADM).
Within this framework, we enhance four typical label shift methods by introducing modifications to the classifier training process.
Considering the distinctiveness of the proposed one-step approach, we develop an efficient bi-level optimization strategy.
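For context, the sketch below shows the standard prior-ratio correction that typical label shift methods share; ADM's specific estimation procedure, classifier-training modifications, and bi-level optimization are not reproduced here.

```python
import numpy as np

def adjust_for_label_shift(probs, source_prior, target_prior):
    """Re-weight source-trained posteriors by the class-prior ratio.

    This is the standard correction underlying typical label shift
    methods (p_t(y|x) proportional to p_s(y|x) * q(y)/p(y)); per the
    summary, ADM's contribution lies in how the target prior is
    estimated and in the modified training, which is not shown here.
    """
    w = target_prior / source_prior
    adjusted = probs * w
    return adjusted / adjusted.sum(axis=1, keepdims=True)

# Toy example: a 3-class problem whose class prior flips at test time.
source_prior = np.array([0.6, 0.3, 0.1])
target_prior = np.array([0.1, 0.3, 0.6])
probs = np.array([[0.5, 0.3, 0.2],
                  [0.2, 0.5, 0.3]])
print(adjust_for_label_shift(probs, source_prior, target_prior))
```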
arXiv Detail & Related papers (2024-11-04T12:51:57Z)
- SoftCVI: Contrastive variational inference with self-generated soft labels [2.5398014196797614]
Variational inference and Markov chain Monte Carlo methods are the predominant tools for approximating intractable posterior distributions.
We introduce Soft Contrastive Variational Inference (SoftCVI), which allows a family of variational objectives to be derived through a contrastive estimation framework.
We find that SoftCVI can be used to form objectives which are stable to train and mass-covering, frequently outperforming inference with other variational approaches.
arXiv Detail & Related papers (2024-07-22T14:54:12Z)
- Probabilistic Test-Time Generalization by Variational Neighbor-Labeling [62.158807685159736]
This paper strives for domain generalization, where models are trained exclusively on source domains before being deployed on unseen target domains.
The approach applies probabilistic pseudo-labeling to target samples to generalize the source-trained model to the target domain at test time.
Variational neighbor labels incorporate information from neighboring target samples to generate more robust pseudo labels.
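A minimal sketch of the neighbor-labeling idea, with hypothetical inputs (`feats`, `probs`) and without the paper's variational treatment of label uncertainty:

```python
import numpy as np

def neighbor_soft_labels(feats, probs, k=5):
    """Soft pseudo-labels from neighboring target samples.

    A generic illustration of the summary: each target sample's
    pseudo-label aggregates the model's predictions on its k nearest
    neighbors, which is more robust than trusting the sample's own
    (possibly miscalibrated) prediction.
    """
    # Pairwise squared Euclidean distances between target features.
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)        # exclude the sample itself
    nn = np.argsort(d2, axis=1)[:, :k]  # indices of k nearest neighbors
    soft = probs[nn].mean(axis=1)       # average neighbor predictions
    return soft / soft.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 16))     # toy target features
logits = rng.normal(size=(100, 4))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(neighbor_soft_labels(feats, probs, k=5).shape)  # (100, 4)
```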
arXiv Detail & Related papers (2023-07-08T18:58:08Z)
- Generalized Universal Domain Adaptation with Generative Flow Networks [76.1350941965148]
Generalized Universal Domain Adaptation aims to achieve precise prediction of all target labels including unknown categories.
GUDA bridges the gap between label distribution shift-based and label space mismatch-based variants.
We propose an active domain adaptation algorithm named GFlowDA, which selects diverse samples with probabilities proportional to a reward function.
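The selection principle, stripped of the GFlowNet machinery (which learns to generate samples with probability proportional to reward via flows), can be illustrated with plain reward-proportional sampling; everything here is a toy assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical rewards for 20 unlabeled target samples; in GFlowDA the
# reward would come from a learned function scoring informativeness.
rewards = rng.uniform(0.1, 1.0, size=20)

# Select a labeling batch with probabilities proportional to reward.
# Sampling (rather than taking the top-k) is what yields diversity.
p = rewards / rewards.sum()
batch = rng.choice(len(rewards), size=8, replace=False, p=p)
print(sorted(batch.tolist()))
```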
arXiv Detail & Related papers (2023-05-08T05:34:15Z)
- Identifiable Latent Causal Content for Domain Adaptation under Latent Covariate Shift [82.14087963690561]
Multi-source domain adaptation (MSDA) addresses the challenge of learning a label prediction function for an unlabeled target domain.
We present an intricate causal generative model by introducing latent noises across domains, along with a latent content variable and a latent style variable.
The proposed approach showcases exceptional performance and efficacy on both simulated and real-world datasets.
arXiv Detail & Related papers (2022-08-30T11:25:15Z)
- Characterizing Generalization under Out-Of-Distribution Shifts in Deep Metric Learning [32.51394862932118]
We present the ooDML benchmark to characterize generalization under out-of-distribution shifts in DML.
ooDML is designed to probe the generalization performance on much more challenging, diverse train-to-test distribution shifts.
We find that while generalization tends to consistently degrade with difficulty, some methods are better at retaining performance as the distribution shift increases.
arXiv Detail & Related papers (2021-07-20T15:26:09Z)
- Capturing Label Distribution: A Case Study in NLI [19.869498599986006]
Post-hoc smoothing of the predicted label distribution to match the expected label entropy is very effective.
We introduce a small number of examples with multiple references into training.
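One common way to realize such post-hoc smoothing, offered here as an assumption rather than the paper's exact procedure, is to fit a single softmax temperature by bisection until the mean predicted entropy matches the expected label entropy:

```python
import numpy as np

def entropy(p, axis=-1):
    return -(p * np.log(p + 1e-12)).sum(axis=axis)

def match_entropy(logits, target_entropy, lo=0.05, hi=20.0, iters=50):
    """Fit a temperature T so mean entropy of softmax(logits/T) hits target.

    Softmax entropy increases monotonically with T, so bisection
    suffices. A generic sketch, not the paper's code.
    """
    def mean_entropy(T):
        z = logits / T
        z -= z.max(axis=1, keepdims=True)   # numerical stability
        p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
        return entropy(p).mean()

    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean_entropy(mid) < target_entropy:
            lo = mid                        # too peaked: raise T
        else:
            hi = mid
    return (lo + hi) / 2

rng = np.random.default_rng(0)
logits = rng.normal(scale=4.0, size=(1000, 3))  # overconfident toy logits
T = match_entropy(logits, target_entropy=0.8)
print(f"fitted temperature: {T:.2f}")
```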
arXiv Detail & Related papers (2021-02-13T04:14:31Z)
- Unveiling Class-Labeling Structure for Universal Domain Adaptation [12.411096265140479]
We employ a probabilistic approach for locating the common label set, where each source class may come from the common label set with a probability.
We propose a simple universal adaptation network (S-UAN) by incorporating the probabilistic structure for the common label set.
Experiments indicate that S-UAN works well in different UDA settings and outperforms the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2020-10-10T02:13:02Z)
- A Sample Selection Approach for Universal Domain Adaptation [94.80212602202518]
We study the problem of unsupervised domain adaptation in the universal scenario.
Only some of the classes are shared between the source and target domains.
We present a scoring scheme that is effective in identifying the samples of the shared classes.
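A generic stand-in for such a scoring scheme, assuming confidence and prediction entropy as the signals (the paper's actual score may differ):

```python
import numpy as np

def shared_class_score(probs):
    """Score target samples by how likely they belong to a shared class.

    A generic proxy in the spirit of the summary: confident,
    low-entropy predictions suggest the sample comes from a class the
    source classifier knows, i.e., a shared class.
    """
    conf = probs.max(axis=1)
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    ent_norm = ent / np.log(probs.shape[1])  # scale entropy to [0, 1]
    return conf - ent_norm                   # higher = more likely shared

rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 5))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(shared_class_score(probs))
```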
arXiv Detail & Related papers (2020-01-14T22:28:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.