RICASSO: Reinforced Imbalance Learning with Class-Aware Self-Supervised Outliers Exposure
- URL: http://arxiv.org/abs/2410.10548v1
- Date: Mon, 14 Oct 2024 14:29:32 GMT
- Title: RICASSO: Reinforced Imbalance Learning with Class-Aware Self-Supervised Outliers Exposure
- Authors: Xuan Zhang, Sin Chee Chin, Tingxuan Gao, Wenming Yang
- Abstract summary: Deep learning models often face challenges from both imbalanced (long-tailed) and out-of-distribution (OOD) data.
Our research shows that data mixing can generate pseudo-OOD data that exhibit the features of both in-distribution (ID) data and OOD data.
We propose a unified framework called Reinforced Imbalance Learning with Class-Aware Self-Supervised Outliers Exposure (RICASSO).
- Score: 21.809270017579806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In real-world scenarios, deep learning models often face challenges from both imbalanced (long-tailed) and out-of-distribution (OOD) data. However, existing joint methods rely on real OOD data, which leads to unnecessary trade-offs. In contrast, our research shows that data mixing, a potent augmentation technique for long-tailed recognition, can generate pseudo-OOD data that exhibit the features of both in-distribution (ID) data and OOD data. Therefore, by using mixed data instead of real OOD data, we can address long-tailed recognition and OOD detection holistically. We propose a unified framework called Reinforced Imbalance Learning with Class-Aware Self-Supervised Outliers Exposure (RICASSO), where "self-supervised" denotes that we only use ID data for outlier exposure. RICASSO includes three main strategies:
  - Norm-Odd-Duality-Based Outlier Exposure: uses mixed data as pseudo-OOD data, enabling simultaneous ID data rebalancing and outlier exposure through a single loss function.
  - Ambiguity-Aware Logits Adjustment: utilizes the ambiguity of ID data to adaptively recalibrate logits.
  - Contrastive Boundary-Center Learning: combines Virtual Boundary Learning and Dual-Entropy Center Learning to use mixed data for better feature separation and clustering, with Representation Consistency Learning for robustness.
  Extensive experiments demonstrate that RICASSO achieves state-of-the-art performance in long-tailed recognition and significantly improves OOD detection compared to our baseline (27% improvement in AUROC and 61% reduction in FPR on the iNaturalist2018 dataset). On iNaturalist2018, RICASSO even outperforms methods that use real OOD data. The code will be made public soon.
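To make the central claim concrete, here is a minimal PyTorch-style sketch of the idea: mixing ID samples to synthesize pseudo-OOD data, then training with an ID cross-entropy term plus an outlier-exposure term that pushes the mixed samples toward a uniform posterior. The helper names (`make_pseudo_ood`, `id_plus_oe_loss`, `oe_weight`) are hypothetical, and the uniformity penalty is a standard outlier-exposure stand-in rather than the paper's Norm-Odd-Duality loss, which fuses rebalancing and outlier exposure into a single objective.

```python
import torch
import torch.nn.functional as F

def make_pseudo_ood(x: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Mixup-style blend of each ID image with a randomly permuted batch mate.

    The blended images lie between ID classes, so they carry features of
    both ID and OOD data (hypothetical stand-in for the paper's mixing).
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    return lam * x + (1.0 - lam) * x[perm]

def id_plus_oe_loss(model, x_id, y_id, oe_weight: float = 0.5):
    """Cross-entropy on ID data plus a uniformity penalty on mixed pseudo-OOD data."""
    ce = F.cross_entropy(model(x_id), y_id)
    logits_mix = model(make_pseudo_ood(x_id))
    # Outlier exposure: cross-entropy against the uniform distribution,
    # i.e. the mean negative log-softmax over all classes.
    oe = -F.log_softmax(logits_mix, dim=1).mean()
    return ce + oe_weight * oe
```

The additive combination above is only the simplest way to join the two objectives; the abstract states that RICASSO handles both through one loss function.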
Related papers
- Out-of-Distribution Learning with Human Feedback [26.398598663165636]
This paper presents a novel framework for OOD learning with human feedback.
Our framework capitalizes on the freely available unlabeled data in the wild.
By exploiting human feedback, we enhance the robustness and reliability of machine learning models.
arXiv Detail & Related papers (2024-08-14T18:49:27Z)
- OAML: Outlier Aware Metric Learning for OOD Detection Enhancement [5.357756138014614]
Out-of-distribution (OOD) detection methods have been developed to identify objects that a model has not seen during training.
Outlier Exposure (OE) methods use auxiliary datasets to train OOD detectors directly.
We propose the Outlier Aware Metric Learning (OAML) framework to tackle the collection and learning of representative OOD samples.
arXiv Detail & Related papers (2024-06-24T11:01:43Z)
- How Does Unlabeled Data Provably Help Out-of-Distribution Detection? [63.41681272937562]
Leveraging unlabeled in-the-wild data is non-trivial due to the heterogeneity of both in-distribution (ID) and out-of-distribution (OOD) data.
This paper introduces a new learning framework SAL (Separate And Learn) that offers both strong theoretical guarantees and empirical effectiveness.
arXiv Detail & Related papers (2024-02-05T20:36:33Z)
- EAT: Towards Long-Tailed Out-of-Distribution Detection [55.380390767978554]
This paper addresses the challenging task of long-tailed OOD detection.
The main difficulty lies in distinguishing OOD data from samples belonging to the tail classes.
We propose two simple ideas: (1) Expanding the in-distribution class space by introducing multiple abstention classes, and (2) Augmenting the context-limited tail classes by overlaying images onto the context-rich OOD data.
arXiv Detail & Related papers (2023-12-14T13:47:13Z)
- Out-of-distribution Detection Learning with Unreliable Out-of-distribution Sources [73.28967478098107]
Out-of-distribution (OOD) detection discerns OOD data, on which the predictor cannot make valid predictions as it does on in-distribution (ID) data.
It is typically hard to collect real out-of-distribution (OOD) data for training a predictor capable of discerning OOD patterns.
We propose a data generation-based learning method named Auxiliary Task-based OOD Learning (ATOL) that can relieve the mistaken OOD generation.
arXiv Detail & Related papers (2023-11-06T16:26:52Z)
- Out-of-distribution Detection with Implicit Outlier Transformation [72.73711947366377]
Outlier exposure (OE) is powerful in out-of-distribution (OOD) detection.
We propose a novel OE-based approach that makes the model perform well for unseen OOD situations.
arXiv Detail & Related papers (2023-03-09T04:36:38Z)
- Uncertainty-Estimation with Normalized Logits for Out-of-Distribution Detection [35.539218522504605]
Uncertainty-Estimation with Normalized Logits (UE-NL) is a robust learning method for OOD detection.
UE-NL treats every ID sample equally by predicting the uncertainty score of input data.
It is more robust to noisy ID data that may be misjudged as OOD data by other methods.
arXiv Detail & Related papers (2023-02-15T11:57:09Z)
- Calibrated ensembles can mitigate accuracy tradeoffs under distribution shift [108.30303219703845]
We find that ID-calibrated ensembles outperform the prior state of the art (based on self-training) on both ID and OOD accuracy.
We analyze this method in stylized settings and identify two important conditions for ensembles to perform well both in-distribution and out-of-distribution.
arXiv Detail & Related papers (2022-07-18T23:14:44Z)
- Training OOD Detectors in their Natural Habitats [31.565635192716712]
Out-of-distribution (OOD) detection is important for machine learning models deployed in the wild.
Recent methods use auxiliary outlier data to regularize the model for improved OOD detection.
We propose a novel framework that leverages wild mixture data, which naturally consists of both ID and OOD samples.
arXiv Detail & Related papers (2022-02-07T15:38:39Z)
- They are Not Completely Useless: Towards Recycling Transferable Unlabeled Data for Class-Mismatched Semi-Supervised Learning [61.46572463531167]
Semi-Supervised Learning (SSL) with mismatched classes deals with the problem that the classes of interest in the limited labeled data are only a subset of the classes in the massive unlabeled data.
This paper proposes a "Transferable OOD data Recycling" (TOOR) method to enrich the information for conducting class-mismatched SSL.
arXiv Detail & Related papers (2020-11-27T02:29:35Z)