HYPO: Hyperspherical Out-of-Distribution Generalization
- URL: http://arxiv.org/abs/2402.07785v3
- Date: Sun, 03 Nov 2024 08:00:15 GMT
- Title: HYPO: Hyperspherical Out-of-Distribution Generalization
- Authors: Haoyue Bai, Yifei Ming, Julian Katz-Samuels, Yixuan Li
- Abstract summary: We propose a novel framework that provably learns domain-invariant representations in a hyperspherical space.
In particular, our hyperspherical learning algorithm is guided by intra-class variation and inter-class separation principles.
We demonstrate that our approach outperforms competitive baselines and achieves superior performance.
- Score: 35.02297657453378
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Out-of-distribution (OOD) generalization is critical for machine learning models deployed in the real world. However, achieving this can be fundamentally challenging, as it requires the ability to learn invariant features across different domains or environments. In this paper, we propose a novel framework HYPO (HYPerspherical OOD generalization) that provably learns domain-invariant representations in a hyperspherical space. In particular, our hyperspherical learning algorithm is guided by intra-class variation and inter-class separation principles -- ensuring that features from the same class (across different training domains) are closely aligned with their class prototypes, while different class prototypes are maximally separated. We further provide theoretical justifications on how our prototypical learning objective improves the OOD generalization bound. Through extensive experiments on challenging OOD benchmarks, we demonstrate that our approach outperforms competitive baselines and achieves superior performance. Code is available at https://github.com/deeplearning-wisc/hypo.
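The intra-class alignment and inter-class separation principles above lend themselves to a compact loss. The following is a minimal NumPy sketch of a prototypical loss on the unit hypersphere; the function name, the exact separation term, and the hyperparameters (`tau`, `sep_weight`) are illustrative assumptions, not HYPO's released implementation (see the linked repository for that).

```python
import numpy as np

def hyperspherical_prototype_loss(features, labels, prototypes, tau=0.1, sep_weight=1.0):
    """Illustrative prototypical loss on the unit hypersphere.

    Alignment pulls each L2-normalized feature toward its class prototype via a
    temperature-scaled cross-entropy; separation penalizes the worst-case
    cosine similarity between distinct class prototypes.
    """
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    mu = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)

    # Alignment: cross-entropy over cosine similarities to all class prototypes.
    logits = z @ mu.T / tau
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    align = -log_probs[np.arange(len(labels)), labels].mean()

    # Separation: penalize the most similar pair of distinct prototypes.
    sim = mu @ mu.T
    np.fill_diagonal(sim, -np.inf)
    sep = sim.max(axis=1).mean()

    return align + sep_weight * sep
```

Features lying exactly on their class prototypes with mutually orthogonal prototypes drive both terms toward their minima, which matches the stated goal: same-class features closely aligned with their prototype, different prototypes maximally separated.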
Related papers
- AnyBody: A Benchmark Suite for Cross-Embodiment Manipulation [59.671764778486995]
Generalizing control policies to novel embodiments remains a fundamental challenge in enabling scalable and transferable learning in robotics. We introduce a benchmark for learning cross-embodiment manipulation, focusing on two foundational tasks, reach and push, across a diverse range of morphologies. We evaluate the ability of different RL policies to learn from multiple morphologies and to generalize to novel ones.
arXiv Detail & Related papers (2025-05-21T00:21:38Z)
- Causal Inference via Style Bias Deconfounding for Domain Generalization [28.866189619091227]
We introduce Style Deconfounding Causal Learning, a novel causal inference-based framework designed to explicitly address style as a confounding factor.
Our approach begins by constructing a structural causal model (SCM) tailored to the domain generalization problem and applies a backdoor adjustment strategy to account for style influence.
Building on this foundation, we design a style-guided expert module (SGEM) that adaptively clusters style distributions during training, capturing the global confounding style.
A back-door causal learning module (BDCL) performs causal interventions during feature extraction, ensuring fair integration of global confounding styles into sample predictions and effectively reducing style bias.
arXiv Detail & Related papers (2025-03-21T04:52:31Z)
- Raising the Bar in Graph OOD Generalization: Invariant Learning Beyond Explicit Environment Modeling [58.15601237755505]
Real-world graph data often exhibit diverse and shifting environments that traditional models fail to generalize across.
We propose a novel method termed Multi-Prototype Hyperspherical Invariant Learning (MPHIL)
MPHIL achieves state-of-the-art performance, significantly outperforming existing methods across graph data from various domains and with different distribution shifts.
arXiv Detail & Related papers (2025-02-15T07:40:14Z)
- Generalizable Sensor-Based Activity Recognition via Categorical Concept Invariant Learning [5.920971285288677]
Human Activity Recognition (HAR) aims to recognize activities by training models on massive sensor data.
One crucial aspect of HAR that has been largely overlooked is that the test sets may have different distributions from training sets.
We propose a Categorical Concept Invariant Learning framework for generalizable activity recognition.
arXiv Detail & Related papers (2024-12-18T08:18:03Z)
- ODRL: A Benchmark for Off-Dynamics Reinforcement Learning [59.72217833812439]
We introduce ODRL, the first benchmark tailored for evaluating off-dynamics RL methods.
ODRL contains four experimental settings where the source and target domains can be either online or offline.
We conduct extensive benchmarking experiments, which show that no method has universal advantages across varied dynamics shifts.
arXiv Detail & Related papers (2024-10-28T05:29:38Z)
- CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection [42.33618249731874]
We show that minimizing the magnitude of energy scores on training data leads to domain-consistent Hessians of classification loss.
We have developed a unified fine-tuning framework that allows for concurrent optimization of both tasks.
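The summary above hinges on the energy score of a sample's logits. Below is a brief NumPy sketch of the standard free-energy score and of how a penalty on its magnitude could be combined with cross-entropy; the combination weight `lam` and the function names are assumptions for illustration, not the paper's exact objective.

```python
import numpy as np

def energy_score(logits, T=1.0):
    # Free-energy score: E(x) = -T * logsumexp(logits / T).
    # Computed with the max-subtraction trick for numerical stability.
    m = logits.max(axis=1, keepdims=True)
    return -T * (np.log(np.exp((logits - m) / T).sum(axis=1)) + m.squeeze(1) / T)

def energy_regularized_objective(logits, labels, lam=0.1):
    """Illustrative combined objective: cross-entropy plus a penalty on the
    magnitude of energy scores on training data (a hypothetical weighting,
    not CRoFT's exact formulation)."""
    m = logits.max(axis=1, keepdims=True)
    log_probs = logits - m - np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels].mean()
    energy_pen = np.abs(energy_score(logits)).mean()
    return ce + lam * energy_pen
```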
arXiv Detail & Related papers (2024-05-26T03:28:59Z)
- Self-Supervised Learning for Covariance Estimation [3.04585143845864]
We propose to globally learn a neural network that will then be applied locally at inference time.
The architecture is based on the popular attention mechanism.
It can be pre-trained as a foundation model and then be repurposed for various downstream tasks, e.g., adaptive target detection in radar or hyperspectral imagery.
arXiv Detail & Related papers (2024-03-13T16:16:20Z)
- Choosing Wisely and Learning Deeply: Selective Cross-Modality Distillation via CLIP for Domain Generalization [12.311957227670598]
Domain Generalization (DG) seeks to train models across multiple domains and test them on unseen ones.
We introduce a novel approach, namely, Selective Cross-Modality Distillation for Domain Generalization (SCMD)
SCMD leverages the capabilities of large vision-language models, specifically CLIP, to train a more efficient model.
We assess SCMD's performance on various benchmarks, where it empowers a ResNet50 to deliver state-of-the-art performance.
arXiv Detail & Related papers (2023-11-26T00:06:12Z)
- CLIPood: Generalizing CLIP to Out-of-Distributions [73.86353105017076]
Contrastive language-image pre-training (CLIP) models have shown impressive zero-shot ability, but the further adaptation of CLIP on downstream tasks undesirably degrades OOD performances.
We propose CLIPood, a fine-tuning method that can adapt CLIP models to OOD situations where both domain shifts and open classes may occur on unseen test data.
Experiments on diverse datasets with different OOD scenarios show that CLIPood consistently outperforms existing generalization techniques.
arXiv Detail & Related papers (2023-02-02T04:27:54Z)
- The Open-World Lottery Ticket Hypothesis for OOD Intent Classification [68.93357975024773]
We shed light on the fundamental cause of model overconfidence on OOD.
We also extend the Lottery Ticket Hypothesis to open-world scenarios.
arXiv Detail & Related papers (2022-10-13T14:58:35Z)
- Style Interleaved Learning for Generalizable Person Re-identification [69.03539634477637]
We propose a novel style interleaved learning (IL) framework for DG ReID training.
Unlike conventional learning strategies, IL incorporates two forward propagations and one backward propagation for each iteration.
We show that our model consistently outperforms state-of-the-art methods on large-scale benchmarks for DG ReID.
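The "two forward propagations and one backward propagation per iteration" structure can be sketched in miniature. The toy below treats each sample's mean as its "style" and swaps styles across the batch before the second forward pass; the linear model, MSE loss, and style definition are stand-ins chosen for brevity, not the DG ReID architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def interleave_styles(X, perm):
    # Toy "style" = each sample's mean; swap means across the batch so content
    # is paired with another sample's style (an AdaIN-like stand-in).
    mu = X.mean(axis=1, keepdims=True)
    return (X - mu) + mu[perm]

def interleaved_step(w, X, y, lr=0.05):
    """One iteration with two forward passes (original and style-interleaved
    batches) whose gradients are accumulated before a single update,
    mirroring the interleaved-learning loop structure described above."""
    perm = rng.permutation(len(X))
    grad = np.zeros_like(w)
    for Xb in (X, interleave_styles(X, perm)):    # two forward propagations
        grad += 2 * Xb.T @ (Xb @ w - y) / len(y)  # analytic MSE gradient
    return w - lr * grad                          # one combined update
```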
arXiv Detail & Related papers (2022-07-07T07:41:32Z)
- Dual Path Structural Contrastive Embeddings for Learning Novel Objects [6.979491536753043]
Recent research shows that gaining information on a good feature space can be an effective solution to achieve favorable performance on few-shot tasks.
We propose a simple but effective paradigm that decouples the tasks of learning feature representations and classifiers.
Our method can still achieve promising results for both standard and generalized few-shot problems in either an inductive or transductive inference setting.
arXiv Detail & Related papers (2021-12-23T04:43:31Z)
- Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning [137.39196753245105]
We present a new model-based reinforcement learning algorithm that learns a multi-headed dynamics model for dynamics generalization.
We incorporate context learning, which encodes dynamics-specific information from past experiences into the context latent vector.
Our method exhibits superior zero-shot generalization performance across a variety of control tasks, compared to state-of-the-art RL methods.
arXiv Detail & Related papers (2020-10-26T03:20:42Z)
- Learning to Learn Single Domain Generalization [18.72451358284104]
We propose a new method named adversarial domain augmentation to solve this Out-of-Distribution (OOD) generalization problem.
The key idea is to leverage adversarial training to create "fictitious" yet "challenging" populations.
To facilitate fast and desirable domain augmentation, we cast the model training in a meta-learning scheme and use a Wasserstein Auto-Encoder (WAE) to relax the widely used worst-case constraint.
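The core "fictitious yet challenging" idea can be illustrated with a one-step adversarial perturbation: move each input in the direction that increases the loss. The FGSM-style sketch below uses a logistic model with an analytic input gradient; it is a simplified stand-in and omits the paper's meta-learning scheme and WAE-based relaxation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adversarial_augment(X, y, w, b, eps=0.1):
    """Create 'fictitious yet challenging' samples for a logistic model by
    perturbing inputs in the loss-increasing direction (FGSM-style sketch)."""
    p = sigmoid(X @ w + b)
    # Gradient of binary cross-entropy w.r.t. the inputs: (p - y) * w per sample.
    grad_X = np.outer(p - y, w)
    return X + eps * np.sign(grad_X)
```

By construction, the perturbed batch incurs a higher loss than the original one, which is exactly what makes it a harder "population" for subsequent training.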
arXiv Detail & Related papers (2020-03-30T04:39:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including its completeness or accuracy) and is not responsible for any consequences of its use.