Invariant-Feature Subspace Recovery: A New Class of Provable Domain
Generalization Algorithms
- URL: http://arxiv.org/abs/2311.00966v1
- Date: Thu, 2 Nov 2023 03:24:55 GMT
- Title: Invariant-Feature Subspace Recovery: A New Class of Provable Domain
Generalization Algorithms
- Authors: Haoxiang Wang, Gargi Balasubramaniam, Haozhe Si, Bo Li, Han Zhao
- Abstract summary: Domain generalization asks for models trained over a set of training environments to generalize well in unseen test environments.
We propose Invariant-feature Subspace Recovery (ISR): a new class of algorithms to achieve provable domain generalization across classification and regression problems.
ISR can be used as a post-processing method for feature extractors such as neural nets. Empirically, we demonstrate the superior performance of our ISRs on synthetic benchmarks.
- Score: 14.248005245508432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Domain generalization asks for models trained over a set of training
environments to generalize well in unseen test environments. Recently, a series
of algorithms such as Invariant Risk Minimization (IRM) have been proposed for
domain generalization. However, Rosenfeld et al. (2021) show that in a simple
linear data model, even if non-convexity issues are ignored, IRM and its
extensions cannot generalize to unseen environments with fewer than $d_s+1$
training environments, where $d_s$ is the dimension of the spurious-feature
subspace. In this work, we propose Invariant-feature Subspace Recovery (ISR): a
new class of algorithms to achieve provable domain generalization across the
settings of classification and regression problems. First, in the binary
classification setup of Rosenfeld et al. (2021), we show that our first
algorithm, ISR-Mean, can identify the subspace spanned by invariant features
from the first-order moments of the class-conditional distributions, and
achieve provable domain generalization with $d_s+1$ training environments. Our
second algorithm, ISR-Cov, further reduces the required number of training
environments to $O(1)$ using the information of second-order moments. Notably,
unlike IRM, our algorithms bypass non-convexity issues and enjoy global
convergence guarantees. Next, we extend ISR-Mean to the more general setting of
multi-class classification and propose ISR-Multiclass, which leverages class
information and provably recovers the invariant-feature subspace with $\lceil
d_s/k\rceil+1$ training environments for $k$-class classification. Finally, for
regression problems, we propose ISR-Regression that can identify the
invariant-feature subspace with $d_s+1$ training environments. Empirically, we
demonstrate the superior performance of our ISRs on synthetic benchmarks.
Further, ISR can be used as a post-processing method for feature extractors such
as neural nets.
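To make the first-order-moment idea behind ISR-Mean concrete, below is a minimal NumPy sketch of an ISR-Mean-style procedure under the linear data model described above. It is an illustrative reading of the abstract, not the authors' implementation: the toy data generator, the assumption that $d_s$ is known, and all names are ours.

```python
import numpy as np

def isr_mean_subspace(X_envs, y_envs, d_s):
    """Estimate the invariant-feature subspace from first-order moments.

    Stacks the per-environment differences of class-conditional means.
    After centering across environments, the shared (invariant) component
    cancels, so the remaining variation is attributed to the d_s spurious
    directions; their orthogonal complement is returned as the invariant
    subspace. (Sketch: d_s is assumed known here.)
    """
    M = np.stack([
        X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
        for X, y in zip(X_envs, y_envs)
    ])                                  # shape (E, d): one row per environment
    M_centered = M - M.mean(axis=0)     # invariant part cancels out here
    _, _, Vt = np.linalg.svd(M_centered, full_matrices=True)
    return Vt[d_s:].T                   # (d, d - d_s) basis of invariant directions

# Toy data in the spirit of the linear model: invariant means are shared
# across environments, spurious means are resampled per environment.
rng = np.random.default_rng(0)
d_inv, d_s, n = 3, 2, 5000
mu_inv = rng.normal(size=d_inv)
X_envs, y_envs = [], []
for _ in range(d_s + 1):                # d_s + 1 environments, matching the theory
    mu_spu = rng.normal(size=d_s)       # environment-specific spurious mean
    y = rng.integers(0, 2, size=n)
    sign = 2 * y[:, None] - 1           # map labels {0, 1} to {-1, +1}
    z_inv = sign * mu_inv + rng.normal(size=(n, d_inv))
    z_spu = sign * mu_spu + rng.normal(size=(n, d_s))
    X_envs.append(np.hstack([z_inv, z_spu]))
    y_envs.append(y)

P = isr_mean_subspace(X_envs, y_envs, d_s)
# Spurious coordinates of the recovered basis should be near zero
# (they shrink as the per-environment sample size grows).
print(np.abs(P[d_inv:]).max())
```

Projecting data onto the recovered basis (`X @ P`) and fitting any standard classifier on the projected features is one way to read the post-processing use of ISR mentioned above.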
Related papers
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound and reduces the memory complexity from $O(n^2)$ to $O(kn)$ with $k \ll n$.
arXiv Detail & Related papers (2023-07-26T16:19:19Z) - Coping with Change: Learning Invariant and Minimum Sufficient
Representations for Fine-Grained Visual Categorization [26.254072665916155]
Fine-grained visual categorization (FGVC) is a challenging task due to similar visual appearances between various species.
Previous studies assume that the training and test data have the same underlying distributions, and that features extracted by modern backbone architectures remain discriminative and generalize well to unseen test data.
We combine the merits of invariant risk minimization (IRM) and the information bottleneck (IB) principle to learn invariant and minimum sufficient (IMS) representations for FGVC.
arXiv Detail & Related papers (2023-06-08T02:45:15Z) - Improved Regret for Efficient Online Reinforcement Learning with Linear
Function Approximation [69.0695698566235]
We study reinforcement learning with linear function approximation and adversarially changing cost functions.
We present a computationally efficient policy optimization algorithm for the challenging general setting of unknown dynamics and bandit feedback.
arXiv Detail & Related papers (2023-01-30T17:26:39Z) - Counterfactual Supervision-based Information Bottleneck for
Out-of-Distribution Generalization [40.94431121318241]
We show that the information bottleneck-based invariant risk minimization algorithm (IB-IRM) is not sufficient for learning invariant features in linear classification problems.
We propose a Counterfactual Supervision-based Information Bottleneck (CSIB) learning algorithm that provably recovers the invariant features.
arXiv Detail & Related papers (2022-08-16T15:26:00Z) - A Relational Intervention Approach for Unsupervised Dynamics
Generalization in Model-Based Reinforcement Learning [113.75991721607174]
We introduce an interventional prediction module to estimate the probability of two estimated factors $\hat{z}_i, \hat{z}_j$ belonging to the same environment.
We empirically show that the $\hat{Z}$ estimated by our method carries less redundant information than that of previous methods.
arXiv Detail & Related papers (2022-06-09T15:01:36Z) - Federated Reinforcement Learning with Environment Heterogeneity [30.797692838836277]
We study a Federated Reinforcement Learning (FedRL) problem in which $n$ agents collaboratively learn a single policy without sharing the trajectories they collected during agent-environment interaction.
We propose two federated RL algorithms, QAvg and PAvg.
arXiv Detail & Related papers (2022-04-06T07:21:00Z) - Provable Domain Generalization via Invariant-Feature Subspace Recovery [18.25619572103648]
In this paper, we propose to achieve domain generalization with Invariant-feature Subspace Recovery (ISR).
Unlike IRM, our algorithms bypass non-convexity issues and enjoy global convergence guarantees.
In addition, on three real-world image datasets, we show that ISR-Mean can be used as a simple yet effective post-processing method.
arXiv Detail & Related papers (2022-01-30T21:22:47Z) - Iterative Feature Matching: Toward Provable Domain Generalization with
Logarithmic Environments [55.24895403089543]
Domain generalization aims at performing well on unseen test environments with data from a limited number of training environments.
We present a new algorithm based on performing iterative feature matching that is guaranteed with high probability to yield a predictor that generalizes after seeing only $O(\log d_s)$ environments.
arXiv Detail & Related papers (2021-06-18T04:39:19Z) - The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization (IRM) is an objective for learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective--as well as these recently proposed alternatives--under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution--this is precisely the issue that it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z) - Learning Invariant Representations and Risks for Semi-supervised Domain
Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)