Kernelized Heterogeneous Risk Minimization
- URL: http://arxiv.org/abs/2110.12425v1
- Date: Sun, 24 Oct 2021 12:26:50 GMT
- Title: Kernelized Heterogeneous Risk Minimization
- Authors: Jiashuo Liu, Zheyuan Hu, Peng Cui, Bo Li, Zheyan Shen
- Abstract summary: We propose a Kernelized Heterogeneous Risk Minimization (KerHRM) algorithm, which achieves both latent heterogeneity exploration and invariant learning in kernel space.
We theoretically justify our algorithm and empirically validate the effectiveness of our algorithm with extensive experiments.
- Score: 25.5458915855661
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to generalize under distributional shifts is essential to
reliable machine learning, while models optimized with empirical risk
minimization usually fail on non-i.i.d. testing data. Recently, invariant
learning methods for out-of-distribution (OOD) generalization have been
proposed to find causally invariant relationships using multiple training
environments. However, modern datasets are frequently multi-sourced without
explicit source labels, rendering many invariant learning methods
inapplicable. In this paper, we propose the Kernelized Heterogeneous Risk
Minimization (KerHRM) algorithm, which achieves both latent heterogeneity
exploration and invariant learning in kernel space, and then gives feedback to
the original neural network by appointing an invariant gradient direction. We
theoretically justify our algorithm and empirically validate its effectiveness
with extensive experiments.
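The abstract describes an alternation between latent heterogeneity exploration and invariant learning carried out in kernel space, with an invariant gradient direction fed back to the network. The sketch below is only a rough illustration of that general recipe, not the authors' implementation: random Fourier features stand in for the kernel feature map, k-means on crude per-sample residual features stands in for heterogeneity exploration, and a variance-of-risks penalty stands in for the invariance objective. All function names and design choices are hypothetical assumptions.

```python
# Illustrative sketch only (assumptions noted above); not the KerHRM code.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def rff_features(X, n_feats=128, gamma=1.0, seed=0):
    """Random Fourier features approximating an RBF kernel feature map."""
    r = np.random.default_rng(seed)
    W = r.normal(scale=np.sqrt(2 * gamma), size=(X.shape[1], n_feats))
    b = r.uniform(0, 2 * np.pi, size=n_feats)
    return np.sqrt(2.0 / n_feats) * np.cos(X @ W + b)

def env_risks(w, Phi, y, env_ids, n_envs):
    """Logistic loss and gradient computed separately for each inferred environment."""
    risks, grads = [], []
    for e in range(n_envs):
        m = env_ids == e
        if not np.any(m):
            continue
        p = 1.0 / (1.0 + np.exp(-(Phi[m] @ w)))
        risks.append(-np.mean(y[m] * np.log(p + 1e-9) + (1 - y[m]) * np.log(1 - p + 1e-9)))
        grads.append(Phi[m].T @ (p - y[m]) / m.sum())
    return np.array(risks), np.array(grads)

def heterogeneous_invariant_fit(X, y, n_envs=2, rounds=3, steps=200, lr=0.5, lam=1.0):
    Phi = rff_features(X)
    w = np.zeros(Phi.shape[1])
    env_ids = np.zeros(len(y), dtype=int)  # overwritten in the first round
    for _ in range(rounds):
        # (i) heterogeneity exploration: cluster crude per-sample residual features
        p = 1.0 / (1.0 + np.exp(-(Phi @ w)))
        env_ids = KMeans(n_clusters=n_envs, n_init=10, random_state=0).fit_predict(
            (y - p)[:, None] * Phi)
        # (ii) invariant learning: mean risk plus variance-of-risks penalty
        for _ in range(steps):
            risks, grads = env_risks(w, Phi, y, env_ids, n_envs)
            mean_grad = grads.mean(axis=0)
            var_grad = 2 * ((risks - risks.mean())[:, None] * (grads - mean_grad)).mean(axis=0)
            w -= lr * (mean_grad + lam * var_grad)
    return w, env_ids

# Toy usage on synthetic data.
X = rng.normal(size=(400, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=400) > 0).astype(float)
w, envs = heterogeneous_invariant_fit(X, y)
```

In the paper, the feedback step additionally appoints an invariant gradient direction for the original neural network; that step is omitted here for brevity.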
Related papers
- Learning Invariant Molecular Representation in Latent Discrete Space [52.13724532622099]
We propose a new framework for learning molecular representations that exhibit invariance and robustness against distribution shifts.
Our model achieves stronger generalization than state-of-the-art baselines in the presence of various distribution shifts.
arXiv Detail & Related papers (2023-10-22T04:06:44Z)
- Robust Distributed Learning: Tight Error Bounds and Breakdown Point under Data Heterogeneity [11.2120847961379]
We consider in this paper a more realistic heterogeneity model, namely (G,B)-gradient dissimilarity, and show that it covers a larger class of learning problems than existing theory.
We also prove a new lower bound on the learning error of any distributed learning algorithm.
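For context, the (G,B)-gradient dissimilarity condition mentioned above is commonly stated in the distributed-optimization literature in roughly the following form; the notation here is generic and not taken from that paper.

```latex
% Generic statement of (G,B)-gradient dissimilarity across n workers, where
% \mathcal{L}_i is worker i's local loss and \mathcal{L} is their average.
\[
  \frac{1}{n}\sum_{i=1}^{n}
  \bigl\lVert \nabla \mathcal{L}_i(\theta) - \nabla \mathcal{L}(\theta) \bigr\rVert^{2}
  \;\le\; G^{2} \;+\; B^{2}\,\bigl\lVert \nabla \mathcal{L}(\theta) \bigr\rVert^{2}
  \qquad \text{for all } \theta .
\]
```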
arXiv Detail & Related papers (2023-09-24T09:29:28Z)
- Environment Diversification with Multi-head Neural Network for Invariant Learning [7.255121332331688]
This work proposes EDNIL, an invariant learning framework containing a multi-head neural network to absorb data biases.
We show that this framework does not require prior knowledge about environments or strong assumptions about the pre-trained model.
We demonstrate that models trained with EDNIL are empirically more robust against distributional shifts.
arXiv Detail & Related papers (2023-08-17T04:33:38Z) - Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient
Methods [73.35353358543507]
Stochastic Gradient Descent-Ascent (SGDA) is one of the most prominent algorithms for solving min-max optimization and variational inequality problems (VIPs).
In this paper, we propose a unified convergence analysis that covers a large variety of descent-ascent methods.
We develop several new variants of SGDA such as a new variance-reduced method (L-SVRGDA), new distributed methods with compression (QSGDA, DIANA-SGDA, VR-DIANA-SGDA), and a new method with coordinate randomization (SEGA-SGDA).
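SGDA alternates a (stochastic) descent step on the minimization variable with an ascent step on the maximization variable. The toy snippet below shows only this generic template on a simple convex-concave saddle problem; it is not one of the specific variants (L-SVRGDA, QSGDA, etc.) proposed in that paper.

```python
# Generic stochastic gradient descent-ascent on the toy saddle problem
# min_x max_y f(x, y) = x*y + 0.5*x^2 - 0.5*y^2, whose solution is (0, 0).
import numpy as np

rng = np.random.default_rng(0)
x, y = 3.0, -2.0
lr = 0.05
for _ in range(2000):
    gx = y + x + 0.1 * rng.normal()   # noisy df/dx
    gy = x - y + 0.1 * rng.normal()   # noisy df/dy
    x -= lr * gx                      # descent step on the min variable
    y += lr * gy                      # ascent step on the max variable
print(round(x, 3), round(y, 3))       # hovers near the saddle point (0, 0)
```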
arXiv Detail & Related papers (2022-02-15T09:17:39Z) - Improving the Sample-Complexity of Deep Classification Networks with
Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
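Invariant integration, in its classical form, builds transformation-invariant descriptors by averaging monomials of the input over a finite transformation group. The snippet below is a generic illustration of that group-averaging idea over discrete image rotations; the pixel indices, exponents, and angle grid are arbitrary assumptions, and this is not the paper's monomial-selection procedure.

```python
# Group-average a monomial of selected pixel values over discrete rotations,
# producing a descriptor that is (approximately) invariant to those rotations.
import numpy as np
from scipy.ndimage import rotate

def invariant_monomial(img, pixel_idx, exponents, angles=range(0, 360, 30)):
    """Average a monomial feature over a finite rotation group."""
    vals = []
    for a in angles:
        r = rotate(img, angle=a, reshape=False, order=1)
        flat = r.reshape(-1)
        vals.append(np.prod(flat[pixel_idx] ** exponents))
    return float(np.mean(vals))

img = np.random.default_rng(0).random((28, 28))
feat = invariant_monomial(img, pixel_idx=np.array([100, 200]),
                          exponents=np.array([1.0, 2.0]))
```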
arXiv Detail & Related papers (2022-02-08T16:16:11Z) - OoD-Bench: Benchmarking and Understanding Out-of-Distribution
Generalization Datasets and Algorithms [28.37021464780398]
We show that existing OoD algorithms that outperform empirical risk minimization on one distribution shift usually have limitations on the other distribution shift.
The new benchmark may serve as a strong foothold that can be resorted to by future OoD generalization research.
arXiv Detail & Related papers (2021-06-07T15:34:36Z) - Heterogeneous Risk Minimization [25.5458915855661]
Invariant learning methods for out-of-distribution generalization have been proposed by leveraging multiple training environments to find invariant relationships.
Modern datasets are assembled by merging data from multiple sources without explicit source labels.
We propose the Heterogeneous Risk Minimization (HRM) framework to achieve joint learning of the latent heterogeneity among the data and the invariant relationship.
arXiv Detail & Related papers (2021-05-09T02:51:36Z) - The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization (IRM) is an objective based on the idea of learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective, as well as these recently proposed alternatives, under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution; this is precisely the issue it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z) - Learning Invariant Representations and Risks for Semi-supervised Domain
Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z) - Invariant Risk Minimization Games [48.00018458720443]
In this work, we pose such invariant risk minimization as finding the Nash equilibrium of an ensemble game among several environments.
By doing so, we develop a simple training algorithm that uses best response dynamics and, in our experiments, yields similar or better empirical accuracy with much lower variance than the challenging bi-level optimization problem of Arjovsky et al.
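For intuition, the snippet below sketches the ensemble-game idea with gradient-based (approximate) best-response updates: each environment owns one linear classifier, predictions come from the ensemble average, and environments take turns improving their own risk while the others are held fixed. The data generator, loss, and step counts are illustrative assumptions, not that paper's setup.

```python
# Toy ensemble game over environments with approximate best-response updates.
import numpy as np

rng = np.random.default_rng(0)

def make_env(n, corr):
    """Invariant feature x1 predicts y; x2 is spuriously correlated with y."""
    y = rng.integers(0, 2, n).astype(float)
    x1 = y + 0.5 * rng.normal(size=n)
    flip = rng.random(n) < corr
    x2 = np.where(flip, y, 1 - y) + 0.5 * rng.normal(size=n)
    return np.stack([x1, x2], axis=1), y

envs = [make_env(500, 0.9), make_env(500, 0.7)]
W = [np.zeros(2) for _ in envs]            # one linear classifier per environment
lr = 0.1

def risk_grad(w_sum, X, y):
    """Gradient of env-specific logistic risk w.r.t. one player's classifier."""
    p = 1.0 / (1.0 + np.exp(-(X @ w_sum) / len(envs)))
    return X.T @ (p - y) / (len(envs) * len(y))

for _ in range(200):
    for e, (X, y) in enumerate(envs):      # players update in turn
        W[e] -= lr * risk_grad(sum(W), X, y)

w_ens = sum(W) / len(envs)                 # ensemble classifier shared by all players
print(np.round(w_ens, 2))
```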
arXiv Detail & Related papers (2020-02-11T21:25:14Z)