Learning Invariant Representations using Inverse Contrastive Loss
- URL: http://arxiv.org/abs/2102.08343v1
- Date: Tue, 16 Feb 2021 18:29:28 GMT
- Title: Learning Invariant Representations using Inverse Contrastive Loss
- Authors: Aditya Kumar Akash, Vishnu Suresh Lokhande, Sathya N. Ravi, Vikas
Singh
- Abstract summary: We introduce a class of losses for learning representations that are invariant to some extraneous variable of interest.
We show that if the extraneous variable is binary, then optimizing ICL is equivalent to optimizing a regularized MMD divergence.
- Score: 34.93395633215398
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning invariant representations is a critical first step in a number of
machine learning tasks. A common approach corresponds to the so-called
information bottleneck principle in which an application dependent function of
mutual information is carefully chosen and optimized. Unfortunately, in
practice, these functions are not suitable for optimization purposes since
these losses are agnostic of the metric structure of the parameters of the
model. We introduce a class of losses for learning representations that are
invariant to some extraneous variable of interest by inverting the class of
contrastive losses, i.e., inverse contrastive loss (ICL). We show that if the
extraneous variable is binary, then optimizing ICL is equivalent to optimizing
a regularized MMD divergence. More generally, we also show that if we are
provided a metric on the sample space, our formulation of ICL can be decomposed
into a sum of convex functions of the given distance metric. Our experimental
results indicate that models obtained by optimizing ICL achieve significantly
better invariance to the extraneous variable for a fixed desired level of
accuracy. In a variety of experimental settings, we show applicability of ICL
for learning invariant representations for both continuous and discrete
extraneous variables.
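The abstract's binary-variable case has a concrete reading: the representations of the two groups induced by the extraneous variable should be pushed toward the same distribution, with an MMD term measuring their discrepancy. Below is a minimal PyTorch sketch of such a regularized MMD-style invariance penalty; the RBF kernel, its bandwidth, the weight `lam`, and the function names are illustrative assumptions rather than the paper's exact ICL formulation.

```python
# Hedged sketch of a regularized MMD-style invariance penalty for a binary
# extraneous variable, in the spirit of the abstract's claim that optimizing
# ICL in the binary case is equivalent to optimizing a regularized MMD
# divergence. Kernel choice, bandwidth, and weighting are assumptions.
import torch

def rbf_kernel(x, y, bandwidth=1.0):
    # Gaussian (RBF) kernel matrix between two batches of representations.
    sq_dists = torch.cdist(x, y, p=2.0) ** 2
    return torch.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd_invariance_penalty(z, c, bandwidth=1.0):
    # Biased estimate of the squared MMD between the representations of the
    # two groups defined by the binary extraneous variable c (values 0 and 1).
    z0, z1 = z[c == 0], z[c == 1]
    k00 = rbf_kernel(z0, z0, bandwidth).mean()
    k11 = rbf_kernel(z1, z1, bandwidth).mean()
    k01 = rbf_kernel(z0, z1, bandwidth).mean()
    return k00 + k11 - 2.0 * k01

def regularized_objective(task_loss, z, c, lam=1.0):
    # Task loss plus a weighted invariance penalty on the representations;
    # lam trades off accuracy against invariance to the extraneous variable.
    return task_loss + lam * mmd_invariance_penalty(z, c)
```

In use, `z` would be the encoder's output on a mini-batch, `c` the corresponding binary extraneous labels, and `lam` would control the invariance-versus-accuracy trade-off described in the abstract.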
Related papers
- A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO).
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth probability distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z) - Disentanglement with Factor Quantized Variational Autoencoders [11.086500036180222]
We propose a discrete variational autoencoder (VAE) based model where the ground truth information about the generative factors is not provided to the model.
We demonstrate the advantages of learning discrete representations over learning continuous representations in facilitating disentanglement.
Our method, called FactorQVAE, is the first to combine optimization-based disentanglement approaches with discrete representation learning.
arXiv Detail & Related papers (2024-09-23T09:33:53Z) - Quasi-parametric rates for Sparse Multivariate Functional Principal
Components Analysis [0.0]
We show that the eigenelements can be expressed as the solution to an optimization problem.
We establish a minimax lower bound on the mean square reconstruction error of the eigenelement, which proves that the procedure has an optimal variance in the minimax sense.
arXiv Detail & Related papers (2022-12-19T13:17:57Z) - Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral
Mapping for Single-channel Speech Enhancement [20.823177372464414]
Most speech enhancement (SE) models learn a point estimate, and do not make use of uncertainty estimation in the learning process.
We show that modeling heteroscedastic uncertainty by minimizing a multivariate Gaussian negative log-likelihood (NLL) improves SE performance at no extra cost.
arXiv Detail & Related papers (2022-11-16T02:29:05Z) - PAC Generalization via Invariant Representations [41.02828564338047]
We consider the notion of $\epsilon$-approximate invariance in a finite sample setting.
Inspired by PAC learning, we obtain finite-sample out-of-distribution generalization guarantees.
Our results show bounds that do not scale with the ambient dimension when intervention sites are restricted to lie in a constant-size subset of in-degree-bounded nodes.
arXiv Detail & Related papers (2022-05-30T15:50:14Z) - Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z) - Adaptive neighborhood Metric learning [184.95321334661898]
We propose a novel distance metric learning algorithm, named adaptive neighborhood metric learning (ANML)
ANML can be used to learn both the linear and deep embeddings.
The log-exp mean function proposed in our method gives a new perspective for reviewing deep metric learning methods.
arXiv Detail & Related papers (2022-01-20T17:26:37Z) - Efficient Semi-Implicit Variational Inference [65.07058307271329]
We propose an efficient and scalable semi-implicit variational inference (SIVI) method.
Our method optimizes a rigorous lower bound on SIVI's evidence.
arXiv Detail & Related papers (2021-01-15T11:39:09Z) - Understanding Implicit Regularization in Over-Parameterized Single Index
Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - Convex Representation Learning for Generalized Invariance in
Semi-Inner-Product Space [32.442549424823355]
In this work we develop an algorithm for learning representations under a variety of generalized invariances in semi-inner-product spaces, where a representer theorem holds and generalization bounds are established.
This allows invariant representations to be learned efficiently and effectively, as confirmed in our experiments along with accurate predictions.
arXiv Detail & Related papers (2020-04-25T18:54:37Z) - Plannable Approximations to MDP Homomorphisms: Equivariance under
Actions [72.30921397899684]
We introduce a contrastive loss function that enforces action equivariance on the learned representations (a generic sketch of such a loss follows after this list).
We prove that when our loss is zero, we have a homomorphism of a deterministic Markov Decision Process.
We show experimentally that for deterministic MDPs, the optimal policy in the abstract MDP can be successfully lifted to the original MDP.
arXiv Detail & Related papers (2020-02-27T08:29:10Z)
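The last entry above introduces a contrastive loss that enforces action equivariance on learned representations; a generic sketch of a contrastive transition-consistency loss in that spirit is given below. The `encoder`, the transition model `trans`, the hinge margin, and the permutation-based negative sampling are illustrative assumptions, not that paper's exact objective.

```python
# Hedged sketch of a contrastive transition-consistency loss encouraging
# action equivariance of the latent space. All names and the negative-sampling
# scheme are illustrative assumptions.
import torch
import torch.nn.functional as F

def action_equivariance_loss(encoder, trans, s, a, s_next, margin=1.0):
    # Pull the predicted next latent trans(encoder(s), a) toward encoder(s_next)
    # (positive pairs) and push it away from shuffled next states (negatives).
    z, z_next = encoder(s), encoder(s_next)
    z_pred = trans(z, a)
    pos = ((z_pred - z_next) ** 2).sum(dim=1)
    neg_idx = torch.randperm(z_next.size(0))
    neg = ((z_pred - z_next[neg_idx]) ** 2).sum(dim=1)
    return (pos + F.relu(margin - neg)).mean()  # hinge only on negatives
```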