A unified view for unsupervised representation learning with density
ratio estimation: Maximization of mutual information, nonlinear ICA and
nonlinear subspace estimation
- URL: http://arxiv.org/abs/2101.02083v1
- Date: Wed, 6 Jan 2021 15:08:54 GMT
- Title: A unified view for unsupervised representation learning with density
ratio estimation: Maximization of mutual information, nonlinear ICA and
nonlinear subspace estimation
- Authors: Hiroaki Sasaki and Takashi Takenouchi
- Abstract summary: This paper emphasizes that density ratio estimation is a promising goal for unsupervised representation learning.
We show that density ratio estimation unifies three frameworks for unsupervised representation learning: Maximization of mutual information (MI), nonlinear independent component analysis (ICA) and a novel framework for estimation of a lower-dimensional nonlinear subspace.
- Score: 3.35805121090065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised representation learning is one of the most important problems in
machine learning. Recent promising methods are based on contrastive learning.
However, contrastive learning often relies on heuristic ideas, and therefore it
is not easy to understand what contrastive learning is doing. This paper
emphasizes that density ratio estimation is a promising goal for unsupervised
representation learning, and promotes an understanding of contrastive learning.
Our primary contribution is to theoretically show that density ratio estimation
unifies three frameworks for unsupervised representation learning: Maximization
of mutual information (MI), nonlinear independent component analysis (ICA) and
a novel framework for estimation of a lower-dimensional nonlinear subspace
proposed in this paper. This unified view clarifies under what conditions
contrastive learning can be regarded as maximizing MI, performing nonlinear ICA
or estimating the lower-dimensional nonlinear subspace in the proposed
framework. Furthermore, we also make theoretical contributions in each of the
three frameworks: We show that MI can be maximized through density ratio
estimation under certain conditions, while our analysis of nonlinear ICA
reveals a novel insight into the recovery of the latent source components, which is
clearly supported by numerical experiments. In addition, some theoretical
conditions are also established to estimate a nonlinear subspace in the
proposed framework. Based on the unified view, we propose two practical methods
for unsupervised representation learning through density ratio estimation: The
first method is an outlier-robust method for representation learning, while the
second one is a sample-efficient nonlinear ICA method. Finally, we numerically
demonstrate the usefulness of the proposed methods on nonlinear ICA and on a
downstream classification task.
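To make the density-ratio view concrete, here is a minimal sketch (not the authors' implementation; the architecture and names below are illustrative assumptions) of the density-ratio trick underlying contrastive objectives: a classifier is trained to separate pairs drawn from the joint density p(x, y) from pairs drawn from the product of marginals p(x)p(y), and its logit then approximates log p(x, y) / (p(x)p(y)), whose expectation under the joint is the mutual information.

```python
import torch
import torch.nn as nn

class RatioNet(nn.Module):
    """Logit f(x, y) that approximates log p(x, y) / (p(x) p(y)) at optimum."""
    def __init__(self, dim_x, dim_y, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_x + dim_y, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=-1)).squeeze(-1)

def density_ratio_step(model, opt, x, y):
    # Positive pairs (x_i, y_i) come from the joint; negatives (x_i, y_j)
    # with shuffled indices mimic the product of marginals.
    y_neg = y[torch.randperm(len(y))]
    logits = torch.cat([model(x, y), model(x, y_neg)])
    labels = torch.cat([torch.ones(len(x)), torch.zeros(len(x))])
    loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

def mi_estimate(model, x, y):
    # E_{p(x,y)}[log r(x, y)] equals I(X; Y); the trained logit stands in for log r.
    return model(x, y).mean().item()
```

With balanced positive/negative sampling, the Bayes-optimal logit equals the log density ratio, which is the standard reduction that connects contrastive classification to MI estimation.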
Related papers
- Interpretable Kernel Representation Learning at Scale: A Unified Framework Utilizing Nyström Approximation [6.209111342262837]
We introduce KREPES -- a unified framework for kernel-based representation learning via Nyström approximation. KREPES accommodates a wide range of unsupervised and self-supervised losses. It enables principled interpretability of the learned representations, an immediate benefit over deep models.
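KREPES itself is only summarized here; as a hedged illustration of the generic Nyström approximation it builds on (the function names, RBF kernel, and landmark scheme below are assumptions, not the paper's API), explicit features can be formed from a small set of landmark points so that their inner products approximate the full kernel matrix:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel k(a, b) = exp(-gamma * ||a - b||^2).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystroem_features(X, n_landmarks=50, gamma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    Z = X[rng.choice(len(X), size=n_landmarks, replace=False)]  # landmarks
    W = rbf_kernel(Z, Z, gamma)   # (m, m) block of the kernel matrix
    C = rbf_kernel(X, Z, gamma)   # (n, m) cross-kernel
    # Feature map Phi = C W^{-1/2}, so Phi @ Phi.T approximates the full K.
    vals, vecs = np.linalg.eigh(W)
    vals = np.clip(vals, 1e-12, None)
    return C @ (vecs * vals ** -0.5) @ vecs.T
```

With m landmarks this costs roughly O(nm^2) rather than the O(n^2) of the exact kernel matrix, which is what makes kernel representation learning scale.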
arXiv Detail & Related papers (2025-09-29T08:45:40Z)
- Orthogonal Representation Learning for Estimating Causal Quantities [25.068617118126824]
We propose a class of Neyman-orthogonal learners for causal quantities defined at the representation level, which we call OR-learners.
Our OR-learners have several practical advantages: they allow for consistent estimation of causal quantities based on any learned representation, while offering favorable theoretical properties including double robustness and quasi-oracle efficiency.
arXiv Detail & Related papers (2025-02-06T18:18:48Z)
- Learning Latent Graph Structures and their Uncertainty [63.95971478893842]
Graph Neural Networks (GNNs) use relational information as an inductive bias to enhance the model's accuracy.
As task-relevant relations might be unknown, graph structure learning approaches have been proposed to learn them while solving the downstream prediction task.
arXiv Detail & Related papers (2024-05-30T10:49:22Z)
- Towards Unsupervised Representation Learning: Learning, Evaluating and Transferring Visual Representations [1.8130068086063336]
We contribute to the field of unsupervised (visual) representation learning from three perspectives.
We design unsupervised, backpropagation-free Convolutional Self-Organizing Neural Networks (CSNNs).
We build upon the widely used (non-)linear evaluation protocol to define pretext- and target-objective-independent metrics.
We contribute CARLANE, the first 3-way sim-to-real domain adaptation benchmark for 2D lane detection, and a method based on self-supervised learning.
arXiv Detail & Related papers (2023-11-30T15:57:55Z)
- Understanding Self-Predictive Learning for Reinforcement Learning [61.62067048348786]
We study the learning dynamics of self-predictive learning for reinforcement learning.
We propose a novel self-predictive algorithm that learns two representations simultaneously.
arXiv Detail & Related papers (2022-12-06T20:43:37Z)
- Making Linear MDPs Practical via Contrastive Representation Learning [101.75885788118131]
It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.
We consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning.
We demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks.
arXiv Detail & Related papers (2022-07-14T18:18:02Z)
- Integrating Contrastive Learning with Dynamic Models for Reinforcement Learning from Images [31.413588478694496]
We argue that explicitly improving Markovianity of the learned embedding is desirable.
We propose a self-supervised representation learning method which integrates contrastive learning with dynamic models.
arXiv Detail & Related papers (2022-03-02T14:39:17Z)
- Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations [1.0609815608017066]
We consider large-scale Markov decision processes with an unknown cost function and address the problem of learning a policy from a finite set of expert demonstrations.
Existing inverse reinforcement learning methods come with strong theoretical guarantees, but are computationally expensive.
We introduce a novel bilinear saddle-point framework using Lagrangian duality to bridge the gap between theory and practice.
arXiv Detail & Related papers (2021-12-28T05:47:24Z)
- A Low Rank Promoting Prior for Unsupervised Contrastive Learning [108.91406719395417]
We construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning.
Our hypothesis explicitly requires that all the samples belonging to the same instance class lie in the same low-dimensional subspace.
Empirical evidence shows that the proposed algorithm clearly surpasses the state-of-the-art approaches on multiple benchmarks.
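As a sketch of that hypothesis only (not the paper's probabilistic graphical model; the function below is a generic stand-in), a nuclear-norm penalty on the matrix of embeddings of augmented views of one instance is small precisely when those embeddings lie near a common low-dimensional subspace:

```python
import torch

def low_rank_penalty(view_features):
    # view_features: (n_views, d) embeddings of augmentations of one instance.
    # The nuclear norm (sum of singular values) is a convex surrogate for rank.
    return torch.linalg.svdvals(view_features).sum()
```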
arXiv Detail & Related papers (2021-08-05T15:58:25Z)
- Which Mutual-Information Representation Learning Objectives are Sufficient for Control? [80.2534918595143]
Mutual information provides an appealing formalism for learning representations of data.
This paper formalizes the sufficiency of a state representation for learning and representing the optimal policy.
Surprisingly, we find that two of these objectives can yield insufficient representations given mild and common assumptions on the structure of the MDP.
arXiv Detail & Related papers (2021-06-14T10:12:34Z)
- Robust Unsupervised Learning via L-Statistic Minimization [38.49191945141759]
We present a general approach to this problem focusing on unsupervised learning.
The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models.
We prove uniform convergence bounds with respect to the proposed criterion for several popular models in unsupervised learning.
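For intuition about the criterion (a generic sketch under the stated assumption that perturbed samples incur larger losses; not the paper's estimator), an L-statistic is a weighted combination of the order statistics of the per-sample losses, so zeroing the weights on the largest losses discards the samples a perturbing distribution is assumed to produce:

```python
import numpy as np

def trimmed_l_statistic_loss(per_sample_losses, keep_frac=0.9):
    # Average only the smallest keep_frac fraction of losses: an L-statistic
    # with zero weight on the largest order statistics, where perturbed
    # (outlier) samples are assumed to land.
    losses = np.sort(np.asarray(per_sample_losses))
    k = max(1, int(len(losses) * keep_frac))
    return losses[:k].mean()
```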
arXiv Detail & Related papers (2020-12-14T10:36:06Z)
- Deep Dimension Reduction for Supervised Representation Learning [51.10448064423656]
We propose a deep dimension reduction approach to learning representations with essential characteristics.
The proposed approach is a nonparametric generalization of the sufficient dimension reduction method.
We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero.
arXiv Detail & Related papers (2020-06-10T14:47:43Z)