HOME: High-Order Mixed-Moment-based Embedding for Representation
Learning
- URL: http://arxiv.org/abs/2207.07743v1
- Date: Fri, 15 Jul 2022 20:34:49 GMT
- Title: HOME: High-Order Mixed-Moment-based Embedding for Representation
Learning
- Authors: Chuang Niu and Ge Wang
- Abstract summary: We propose the High-Order Mixed-Moment-based Embedding (HOME) strategy to reduce redundancy among any set of feature variables.
Our initial experiments show that a simple version in the form of a third-order HOME scheme already significantly outperforms the current second-order baseline method.
- Score: 6.693379403133435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Minimum redundancy among different elements of an embedding in a latent space
is a fundamental requirement, or at least a major preference, in representation
learning to capture intrinsic informational structures. Current self-supervised
learning methods minimize the off-diagonal elements of a pairwise covariance
(cross-correlation) matrix to reduce feature redundancy and produce promising
results. However, such representations may still contain redundancy among more
than two feature variables, which cannot be removed via pairwise regularization.
Here we propose the High-Order Mixed-Moment-based Embedding (HOME) strategy to
reduce the redundancy among any set of feature variables, which is, to the best
of our knowledge, the first attempt to utilize high-order statistics/information
in this context. Multivariate mutual information is minimized if and only if the
variables are mutually independent, and mutual independence in turn implies that
mixed moments factorize; factorized mixed moments are therefore a necessary
condition for independence. Based on these statistical and information-theoretic
principles, our general HOME framework is presented for self-supervised
representation learning. Our initial experiments show that a simple version in
the form of a third-order HOME scheme already significantly outperforms the
current second-order baseline method (i.e., Barlow Twins) in terms of linear
evaluation on representation features.
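To make the idea concrete, below is a minimal PyTorch sketch of what a third-order mixed-moment penalty could look like alongside the second-order (Barlow Twins style) terms. The function and parameter names (home_third_order_loss, lambda3) are hypothetical, the embeddings are assumed to be standardized per dimension over the batch, and the exact HOME formulation in the paper may differ.

```python
import torch

def off_diagonal_sq_sum(c: torch.Tensor) -> torch.Tensor:
    """Sum of squared off-diagonal entries of a square matrix."""
    return (c - torch.diag(torch.diagonal(c))).pow(2).sum()

def home_third_order_loss(z_a: torch.Tensor, z_b: torch.Tensor,
                          lambda3: float = 1.0) -> torch.Tensor:
    """Hypothetical third-order mixed-moment redundancy penalty.

    z_a, z_b: (N, D) embeddings of two augmented views of the same batch,
    assumed standardized per dimension (zero mean, unit variance over N).
    The second-order part mirrors the Barlow Twins objective; the
    third-order part penalizes mixed moments E[z_i z_j z_k] over distinct
    index triples, which vanish under mutual independence.
    """
    n, d = z_a.shape

    # Second-order cross-correlation matrix C_ij = E[z_a[:, i] * z_b[:, j]].
    c2 = (z_a.T @ z_b) / n
    invariance = (torch.diagonal(c2) - 1).pow(2).sum()  # diagonal -> 1
    redundancy2 = off_diagonal_sq_sum(c2)               # off-diagonal -> 0

    # Third-order mixed-moment tensor M_ijk = E[z_a[:, i] z_a[:, j] z_b[:, k]].
    # Note: this dense O(D^3) tensor is only feasible for small D; a real
    # implementation would subsample index triples instead.
    m3 = torch.einsum('ni,nj,nk->ijk', z_a, z_a, z_b) / n
    idx = torch.arange(d)
    distinct = torch.ones(d, d, d, dtype=torch.bool)
    distinct[idx, idx, :] = False  # drop triples with i == j
    distinct[idx, :, idx] = False  # drop triples with i == k
    distinct[:, idx, idx] = False  # drop triples with j == k
    redundancy3 = m3[distinct].pow(2).sum()

    return invariance + redundancy2 + lambda3 * redundancy3
```

For a small embedding dimension, e.g. D = 8, calling home_third_order_loss(z_a, z_b) on two standardized views returns a scalar that can be backpropagated as usual; setting lambda3 = 0 recovers a Barlow Twins style second-order objective.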
Related papers
- Differentiable Information Bottleneck for Deterministic Multi-view Clustering [9.723389925212567]
We propose a new differentiable information bottleneck (DIB) method, which provides a deterministic and analytical MVC solution.
Specifically, we first propose to directly fit the mutual information of high-dimensional spaces by leveraging a normalized kernel Gram matrix (a sketch of such an estimator is given after this entry).
Then, based on the new mutual information measurement, a deterministic multi-view neural network with analytical gradients is explicitly trained to parameterize the IB principle.
arXiv Detail & Related papers (2024-03-23T02:13:22Z)
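The DIB paper's exact estimator may differ, but the phrase "normalized kernel Gram matrix" points to the standard matrix-based Renyi entropy functional. The sketch below, with illustrative names, shows that construction: entropies come from the eigenvalue spectrum of a trace-normalized Gram matrix, and mutual information combines two marginal entropies with a joint entropy from the Hadamard product.

```python
import numpy as np

def rbf_gram(X: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """RBF kernel Gram matrix for row-wise samples X of shape (n, d)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def matrix_renyi_entropy(K: np.ndarray, alpha: float = 2.0) -> float:
    """Matrix-based Renyi alpha-entropy of a kernel Gram matrix.

    K is normalized to unit trace and the entropy is computed from its
    eigenvalue spectrum (Sanchez Giraldo et al., 2015).
    """
    A = K / np.trace(K)
    lam = np.linalg.eigvalsh(A)
    lam = lam[lam > 1e-12]  # drop numerically zero eigenvalues
    return float(np.log2(np.sum(lam ** alpha)) / (1.0 - alpha))

def gram_mutual_information(X: np.ndarray, Y: np.ndarray,
                            sigma: float = 1.0, alpha: float = 2.0) -> float:
    """I(X; Y) ~= S(A) + S(B) - S(A o B), with o the Hadamard product."""
    A, B = rbf_gram(X, sigma), rbf_gram(Y, sigma)
    return (matrix_renyi_entropy(A, alpha) + matrix_renyi_entropy(B, alpha)
            - matrix_renyi_entropy(A * B, alpha))
```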
- Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time across different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
- Deep Diversity-Enhanced Feature Representation of Hyperspectral Images [87.47202258194719]
We rectify 3D convolution by modifying its topology to enhance the rank upper-bound.
We also propose a novel diversity-aware regularization (DA-Reg) term that acts on the feature maps to maximize independence among elements.
To demonstrate the superiority of the proposed ReS$^3$-ConvSet and DA-Reg, we apply them to various HS image processing and analysis tasks.
arXiv Detail & Related papers (2023-01-15T16:19:18Z)
- Learning an Invertible Output Mapping Can Mitigate Simplicity Bias in Neural Networks [66.76034024335833]
We investigate why diverse/complex features are indeed learned by the backbone, and find that the network's brittleness is due to the linear classification head relying primarily on the simplest features.
We propose a Feature Reconstruction Regularizer (FRR) to ensure that the learned features can be reconstructed back from the logits.
We demonstrate up to 15% gains in OOD accuracy on the recently introduced semi-synthetic datasets with extreme distribution shifts.
arXiv Detail & Related papers (2022-10-04T04:01:15Z)
- Tensor-based Multi-view Spectral Clustering via Shared Latent Space [14.470859959783995]
Multi-view Spectral Clustering (MvSC) attracts increasing attention due to diverse data sources.
A new method for MvSC is proposed via a shared latent space derived from the Restricted Kernel Machine framework.
arXiv Detail & Related papers (2022-07-23T17:30:54Z)
- Supervised Multivariate Learning with Simultaneous Feature Auto-grouping and Dimension Reduction [7.093830786026851]
This paper proposes a novel clustered reduced-rank learning framework.
It imposes two joint matrix regularizations to automatically group the features in constructing predictive factors.
It is more interpretable than low-rank modeling and relaxes the stringent sparsity assumption in variable selection.
arXiv Detail & Related papers (2021-12-17T20:11:20Z)
- Multi-view Orthonormalized Partial Least Squares: Regularizations and Deep Extensions [8.846165479467324]
We establish a family of subspace-based learning methods for multi-view learning using least squares as the fundamental basis.
We propose a unified multi-view learning framework to learn a classifier over a common latent space shared by all views.
arXiv Detail & Related papers (2020-07-09T19:00:39Z)
- An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z)
- Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction [32.21975128854042]
We propose the principle of Maximal Coding Rate Reduction ($\text{MCR}^2$), an information-theoretic measure that maximizes the coding rate difference between the whole dataset and the sum over individual classes (a sketch of this objective is given after this entry).
We clarify its relationships with most existing frameworks such as cross-entropy, information bottleneck, information gain, contractive and contrastive learning, and provide theoretical guarantees for learning diverse and discriminative features.
arXiv Detail & Related papers (2020-06-15T17:23:55Z)
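Below is a compact NumPy sketch of the coding rate reduction objective as described in the MCR$^2$ paper: the rate of the whole dataset minus the class-proportion-weighted sum of per-class rates. Function names and the natural-log convention are illustrative; precision eps and the column-as-sample layout follow the paper's formulation.

```python
import numpy as np

def coding_rate(Z: np.ndarray, eps: float = 0.5) -> float:
    """R(Z, eps) = 1/2 logdet(I + d/(n eps^2) Z Z^T); columns of Z are samples."""
    d, n = Z.shape
    return 0.5 * np.linalg.slogdet(np.eye(d) + (d / (n * eps ** 2)) * Z @ Z.T)[1]

def mcr2_objective(Z: np.ndarray, labels: np.ndarray, eps: float = 0.5) -> float:
    """Delta R: rate of the whole dataset minus the class-weighted sum of
    per-class rates; maximizing it expands the whole representation while
    compressing each class."""
    d, n = Z.shape
    rate_all = coding_rate(Z, eps)
    rate_classes = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]           # columns belonging to class c
        rate_classes += (Zc.shape[1] / n) * coding_rate(Zc, eps)
    return rate_all - rate_classes
```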
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
- Multi-Objective Matrix Normalization for Fine-grained Visual Recognition [153.49014114484424]
Bilinear pooling achieves great success in fine-grained visual recognition (FGVC).
Recent methods have shown that the matrix power normalization can stabilize the second-order information in bilinear features.
We propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can simultaneously normalize a bilinear representation with respect to multiple objectives.
arXiv Detail & Related papers (2020-03-30T08:40:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.