Eccentric Regularization: Minimizing Hyperspherical Energy without
explicit projection
- URL: http://arxiv.org/abs/2104.11610v1
- Date: Fri, 23 Apr 2021 13:55:17 GMT
- Title: Eccentric Regularization: Minimizing Hyperspherical Energy without
explicit projection
- Authors: Xuefeng Li and Alan Blair
- Abstract summary: We introduce a novel regularizing loss function which simulates a pairwise repulsive force between items.
We show that minimizing this loss function in isolation achieves a hyperspherical distribution.
We apply this method of Eccentric Regularization to an autoencoder, and demonstrate its effectiveness in image generation, representation learning and downstream classification tasks.
- Score: 0.913755431537592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several regularization methods have recently been introduced which force the
latent activations of an autoencoder or deep neural network to conform to
either a Gaussian or hyperspherical distribution, or to minimize the implicit
rank of the distribution in latent space. In the present work, we introduce a
novel regularizing loss function which simulates a pairwise repulsive force
between items and an attractive force of each item toward the origin. We show
that minimizing this loss function in isolation achieves a hyperspherical
distribution. Moreover, when used as a regularizing term, the scaling factor
can be adjusted to allow greater flexibility and tolerance of eccentricity,
thus allowing the latent variables to be stratified according to their relative
importance, while still promoting diversity. We apply this method of Eccentric
Regularization to an autoencoder, and demonstrate its effectiveness in image
generation, representation learning and downstream classification tasks.
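The abstract does not spell out the functional form of the loss, but a minimal sketch with the two stated ingredients, a pairwise repulsive term between latent codes and an attractive term pulling each code toward the origin, could look as follows. The inverse-distance repulsion, the quadratic attraction, and the `scale` argument are illustrative assumptions standing in for the paper's scaling factor, not its actual definition.

```python
import torch


def eccentric_regularization_loss(z: torch.Tensor, scale: float = 1.0,
                                  eps: float = 1e-8) -> torch.Tensor:
    """Illustrative repulsion-plus-attraction penalty on a batch of latents.

    z     : (batch, dim) latent activations
    scale : weight of the attraction term (hypothetical stand-in for the
            paper's adjustable scaling factor)
    """
    # Attractive force toward the origin: penalize squared norms.
    attraction = z.pow(2).sum(dim=1).mean()

    # Pairwise repulsive force: inverse-distance energy over all
    # distinct pairs in the batch (a common hyperspherical-energy choice).
    dists = torch.cdist(z, z, p=2)
    off_diag = ~torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    repulsion = (1.0 / (dists[off_diag] + eps)).mean()

    return scale * attraction + repulsion
```

Used as a regularizing term, such a penalty would simply be added to the autoencoder's reconstruction objective, e.g. `loss = recon_loss + lam * eccentric_regularization_loss(z, scale)`, with `lam` and `scale` as hypothetical tuning knobs.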
Related papers
- Measuring Heterogeneity in Machine Learning with Distributed Energy Distance [3.8318398579197335]
We introduce energy distance as a sensitive measure for quantifying distributional discrepancies.
We develop Taylor approximations that preserve key theoretical quantitative properties while reducing computational overhead.
We propose a novel application of energy distance to assign penalty weights for aligning predictions across heterogeneous nodes.
arXiv Detail & Related papers (2025-01-27T16:15:57Z) - Disentangled Interleaving Variational Encoding [1.132458063021286]
We propose a principled approach to disentangle the original input into marginal and conditional probability distributions in the latent space of a variational autoencoder.
Our proposed model, Deep Disentangled Interleaving Variational Encoder (DeepDIVE), learns disentangled features from the original input to form clusters in the embedding space.
Experiments on two public datasets show that DeepDIVE disentangles the original input and yields forecast accuracies better than the original VAE.
arXiv Detail & Related papers (2025-01-15T10:50:54Z) - An Information-Theoretic Regularizer for Lossy Neural Image Compression [20.939331919455935]
Lossy image compression networks aim to minimize the latent entropy of images while adhering to specific distortion constraints.
We propose a novel structural regularization method for the neural image compression task by incorporating the negative conditional source entropy into the training objective.
arXiv Detail & Related papers (2024-11-23T05:19:27Z) - Latent Point Collapse on a Low Dimensional Embedding in Deep Neural Network Classifiers [0.0]
We propose a method to induce the collapse of latent representations belonging to the same class into a single point.
The proposed approach is straightforward to implement and yields substantial improvements in discriminative feature embeddings.
arXiv Detail & Related papers (2023-10-12T11:16:57Z) - Dynamic Kernel-Based Adaptive Spatial Aggregation for Learned Image
Compression [63.56922682378755]
We focus on extending spatial aggregation capability and propose a dynamic kernel-based transform coding.
The proposed adaptive aggregation generates kernel offsets to capture valid information within a content-conditioned range, aiding the transform.
Experimental results demonstrate that our method achieves superior rate-distortion performance on three benchmarks compared to the state-of-the-art learning-based methods.
arXiv Detail & Related papers (2023-08-17T01:34:51Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Distribution Mismatch Correction for Improved Robustness in Deep Neural
Networks [86.42889611784855]
Normalization methods increase a network's vulnerability to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z) - Hyperspherically Regularized Networks for BYOL Improves Feature
Uniformity and Separability [4.822598110892847]
Bootstrap Your Own Latent (BYOL) introduced an approach to self-supervised learning that avoids the contrastive paradigm.
This work empirically demonstrates that feature diversity enforced by contrastive losses is beneficial when employed in BYOL.
arXiv Detail & Related papers (2021-04-29T18:57:27Z) - Improve Generalization and Robustness of Neural Networks via Weight
Scale Shifting Invariant Regularizations [52.493315075385325]
We show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with homogeneous activation functions.
We propose an improved regularizer that is invariant to weight scale shifting and thus effectively constrains the intrinsic norm of a neural network.
arXiv Detail & Related papers (2020-08-07T02:55:28Z) - Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable
Neural Distribution Alignment [52.02794488304448]
We propose a new distribution alignment method based on a log-likelihood ratio statistic and normalizing flows.
We experimentally verify that minimizing the resulting objective results in domain alignment that preserves the local structure of input domains.
arXiv Detail & Related papers (2020-03-26T22:10:04Z) - Targeted free energy estimation via learned mappings [66.20146549150475]
Free energy perturbation (FEP) was proposed by Zwanzig more than six decades ago as a method to estimate free energy differences.
FEP suffers from a severe limitation: the requirement of sufficient overlap between distributions.
One strategy to mitigate this problem, called Targeted Free Energy Perturbation, uses a high-dimensional mapping in configuration space to increase overlap.
arXiv Detail & Related papers (2020-02-12T11:10:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.