Related papers: Optimizing Curvature Learning for Robust Hyperbolic Deep Learning in Computer Vision

URL: http://arxiv.org/abs/2405.13979v1
Date: Wed, 22 May 2024 20:30:14 GMT
Title: Optimizing Curvature Learning for Robust Hyperbolic Deep Learning in Computer Vision
Authors: Ahmad Bdeir, Niels Landwehr
Abstract summary: We introduce an improved schema for popular learning algorithms and a novel normalization approach to constrain embeddings within the variable representative radius of the manifold. Our approach demonstrates consistent performance improvements across both direct classification and hierarchical metric learning tasks while allowing for larger hyperbolic models.
Score: 3.3964154468907486
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Hyperbolic deep learning has become a growing research direction in computer vision for the unique properties afforded by the alternate embedding space. The negative curvature and exponentially growing distance metric provide a natural framework for capturing hierarchical relationships between datapoints and allowing for finer separability between their embeddings. However, these methods are still computationally expensive and prone to instability, especially when attempting to learn the negative curvature that best suits the task and the data. Current Riemannian optimizers do not account for changes in the manifold which greatly harms performance and forces lower learning rates to minimize projection errors. Our paper focuses on curvature learning by introducing an improved schema for popular learning algorithms and providing a novel normalization approach to constrain embeddings within the variable representative radius of the manifold. Additionally, we introduce a novel formulation for Riemannian AdamW, and alternative hybrid encoder techniques and foundational formulations for current convolutional hyperbolic operations, greatly reducing the computational penalty of the hyperbolic embedding space. Our approach demonstrates consistent performance improvements across both direct classification and hierarchical metric learning tasks while allowing for larger hyperbolic models.
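Illustrative sketch (not taken from the paper): the abstract refers to keeping embeddings inside a representative radius that changes as the curvature is learned. The snippet below shows that idea in its simplest form, assuming a Poincaré ball of curvature -c for illustration only; the abstract does not state which hyperbolic model or normalization the authors use. The names `ball_radius`, `clip_to_ball`, `poincare_distance`, and the margin `eps` are hypothetical.

```python
import math

def ball_radius(c):
    """Euclidean radius of the Poincare ball with curvature -c (c > 0)."""
    return 1.0 / math.sqrt(c)

def clip_to_ball(x, c, eps=1e-3):
    """Rescale x so its Euclidean norm is at most (1 - eps) / sqrt(c).

    Assumption: a plain norm clip stands in for the paper's normalization,
    which the abstract does not specify.
    """
    max_norm = (1.0 - eps) * ball_radius(c)
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= max_norm:
        return list(x)
    scale = max_norm / norm
    return [v * scale for v in x]

def poincare_distance(x, y, c):
    """Geodesic distance between x and y on the Poincare ball of curvature -c."""
    def sq(v):
        return sum(a * a for a in v)
    diff = sq([a - b for a, b in zip(x, y)])
    denom = (1.0 - c * sq(x)) * (1.0 - c * sq(y))
    return math.acosh(1.0 + 2.0 * c * diff / denom) / math.sqrt(c)

# A point that is valid for c = 1 lies outside the ball once c grows to 4,
# so it has to be re-clipped after the curvature update.
x = [0.6, 0.0]
x_after_update = clip_to_ball(x, c=4.0)
```

In this toy setting the admissible radius shrinks from 1 to 0.5 when c goes from 1 to 4, which is one way to see why an optimizer that ignores curvature changes can leave points outside the manifold.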

Related papers

Fast State-Augmented Learning for Wireless Resource Allocation with Dual Variable Regression [83.27791109672927]
We show how a state-augmented graph neural network (GNN) parametrization for the resource allocation policy circumvents the drawbacks of the ubiquitous dual subgradient methods. Lagrangian maximizing state-augmented policies are learned during the offline training phase. We prove a convergence result and an exponential probability bound on the excursions of the dual function (iterate) optimality gaps.
arXiv Detail & Related papers (2025-06-23T15:20:58Z)
Hyperbolic Dual Feature Augmentation for Open-Environment [41.23999800250096]
We propose a hyperbolic dual feature augmentation method for open-environment, which augments features for both seen and unseen classes in the hyperbolic space. Our method effectively enhances the performance of hyperbolic algorithms in open-environment.
arXiv Detail & Related papers (2025-06-10T15:34:09Z)
HAM: A Hyperbolic Step to Regulate Implicit Bias [14.701241300621648]
We show that HAM (Hyperbolic Minimization) alternates between an overhead step and a new hyperbolic mirror step. HAM's implicit bias consistently boosts performance, even of dense training. HAM is especially effective in combination with different sparsification methods, improving upon the state of the art.
arXiv Detail & Related papers (2025-06-03T08:47:16Z)
Machine Unlearning in Hyperbolic vs. Euclidean Multimodal Contrastive Learning: Adapting Alignment Calibration to MERU [50.9588132578029]
This paper investigates machine unlearning in hyperbolic contrastive learning. We adapt Alignment Calibration to MERU, a model that embeds images and text in hyperbolic space to better capture semantic hierarchies. Our approach introduces hyperbolic-specific components including entailment calibration and norm regularization that leverage the unique properties of hyperbolic space.
arXiv Detail & Related papers (2025-03-19T12:47:37Z)
Gradient-Variation Online Learning under Generalized Smoothness [56.38427425920781]
Gradient-variation online learning aims to achieve regret guarantees that scale with variations in gradients of online functions. Recent efforts in neural network optimization suggest a generalized smoothness condition, allowing smoothness to correlate with gradient norms. We provide the applications for fast-rate convergence in games and extended adversarial optimization.
arXiv Detail & Related papers (2024-08-17T02:22:08Z)
Understanding Hyperbolic Metric Learning through Hard Negative Sampling [13.478667527129726]
We investigate the effects of integrating hyperbolic space into metric learning, particularly when training with contrastive loss. We benchmark the results of Vision Transformers (ViTs) using a hybrid objective function that combines loss from Euclidean and hyperbolic spaces. We also reveal that hyperbolic metric learning is highly related to hard negative sampling, providing insights for future work.
arXiv Detail & Related papers (2024-04-23T21:11:30Z)
Mitigating Over-Smoothing and Over-Squashing using Augmentations of Forman-Ricci Curvature [1.1126342180866644]
We propose a rewiring technique based on Augmented Forman-Ricci curvature (AFRC), a scalable curvature notation. We prove that AFRC effectively characterizes over-smoothing and over-squashing effects in message-passing GNNs.
arXiv Detail & Related papers (2023-09-17T21:43:18Z)
Nonparametric Linear Feature Learning in Regression Through Regularisation [0.0]
We propose a novel method for joint linear feature learning and non-parametric function estimation. By using alternative minimisation, we iteratively rotate the data to improve alignment with leading directions. We establish that the expected risk of our method converges to the minimal risk under minimal assumptions and with explicit rates.
arXiv Detail & Related papers (2023-07-24T12:52:55Z)
Accelerated Linearized Laplace Approximation for Bayesian Deep Learning [34.81292720605279]
We develop a Nystrom approximation to neural tangent kernels (NTKs) to accelerate LLA. Our method benefits from the capability of popular deep learning libraries for forward mode automatic differentiation. Our method can even scale up to architectures like vision transformers.
arXiv Detail & Related papers (2022-10-23T07:49:03Z)
Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks. We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights. Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
Towards Scalable Hyperbolic Neural Networks using Taylor Series Approximations [10.056167107654089]
Hyperbolic networks have shown prominent improvements over their Euclidean counterparts in several areas involving hierarchical datasets. Their adoption in practice remains restricted due to (i) non-scalability on accelerated deep learning hardware, (ii) vanishing due to the closure of hyperbolic space, and (iii) information loss. We propose the approximation of hyperbolic operators using Taylor series expansions, which allows us to reformulate the tangent gradients of hyperbolic functions into their equivariants.
arXiv Detail & Related papers (2022-06-07T22:31:17Z)
Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes. We propose a metric that quantifies the ability of a graph to mix the current gradients. Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z)
Hyperbolic Vision Transformers: Combining Improvements in Metric Learning [116.13290702262248]
We propose a new hyperbolic-based model for metric learning. At the core of our method is a vision transformer with output embeddings mapped to hyperbolic space. We evaluate the proposed model with six different formulations on four datasets.
arXiv Detail & Related papers (2022-03-21T09:48:23Z)
Enhancing Hyperbolic Graph Embeddings via Contrastive Learning [7.901082408569372]
We propose a novel Hyperbolic Graph Contrastive Learning (HGCL) framework which learns node representations through multiple hyperbolic spaces. Experimental results on multiple real-world datasets demonstrate the superiority of the proposed HGCL.
arXiv Detail & Related papers (2022-01-21T06:10:05Z)
Adaptive Learning Rate and Momentum for Training Deep Neural Networks [0.0]
We develop a fast training method motivated by the nonlinear Conjugate Gradient (CG) framework. Experiments in image classification datasets show that our method yields faster convergence than other local solvers.
arXiv Detail & Related papers (2021-06-22T05:06:56Z)
Cogradient Descent for Dependable Learning [64.02052988844301]
We propose a dependable learning method based on the Cogradient Descent (CoGD) algorithm to address the bilinear optimization problem. CoGD is introduced to solve bilinear problems when one variable is subject to a sparsity constraint. It can also be used to decompose the association of features and weights, which further generalizes our method to better train convolutional neural networks (CNNs).
arXiv Detail & Related papers (2021-06-20T04:28:20Z)
Level-Set Curvature Neural Networks: A Hybrid Approach [0.0]
We present a hybrid strategy based on deep learning to compute mean curvature in the level-set method. The proposed inference system combines a dictionary of improved regression models with standard numerical schemes to estimate curvature more accurately. Our findings confirm that machine learning is a promising venue for devising viable solutions to the level-set method's numerical shortcomings.
arXiv Detail & Related papers (2021-04-07T06:51:52Z)
Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space. Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem. We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent. Our algorithm is applied to solve problems with one variable under the sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z)
Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model. This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs). The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.