Optimizing Curvature Learning for Robust Hyperbolic Deep Learning in Computer Vision
- URL: http://arxiv.org/abs/2405.13979v1
- Date: Wed, 22 May 2024 20:30:14 GMT
- Title: Optimizing Curvature Learning for Robust Hyperbolic Deep Learning in Computer Vision
- Authors: Ahmad Bdeir, Niels Landwehr
- Abstract summary: We introduce an improved schema for popular learning algorithms and a novel normalization approach to constrain embeddings within the variable representative radius of the manifold.
Our approach demonstrates consistent performance improvements across both direct classification and hierarchical metric learning tasks while allowing for larger hyperbolic models.
- Score: 3.3964154468907486
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hyperbolic deep learning has become a growing research direction in computer vision for the unique properties afforded by the alternate embedding space. The negative curvature and exponentially growing distance metric provide a natural framework for capturing hierarchical relationships between datapoints and allowing for finer separability between their embeddings. However, these methods are still computationally expensive and prone to instability, especially when attempting to learn the negative curvature that best suits the task and the data. Current Riemannian optimizers do not account for changes in the manifold, which greatly harms performance and forces lower learning rates to minimize projection errors. Our paper focuses on curvature learning by introducing an improved schema for popular learning algorithms and providing a novel normalization approach to constrain embeddings within the variable representative radius of the manifold. Additionally, we introduce a novel formulation for Riemannian AdamW, as well as alternative hybrid encoder techniques and foundational formulations for current convolutional hyperbolic operations, greatly reducing the computational penalty of the hyperbolic embedding space. Our approach demonstrates consistent performance improvements across both direct classification and hierarchical metric learning tasks while allowing for larger hyperbolic models.
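The abstract's point about a "variable representative radius" can be illustrated with a small sketch. This is not the paper's normalization scheme. It assumes the Poincaré ball model with curvature -c, where valid points have Euclidean norm below 1/sqrt(c), and shows why embeddings may need re-projection whenever a learned curvature changes. The function names `ball_radius` and `project_to_ball` and the margin `eps` are hypothetical choices made for illustration.

```python
import math


def ball_radius(c: float) -> float:
    """Radius of the Poincare ball with curvature -c (assumes c > 0)."""
    if c <= 0:
        raise ValueError("c must be positive")
    return 1.0 / math.sqrt(c)


def project_to_ball(x, c, eps=1e-3):
    """Rescale x so its Euclidean norm stays strictly inside the ball.

    eps is an assumed safety margin that keeps points off the boundary,
    where hyperbolic distances blow up numerically.
    """
    max_norm = (1.0 - eps) * ball_radius(c)
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= max_norm:
        return list(x)  # already a valid point, leave unchanged
    scale = max_norm / norm
    return [v * scale for v in x]


# A point that is valid for c = 1 (radius 1) falls outside the ball
# once the curvature parameter grows to c = 4 (radius 0.5).
x = [0.6, 0.6]
x_same = project_to_ball(x, c=1.0)
x_clipped = project_to_ball(x, c=4.0)
```

In this sketch the same embedding is left untouched for c = 1 but is rescaled for c = 4, which is the kind of inconsistency a curvature-aware training scheme has to handle after each curvature update.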
Related papers
- Nonparametric Linear Feature Learning in Regression Through Regularisation [0.0]
We propose a novel method for joint linear feature learning and non-parametric function estimation.
By using alternative minimisation, we iteratively rotate the data to improve alignment with leading directions.
We establish that the expected risk of our method converges to the minimal risk under minimal assumptions and with explicit rates.
arXiv Detail & Related papers (2023-07-24T12:52:55Z)
- Accelerated Linearized Laplace Approximation for Bayesian Deep Learning [34.81292720605279]
We develop a Nyström approximation to neural tangent kernels (NTKs) to accelerate LLA.
Our method benefits from the capability of popular deep learning libraries for forward mode automatic differentiation.
Our method can even scale up to architectures like vision transformers.
arXiv Detail & Related papers (2022-10-23T07:49:03Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- Towards Scalable Hyperbolic Neural Networks using Taylor Series Approximations [10.056167107654089]
Hyperbolic networks have shown prominent improvements over their Euclidean counterparts in several areas involving hierarchical datasets.
Their adoption in practice remains restricted due to (i) non-scalability on accelerated deep learning hardware, (ii) vanishing gradients due to the closure of hyperbolic space, and (iii) information loss.
We propose the approximation of hyperbolic operators using Taylor series expansions, which allows us to reformulate the tangent gradients of hyperbolic functions into their equivariants.
arXiv Detail & Related papers (2022-06-07T22:31:17Z)
- Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z)
- Hyperbolic Vision Transformers: Combining Improvements in Metric Learning [116.13290702262248]
We propose a new hyperbolic-based model for metric learning.
At the core of our method is a vision transformer with output embeddings mapped to hyperbolic space.
We evaluate the proposed model with six different formulations on four datasets.
arXiv Detail & Related papers (2022-03-21T09:48:23Z)
- Adaptive Learning Rate and Momentum for Training Deep Neural Networks [0.0]
We develop a fast training method motivated by the nonlinear Conjugate Gradient (CG) framework.
Experiments in image classification datasets show that our method yields faster convergence than other local solvers.
arXiv Detail & Related papers (2021-06-22T05:06:56Z)
- Cogradient Descent for Dependable Learning [64.02052988844301]
We propose a dependable learning method based on the Cogradient Descent (CoGD) algorithm to address the bilinear optimization problem.
CoGD is introduced to solve bilinear problems in which one variable is subject to a sparsity constraint.
It can also be used to decompose the association of features and weights, which further generalizes our method to better train convolutional neural networks (CNNs).
arXiv Detail & Related papers (2021-06-20T04:28:20Z)
- Level-Set Curvature Neural Networks: A Hybrid Approach [0.0]
We present a hybrid strategy based on deep learning to compute mean curvature in the level-set method.
The proposed inference system combines a dictionary of improved regression models with standard numerical schemes to estimate curvature more accurately.
Our findings confirm that machine learning is a promising avenue for devising viable solutions to the level-set method's numerical shortcomings.
arXiv Detail & Related papers (2021-04-07T06:51:52Z)
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)