Free Hyperbolic Neural Networks with Limited Radii
- URL: http://arxiv.org/abs/2107.11472v1
- Date: Fri, 23 Jul 2021 22:10:16 GMT
- Title: Free Hyperbolic Neural Networks with Limited Radii
- Authors: Yunhui Guo and Xudong Wang and Yubei Chen and Stella X. Yu
- Abstract summary: Hyperbolic Neural Networks (HNNs) that operate directly in hyperbolic space have been proposed recently to further exploit the potential of hyperbolic representations.
While HNNs have achieved better performance than Euclidean neural networks (ENNs) on datasets with implicit hierarchical structure, they still perform poorly on standard classification benchmarks such as CIFAR and ImageNet.
In this paper, we first conduct an empirical study showing that the inferior performance of HNNs on standard recognition datasets can be attributed to the notorious vanishing gradient problem.
Our analysis leads to a simple yet effective solution called Feature Clipping, which regularizes the hyperbolic embedding whenever its norm exceeds a given threshold.
- Score: 32.42488915688723
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Non-Euclidean geometry with constant negative curvature, i.e., hyperbolic
space, has attracted sustained attention in the community of machine learning.
Hyperbolic space, owing to its ability to embed hierarchical structures
continuously with low distortion, has been applied for learning data with
tree-like structures. Hyperbolic Neural Networks (HNNs) that operate directly
in hyperbolic space have also been proposed recently to further exploit the
potential of hyperbolic representations. While HNNs have achieved better
performance than Euclidean neural networks (ENNs) on datasets with implicit
hierarchical structure, they still perform poorly on standard classification
benchmarks such as CIFAR and ImageNet. The traditional wisdom is that it is
critical for the data to respect the hyperbolic geometry when applying HNNs. In
this paper, we first conduct an empirical study showing that the inferior
performance of HNNs on standard recognition datasets can be attributed to the
notorious vanishing gradient problem. We further discovered that this problem
stems from the hybrid architecture of HNNs. Our analysis leads to a simple yet
effective solution called Feature Clipping, which regularizes the hyperbolic
embedding whenever its norm exceeds a given threshold. Our thorough
experiments show that the proposed method can successfully avoid the vanishing
gradient problem when training HNNs with backpropagation. The improved HNNs are
able to achieve comparable performance with ENNs on standard image recognition
datasets including MNIST, CIFAR10, CIFAR100 and ImageNet, while demonstrating
greater adversarial robustness and stronger out-of-distribution detection
capability.
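As a rough illustration of the Feature Clipping idea described in the abstract, the following minimal PyTorch sketch rescales a feature vector whenever its norm exceeds a threshold and then maps it into the Poincaré ball. The function names, the threshold value, and the exact placement of the clipping relative to the exponential map are assumptions made for this sketch, not details taken from the paper.

```python
import torch


def feature_clip(x: torch.Tensor, r_max: float = 2.0, eps: float = 1e-6) -> torch.Tensor:
    """Rescale each feature vector so its norm does not exceed r_max.

    Vectors with norm <= r_max are left unchanged; larger vectors are
    projected back onto the radius-r_max ball. r_max is an illustrative
    hyperparameter, not a value from the paper.
    """
    norm = x.norm(dim=-1, keepdim=True).clamp_min(eps)
    scale = torch.clamp(r_max / norm, max=1.0)  # equals 1.0 when norm <= r_max
    return x * scale


def to_poincare(x: torch.Tensor, c: float = 1.0, eps: float = 1e-6) -> torch.Tensor:
    """Exponential map at the origin of the Poincare ball with curvature -c."""
    norm = x.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(c ** 0.5 * norm) * x / (c ** 0.5 * norm)


# Hypothetical usage in a hybrid Euclidean/hyperbolic pipeline: clip the
# backbone feature before embedding it, so the hyperbolic point stays away
# from the boundary where gradients of the hyperbolic layers saturate.
features = torch.randn(8, 64) * 10            # large-norm backbone features
embedding = to_poincare(feature_clip(features))
```

Bounding the norm in this way keeps the embedding away from the boundary of the ball, which is where the gradient saturation described in the abstract occurs.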
Related papers
- Implicit Graph Neural Diffusion Networks: Convergence, Generalization,
and Over-Smoothing [7.984586585987328]
Implicit Graph Neural Networks (GNNs) have achieved significant success in addressing graph learning problems.
We introduce a geometric framework for designing implicit graph diffusion layers based on a parameterized graph Laplacian operator.
We show how implicit GNN layers can be viewed as the fixed-point equation of a Dirichlet energy minimization problem.
arXiv Detail & Related papers (2023-08-07T05:22:33Z) - Fully Hyperbolic Convolutional Neural Networks for Computer Vision [3.3964154468907486]
We present HCNN, a fully hyperbolic convolutional neural network (CNN) designed for computer vision tasks.
Based on the Lorentz model, we propose novel formulations of the convolutional layer, batch normalization, and multinomial logistic regression.
Experiments on standard vision tasks demonstrate the promising performance of our HCNN framework in both hybrid and fully hyperbolic settings.
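For context on the Lorentz model that HCNN builds on, here is a minimal sketch of the exponential map at the hyperboloid origin with constant curvature -1. This is the standard textbook formula rather than code from the HCNN paper, and the function name is ours.

```python
import torch


def lorentz_expmap_origin(v: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Map a tangent vector v in R^n at the hyperboloid origin (1, 0, ..., 0)
    onto the Lorentz model: returns (cosh||v||, sinh||v|| * v / ||v||)."""
    norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    time = torch.cosh(norm)               # time-like coordinate x0
    space = torch.sinh(norm) * v / norm   # space-like coordinates
    return torch.cat([time, space], dim=-1)
```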
arXiv Detail & Related papers (2023-03-28T12:20:52Z) - Training High-Performance Low-Latency Spiking Neural Networks by
Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which can achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z) - Learning A 3D-CNN and Transformer Prior for Hyperspectral Image
Super-Resolution [80.93870349019332]
We propose a novel HSISR method that uses Transformer instead of CNN to learn the prior of HSIs.
Specifically, we first use the gradient algorithm to solve the HSISR model, and then use an unfolding network to simulate the iterative solution processes.
arXiv Detail & Related papers (2021-11-27T15:38:57Z) - ACE-HGNN: Adaptive Curvature Exploration Hyperbolic Graph Neural Network [72.16255675586089]
We propose an Adaptive Curvature Exploration Hyperbolic Graph Neural Network named ACE-HGNN to adaptively learn the optimal curvature according to the input graph and downstream tasks.
Experiments on multiple real-world graph datasets demonstrate a significant and consistent performance improvement in model quality with competitive performance and good generalization ability.
arXiv Detail & Related papers (2021-10-15T07:18:57Z) - Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive
Benchmark Study [100.27567794045045]
Training deep graph neural networks (GNNs) is notoriously hard.
We present the first fair and reproducible benchmark dedicated to assessing the "tricks" of training deep GNNs.
arXiv Detail & Related papers (2021-08-24T05:00:37Z) - Hamiltonian Deep Neural Networks Guaranteeing Non-vanishing Gradients by
Design [2.752441514346229]
Vanishing and exploding gradients during weight optimization through backpropagation can make deep neural networks difficult to train.
We propose a general class of Hamiltonian DNNs (H-DNNs) that stem from the discretization of continuous-time Hamiltonian systems.
Our main result is that a broad set of H-DNNs ensures non-vanishing gradients by design for an arbitrary network depth.
The good performance of H-DNNs is demonstrated on benchmark classification problems, including image classification with the MNIST dataset.
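To make the construction concrete, below is a generic sketch of a Hamiltonian-inspired block using a symplectic-Euler (Verlet-style) discretization of a continuous-time system. It illustrates the general idea of deriving layers from a Hamiltonian flow; it is not the specific H-DNN parameterization proposed in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HamiltonianBlock(nn.Module):
    """One symplectic-Euler step acting on conjugate states (y, z).

    Illustrative only: the shared weight K and its transpose, and the tanh
    activation, follow the common Hamiltonian-network pattern rather than
    the exact H-DNN formulation of the cited paper.
    """

    def __init__(self, dim: int, step: float = 0.1):
        super().__init__()
        self.K = nn.Parameter(torch.randn(dim, dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(dim))
        self.step = step

    def forward(self, y: torch.Tensor, z: torch.Tensor):
        # The two states are updated with opposite signs; tying K and K^T
        # keeps the discrete map close to the energy-preserving flow.
        z = z - self.step * torch.tanh(F.linear(y, self.K, self.b))
        y = y + self.step * torch.tanh(F.linear(z, self.K.t(), self.b))
        return y, z
```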
arXiv Detail & Related papers (2021-05-27T14:52:22Z) - A unified framework for Hamiltonian deep neural networks [3.0934684265555052]
Training deep neural networks (DNNs) can be difficult due to vanishing/exploding gradients during weight optimization.
We propose a class of DNNs stemming from the time discretization of Hamiltonian systems.
The proposed Hamiltonian framework, besides encompassing existing networks inspired by marginally stable ODEs, allows one to derive new and more expressive architectures.
arXiv Detail & Related papers (2021-04-27T13:20:24Z) - Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z) - Overcoming Catastrophic Forgetting in Graph Neural Networks [50.900153089330175]
Catastrophic forgetting refers to the tendency of a neural network to "forget" previously learned knowledge upon learning new tasks.
We propose a novel scheme dedicated to overcoming this problem and hence strengthening continual learning in graph neural networks (GNNs).
At the heart of our approach is a generic module, termed topology-aware weight preserving (TWP).
arXiv Detail & Related papers (2020-12-10T22:30:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.