Hyperbolic Convolutional Neural Networks
- URL: http://arxiv.org/abs/2308.15639v1
- Date: Tue, 29 Aug 2023 21:20:16 GMT
- Title: Hyperbolic Convolutional Neural Networks
- Authors: Andrii Skliar, Maurice Weiler
- Abstract summary: Using non-Euclidean space for embedding data might result in more robust and explainable models.
We hypothesize that the ability of hyperbolic space to capture hierarchy in the data leads to better performance.
- Score: 14.35618845900589
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning is mostly responsible for the surge of interest in Artificial
Intelligence in the last decade. So far, deep learning researchers have been
particularly successful in the domain of image processing, where Convolutional
Neural Networks are used. Although excelling at image classification,
Convolutional Neural Networks impose no inductive bias on the embedding space
for images. A similar flaw is exhibited by another type of Convolutional
Network, Graph Convolutional Neural Networks. However,
using non-Euclidean space for embedding data might result in more robust and
explainable models. One example of such a non-Euclidean space is hyperbolic
space. Hyperbolic spaces are particularly useful due to their ability to fit
more data into a low-dimensional space and their tree-likeness properties. These
attractive properties have been exploited in multiple prior works, which showed
them to be beneficial for building hierarchical embeddings with shallow models
and, more recently, with MLPs and RNNs.
However, no work has yet suggested a general approach to using Hyperbolic
Convolutional Neural Networks for structured data processing, even though
structured data (such as images and graphs) are among the most common inputs in
practice. Therefore, the goal of this work is
to devise a general recipe for building Hyperbolic Convolutional Neural
Networks. We hypothesize that the ability of hyperbolic space to capture hierarchy
in the data would lead to better performance. This ability should be
particularly useful in cases where data has a tree-like structure. Since this
is the case for many existing datasets (e.g., WordNet, ImageNet, FB15k), we
argue that such a model would be advantageous both in terms of applications and
future research prospects.
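To make the geometry concrete, here is a minimal sketch, in plain NumPy, of the basic Poincare-ball operations from which hyperbolic layers are typically assembled: Mobius addition, the exponential map at the origin, and the geodesic distance. This is an illustrative sketch with an assumed curvature parameter c, not the implementation proposed in the paper.

```python
import numpy as np

def mobius_add(x, y, c=1.0):
    """Mobius addition: the hyperbolic analogue of vector addition on the Poincare ball."""
    xy, x2, y2 = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c**2 * x2 * y2
    return num / den

def expmap0(v, c=1.0):
    """Exponential map at the origin: lifts a Euclidean (tangent) vector onto the ball."""
    n = np.linalg.norm(v) + 1e-15
    return np.tanh(np.sqrt(c) * n) * v / (np.sqrt(c) * n)

def dist(x, y, c=1.0):
    """Geodesic distance; it diverges near the boundary of the ball, which is
    what gives room to embed tree-like data with low distortion."""
    return (2 / np.sqrt(c)) * np.arctanh(
        np.sqrt(c) * np.linalg.norm(mobius_add(-x, y, c)))
```

A hyperbolic layer can then be sketched as applying a Euclidean weight matrix in the tangent space and mapping the result back with expmap0; this is the general pattern behind the hyperbolic MLP and RNN layers mentioned above.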
Related papers
- Riemannian Residual Neural Networks [58.925132597945634] (arXiv 2023-10-16)
We show how to extend the residual neural network (ResNet) to general Riemannian manifolds.
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
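The core construction lends itself to a short sketch: where a Euclidean residual block computes x + f(x), a Riemannian residual block keeps iterates on the manifold by pushing the update through the exponential map. Below is a minimal illustration on the unit sphere; the manifold choice and function names are assumptions for the example, not the paper's API.

```python
import numpy as np

def project_tangent(x, u):
    """Project an ambient vector u onto the tangent space of the unit sphere at x."""
    return u - np.dot(x, u) * x

def exp_map_sphere(x, v):
    """Exponential map on the unit sphere: move from x along the tangent vector v."""
    n = np.linalg.norm(v)
    return x if n < 1e-12 else np.cos(n) * x + np.sin(n) * (v / n)

def riemannian_residual_step(x, f):
    """Riemannian analogue of the ResNet update x + f(x)."""
    return exp_map_sphere(x, project_tangent(x, f(x)))

x = np.array([1.0, 0.0, 0.0])
f = lambda x: 0.1 * np.array([0.0, 1.0, -1.0])   # a toy 'learned' update
x = riemannian_residual_step(x, f)               # result stays on the sphere
```

Stacking such steps yields a residual network whose hidden states never leave the manifold.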
- Dynamical systems' based neural networks [0.7874708385247353] (arXiv 2022-10-05)
We build neural networks using a suitable, structure-preserving, numerical time-discretisation.
The structure of the neural network is then inferred from the properties of the ODE vector field.
We present two universal approximation results and demonstrate how to impose some particular properties on the neural networks.
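As a deliberately simple illustration of the recipe, the sketch below builds each layer as one explicit-Euler step of a learned vector field, so that a property imposed on the ODE (here, a contraction-encouraging -x term) carries over to the network. The field and step size are illustrative choices, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)) / np.sqrt(8)   # 'learned' parameters of the vector field
b = rng.normal(size=8)

def vector_field(x):
    """Right-hand side of the ODE x'(t) = f(x); the -x term encourages contraction."""
    return np.tanh(W @ x + b) - x

def euler_layer(x, h=0.1):
    """One forward-Euler step x_{k+1} = x_k + h * f(x_k), read as a residual layer."""
    return x + h * vector_field(x)

x = rng.normal(size=8)
for _ in range(10):        # ten layers = ten time steps of the discretised ODE
    x = euler_layer(x)
```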
- Convolutional Neural Networks on Manifolds: From Graphs and Back [122.06927400759021] (arXiv 2022-10-01)
We propose a manifold neural network (MNN) composed of a bank of manifold convolutional filters and point-wise nonlinearities.
In short, we treat the manifold model as the limit of large graphs when constructing MNNs, and we recover graph neural networks by discretizing MNNs.
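The graph direction of this correspondence has a compact form: on a graph sampled from the manifold, a manifold convolution becomes a polynomial filter in the graph Laplacian followed by a pointwise nonlinearity. The sketch below is an illustrative discretization; the Laplacian normalization and filter order are assumptions.

```python
import numpy as np

def graph_filter(x, L, h):
    """Polynomial graph filter sum_k h[k] * L^k x, the discretized manifold convolution."""
    out, Lkx = np.zeros_like(x), x.copy()
    for hk in h:
        out += hk * Lkx
        Lkx = L @ Lkx
    return out

def mnn_layer(x, L, h):
    """One layer: graph filter followed by a pointwise nonlinearity (ReLU)."""
    return np.maximum(graph_filter(x, L, h), 0.0)

# toy graph: a 5-node path; L is its combinatorial Laplacian
A = np.diag(np.ones(4), 1); A += A.T
L = np.diag(A.sum(1)) - A
y = mnn_layer(np.ones(5), L, h=[0.5, 0.3, 0.2])
```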
- Transfer Learning with Deep Tabular Models [66.67017691983182] (arXiv 2022-06-30)
We show that upstream data gives tabular neural networks a decisive advantage over GBDT models.
We propose a realistic medical diagnosis benchmark for tabular transfer learning.
We propose a pseudo-feature method for cases where the upstream and downstream feature sets differ.
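One plausible reading of the pseudo-feature idea can be sketched in a few lines: a feature that exists only in the downstream table is imputed on the upstream table from the columns the two tables share, so that pre-training can use an aligned feature set. The linear imputer and the variable names below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
shared_up = rng.normal(size=(500, 6))            # upstream rows, shared columns only
shared_down = rng.normal(size=(100, 6))          # downstream rows, shared columns
extra_down = shared_down @ rng.normal(size=6)    # a column the upstream table lacks

def fit_imputer(X, y):
    """Least-squares model predicting the missing feature from the shared features."""
    Xb = np.hstack([X, np.ones((len(X), 1))])    # add a bias column
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ w

impute = fit_imputer(shared_down, extra_down)
X_up = np.hstack([shared_up, impute(shared_up)[:, None]])   # aligned feature sets
```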
- CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded Systems [0.0] (arXiv 2021-12-01)
A Convolutional Neural Network (CNN) is a class of Deep Neural Network (DNN) widely used in the analysis of visual images captured by an image sensor.
In this paper, we propose a new variant of the deep convolutional neural network architecture that improves the performance of existing CNNs for real-time inference on embedded systems.
- Dynamic Inference with Neural Interpreters [72.90231306252007] (arXiv 2021-10-12)
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to new tasks in a sample-efficient manner.
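A toy sketch of the routing idea: a pool of small function modules and a learned affinity score that decides, per input, how to weight each module. The softmax gating below is a simplification of the paper's signature-based routing, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_funcs = 16, 4
funcs = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(n_funcs)]  # function modules
signatures = rng.normal(size=(n_funcs, d))       # routing vectors, learned end-to-end

def route(x):
    """Softly route x through the modules, weighted by signature affinity."""
    s = signatures @ x
    w = np.exp(s - s.max()); w /= w.sum()        # softmax over modules
    return sum(wi * np.tanh(F @ x) for wi, F in zip(w, funcs))

x = rng.normal(size=d)
for _ in range(3):                               # a few rounds of routed computation
    x = route(x)
```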
- The Separation Capacity of Random Neural Networks [78.25060223808936] (arXiv 2021-07-31)
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can, with high probability, make two well-separated classes linearly separable.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
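The claim is easy to probe empirically. The sketch below maps a dataset that is not linearly separable (a blob inside a ring) through a random ReLU layer with Gaussian weights and uniform biases, then runs a perceptron to check whether the random features made it linearly separable. Sizes and distributions beyond those named in the summary are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, width = 100, 2, 512
theta = rng.uniform(0, 2 * np.pi, n)
X = np.vstack([rng.normal(0, 0.4, (n, d)),                  # inner blob
               3 * np.c_[np.cos(theta), np.sin(theta)]])    # surrounding ring
y = np.hstack([np.ones(n), -np.ones(n)])

W = rng.standard_normal((d, width))      # standard Gaussian weights
b = rng.uniform(-1, 1, width)            # uniformly distributed biases
Phi = np.maximum(X @ W + b, 0)           # random two-layer ReLU features

w = np.zeros(width)                      # perceptron finds a separator iff one exists
for _ in range(5000):
    bad = np.flatnonzero(y * (Phi @ w) <= 0)
    if bad.size == 0:
        break
    w += y[bad[0]] * Phi[bad[0]]
print("separable after random ReLU layer:", bad.size == 0)
```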
- Deep Polynomial Neural Networks [77.70761658507507] (arXiv 2020-06-20)
$\Pi$-Nets are a new class of function approximators based on polynomial expansions.
$\Pi$-Nets produce state-of-the-art results in three challenging tasks: image generation, face verification, and 3D mesh representation learning.
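The expansion has a compact recursive form: taking Hadamard products between linear transformations of the input and the running representation raises the polynomial degree by one per block. The sketch below follows this general product-of-affine-maps pattern; sizes and the exact parametrization are illustrative, not the paper's models.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, depth = 8, 16, 3
A = [rng.normal(size=(k, d)) / np.sqrt(d) for _ in range(depth)]
C = rng.normal(size=(1, k)) / np.sqrt(k)

def pi_net(z):
    """Output is a degree-`depth` polynomial of z:
    x_1 = A_1 z;  x_n = (A_n z) * x_{n-1} + x_{n-1};  y = C x_depth."""
    x = A[0] @ z
    for An in A[1:]:
        x = (An @ z) * x + x        # Hadamard product raises the degree by one
    return C @ x

print(pi_net(rng.normal(size=d)))
```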
- An Overview of Neural Network Compression [2.550900579709111] (arXiv 2020-06-05)
In recent years there has been a resurgence in model compression techniques, particularly for deep convolutional neural networks and self-attention based networks such as the Transformer.
This paper provides a timely overview of both old and current compression techniques for deep neural networks, including pruning, quantization, tensor decomposition, knowledge distillation and combinations thereof.
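Of the listed techniques, unstructured magnitude pruning is the simplest to make concrete; below is a minimal framework-agnostic sketch (function names assumed).

```python
import numpy as np

def magnitude_prune(W, sparsity=0.9):
    """Zero out the smallest-magnitude weights; returns the pruned weights and the
    binary mask that would be frozen during subsequent fine-tuning."""
    k = int(sparsity * W.size)
    thresh = np.partition(np.abs(W).ravel(), k)[k]   # k-th smallest magnitude
    mask = np.abs(W) >= thresh
    return W * mask, mask

rng = np.random.default_rng(0)
W_pruned, mask = magnitude_prune(rng.normal(size=(64, 64)))
print(f"weights kept: {mask.mean():.1%}")
```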
- Localized convolutional neural networks for geospatial wind forecasting [0.0] (arXiv 2020-05-12)
Convolutional Neural Networks (CNNs) have attractive properties for many kinds of spatial data.
In this work, we propose localized convolutional neural networks that enable CNNs to learn local features in addition to the global ones.
They can be added to any convolutional layer, are easily trained end-to-end, introduce minimal additional complexity, and let CNNs retain most of their benefits to the extent that these are needed.
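A minimal sketch of one way to obtain such local features, assuming the simplest variant: a standard (weight-shared, global) convolution augmented with an untied per-location parameter map that can absorb location-specific effects such as terrain. The parametrization is an illustrative assumption, not the paper's exact layer.

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
H, W = 32, 32
kernel = rng.normal(size=(3, 3)) / 3.0     # shared weights: the usual global convolution
local_bias = np.zeros((H, W))              # untied per-location parameters, learned

def localized_conv(x):
    """Global convolution plus a learned location-specific additive term."""
    return convolve2d(x, kernel, mode="same") + local_bias

y = localized_conv(rng.normal(size=(H, W)))
```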
- A Convolutional Neural Network into graph space [5.6326241162252755] (arXiv 2020-02-20)
We propose a new convolutional neural network architecture, defined directly in graph space.
We show its usability in a back-propagation context.
It is robust to changes of graph domain and improves over other Euclidean and non-Euclidean convolutional architectures.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.