Gradient-based Competitive Learning: Theory
- URL: http://arxiv.org/abs/2009.02799v1
- Date: Sun, 6 Sep 2020 19:00:51 GMT
- Title: Gradient-based Competitive Learning: Theory
- Authors: Giansalvo Cirrincione, Pietro Barbiero, Gabriele Ciravegna, Vincenzo
Randazzo
- Abstract summary: This paper introduces a novel perspective in this area by combining gradient-based and competitive learning.
The theory is based on the intuition that neural networks are able to learn topological structures by working directly on the transpose of the input matrix.
The proposed approach has a great potential as it can be generalized to a vast selection of topological learning tasks.
- Score: 1.6752712949948443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has been widely used for supervised learning and
classification/regression problems. Recently, a novel area of research has
applied this paradigm to unsupervised tasks; indeed, a gradient-based approach
extracts, efficiently and autonomously, the relevant features for handling
input data. However, state-of-the-art techniques focus mostly on algorithmic
efficiency and accuracy rather than on mimicking the input manifold. By contrast,
competitive learning is a powerful tool for replicating the input distribution
topology. This paper introduces a novel perspective in this area by combining
these two techniques: unsupervised gradient-based and competitive learning. The
theory is based on the intuition that neural networks are able to learn
topological structures by working directly on the transpose of the input
matrix. To this end, the vanilla competitive layer and its dual are
presented. The former is just an adaptation of a standard competitive layer for
deep clustering, while the latter is trained on the transposed matrix. Their
equivalence is extensively proven both theoretically and experimentally.
However, the dual layer is better suited for handling very high-dimensional
datasets. The proposed approach has great potential, as it can be generalized
to a vast selection of topological learning tasks, such as non-stationary and
hierarchical clustering; furthermore, it can also be integrated within more
complex architectures such as autoencoders and generative adversarial networks.
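To make the contrast concrete, the sketch below illustrates the intuition from the abstract: a vanilla competitive layer whose prototype matrix is a free (k x d) parameter, versus a dual layer whose prototypes are built from the transposed input as P = A X, so that only a (k x n) mixing matrix is trained. This is a minimal illustration of the idea, not the authors' implementation; the quantization loss, the P = A X parameterization, and all shapes and hyperparameters are assumptions chosen for the example.

```python
# Minimal sketch (not the authors' reference code) contrasting a vanilla
# gradient-based competitive layer with a "dual" layer trained on the
# transposed input matrix. The P = A @ X parameterization of the dual
# prototypes is an assumption made for illustration.
import torch

n, d, k = 200, 1000, 5            # samples, (high) input dimension, prototypes
X = torch.randn(n, d)             # input matrix

def quantization_loss(P, X):
    """Mean squared distance from each sample to its nearest prototype."""
    dists = torch.cdist(X, P)                 # (n, k) pairwise distances
    return (dists.min(dim=1).values ** 2).mean()

# Vanilla competitive layer: prototypes are free parameters in R^{k x d}.
P_vanilla = torch.randn(k, d, requires_grad=True)
opt_v = torch.optim.SGD([P_vanilla], lr=1e-2)

# Dual competitive layer: prototypes are linear combinations of the rows of X,
# P = A @ X, i.e. a linear layer acting on the transposed input X^T. Only the
# (k x n) mixing matrix A is trained. The small random init keeps the initial
# prototypes near the data centre while breaking symmetry between them.
A = (torch.randn(k, n) / n).requires_grad_()
opt_d = torch.optim.SGD([A], lr=1e-2)

for step in range(100):
    opt_v.zero_grad()
    loss_v = quantization_loss(P_vanilla, X)
    loss_v.backward()
    opt_v.step()

    opt_d.zero_grad()
    P_dual = A @ X                            # (k, n) @ (n, d) -> (k, d)
    loss_d = quantization_loss(P_dual, X)
    loss_d.backward()
    opt_d.step()

print(f"vanilla loss: {loss_v.item():.3f}  dual loss: {loss_d.item():.3f}")
```

Because only A is trained, the number of trainable parameters in the dual layer grows with the number of samples rather than with the input dimension, which matches the abstract's claim that the dual layer is better suited to very high-dimensional datasets.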
Related papers
- Semi-adaptive Synergetic Two-way Pseudoinverse Learning System [8.16000189123978]
We propose a semi-adaptive synergetic two-way pseudoinverse learning system.
Each subsystem encompasses forward learning, backward learning, and feature concatenation modules.
The whole system is trained using a non-gradient descent learning algorithm.
arXiv Detail & Related papers (2024-06-27T06:56:46Z) - Understanding Deep Representation Learning via Layerwise Feature
Compression and Discrimination [33.273226655730326]
We show that each layer of a deep linear network progressively compresses within-class features at a geometric rate and discriminates between-class features at a linear rate.
This is the first quantitative characterization of feature evolution in hierarchical representations of deep linear networks.
arXiv Detail & Related papers (2023-11-06T09:00:38Z) - Modular Neural Network Approaches for Surgical Image Recognition [0.0]
We introduce and evaluate different architectures of modular learning for Dorsal Capsulo-Scapholunate Septum (DCSS) instability classification.
Our experiments show that modular learning improves performance compared to non-modular systems.
In the second part, we present our approach for data labeling and segmentation with self-training applied to shoulder arthroscopy images.
arXiv Detail & Related papers (2023-07-17T22:28:16Z) - Dynamic Perceiver for Efficient Visual Recognition [87.08210214417309]
We propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task.
A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks.
Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.
arXiv Detail & Related papers (2023-06-20T03:00:22Z) - Hidden Classification Layers: Enhancing linear separability between
classes in neural networks layers [0.0]
We investigate the impact of a training approach on deep network performance.
We propose a neural network architecture which induces an error function involving the outputs of all the network layers.
arXiv Detail & Related papers (2023-06-09T10:52:49Z) - Bi-level Alignment for Cross-Domain Crowd Counting [113.78303285148041]
Current methods rely on external data for training an auxiliary task or apply an expensive coarse-to-fine estimation.
We develop a new adversarial learning based method, which is simple and efficient to apply.
We evaluate our approach on five real-world crowd counting benchmarks, where we outperform existing approaches by a large margin.
arXiv Detail & Related papers (2022-05-12T02:23:25Z) - Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z) - Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth-2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and are transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Topological Gradient-based Competitive Learning [1.6752712949948443]
This work presents a novel, comprehensive theory aimed at bridging competitive learning with gradient-based learning.
We fully demonstrate the theoretical equivalence of two novel gradient-based competitive layers.
Preliminary experiments show how the dual approach, trained on the transpose of the input matrix, leads to a faster convergence rate and higher training accuracy in both low- and high-dimensional scenarios.
arXiv Detail & Related papers (2020-08-21T13:44:38Z) - Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.