Learning Compositional Structures for Deep Learning: Why
Routing-by-agreement is Necessary
- URL: http://arxiv.org/abs/2010.01488v2
- Date: Tue, 6 Oct 2020 04:45:11 GMT
- Title: Learning Compositional Structures for Deep Learning: Why
Routing-by-agreement is Necessary
- Authors: Sai Raam Venkatraman, Ankit Anand, S. Balasubramanian, R. Raghunatha
Sarma
- Abstract summary: We present a formal grammar description of convolutional neural networks and capsule networks.
We show that routing is an important part of capsule networks -- effectively answering recent work that has questioned its necessity.
- Score: 4.10184810111551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A formal description of the compositionality of neural networks is tied
directly to the formal grammar-structure of the objects they seek to
represent. This grammar-structure specifies the kinds of components that
make up an object, as well as the configurations they are allowed to be in. In
other words, an object can be described as a parse-tree of its components -- a
structure that is a natural candidate for building connection-patterns
among neurons in neural networks. We present a formal grammar description of
convolutional neural networks and capsule networks that shows how capsule
networks can enforce such parse-tree structures, while CNNs do not.
Specifically, we show that the entropy of the routing coefficients in the dynamic
routing algorithm controls this ability. We therefore introduce the entropy of the
routing weights as a loss function for better compositionality among capsules.
Experiments on data with a compositional structure show that this loss
enables capsule networks to better detect changes in compositionality, and that
as the entropy of the routing weights increases, the ability to detect such
changes degrades. Without routing, capsule networks perform similarly to
convolutional neural networks: both models perform badly at detecting changes in
compositionality. Our results indicate that routing is an important part of
capsule networks -- effectively answering recent work that has questioned its
necessity. Further experiments on SmallNORB, CIFAR-10, and FashionMNIST
show that this loss keeps the accuracy of capsule-network models comparable to
that of models that do not use it.
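The entropy loss described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes routing logits of shape (num_lower, num_upper), with routing coefficients obtained, as in dynamic routing, by a softmax over the upper (parent) capsules; the function name and the epsilon constant are illustrative choices.

```python
import numpy as np

def routing_entropy_loss(logits):
    """Mean Shannon entropy of the routing coefficients (illustrative sketch).

    logits: array of shape (num_lower, num_upper), the raw routing logits
    b_ij from dynamic routing. Coefficients c_ij are the softmax of b_ij
    over the upper-capsule axis. Minimizing the mean row entropy pushes
    each lower capsule to commit to a single parent, i.e. toward a
    parse-tree-like connection pattern.
    """
    # Numerically stable softmax over the parent (upper-capsule) axis.
    z = logits - logits.max(axis=1, keepdims=True)
    c = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    # Shannon entropy of each lower capsule's coefficient distribution.
    ent = -(c * np.log(c + 1e-12)).sum(axis=1)
    return ent.mean()
```

Uniform logits give the maximum entropy log(num_upper), while sharply peaked logits give an entropy near zero; in training, this term would be added to the task loss with a weighting coefficient.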
Related papers
- The Cooperative Network Architecture: Learning Structured Networks as Representation of Sensory Patterns [3.9848584845601014]
We introduce the Cooperative Network Architecture (CNA), a model that represents sensory signals using structured, recurrently connected networks of neurons, termed "nets".
We demonstrate that net fragments can be learned without supervision and flexibly recombined to encode novel patterns, enabling figure completion and resilience to noise.
arXiv Detail & Related papers (2024-07-08T06:22:10Z)
- Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training.
It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby.
It can be combined with both discriminative and generative neural models.
arXiv Detail & Related papers (2024-05-12T22:18:25Z)
- Credit Assignment for Trained Neural Networks Based on Koopman Operator Theory [3.130109807128472]
Credit assignment problem of neural networks refers to evaluating the credit of each network component to the final outputs.
This paper presents an alternative perspective of linear dynamics on dealing with the credit assignment problem for trained neural networks.
Experiments conducted on typical neural networks demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-02T06:34:27Z)
- Robustcaps: a transformation-robust capsule network for image classification [6.445605125467574]
We present a deep neural network model that exhibits the desirable property of transformation-robustness.
Our model, termed RobustCaps, uses group-equivariant convolutions in an improved capsule network model.
It achieves state-of-the-art accuracies on CIFAR-10, FashionMNIST, and CIFAR-100 datasets.
arXiv Detail & Related papers (2022-10-20T08:42:33Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Learning low-rank latent mesoscale structures in networks [1.1470070927586016]
We present a new approach for describing low-rank mesoscale structures in networks.
We use several synthetic network models and empirical friendship, collaboration, and protein--protein interaction (PPI) networks.
We show how to denoise a corrupted network by using only the latent motifs that one learns directly from the corrupted network.
arXiv Detail & Related papers (2021-02-13T18:54:49Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Structured Convolutions for Efficient Neural Network Design [65.36569572213027]
We tackle model efficiency by exploiting redundancy in the implicit structure of the building blocks of convolutional neural networks.
We show how this decomposition can be applied to 2D and 3D kernels as well as the fully-connected layers.
arXiv Detail & Related papers (2020-08-06T04:38:38Z)
- Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup allowing for a neural network to learn both its size and topology during the course of a gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
- Investigating the Compositional Structure Of Deep Neural Networks [1.8899300124593645]
We introduce a novel theoretical framework based on the compositional structure of piecewise linear activation functions.
It is possible to characterize the instances of the input data with respect to both the predicted label and the specific (linear) transformation used to perform predictions.
Preliminary tests on the MNIST dataset show that our method can group input instances with regard to their similarity in the internal representation of the neural network.
arXiv Detail & Related papers (2020-02-17T14:16:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.