Conditional Information Gain Trellis
- URL: http://arxiv.org/abs/2402.08345v2
- Date: Mon, 8 Jul 2024 14:18:44 GMT
- Title: Conditional Information Gain Trellis
- Authors: Ufuk Can Bicici, Tuna Han Salih Meral, Lale Akarun
- Abstract summary: Conditional computing processes an input using only part of the neural network's computational units.
We use a Trellis-based approach for generating specific execution paths in a deep convolutional neural network.
We show that our conditional execution mechanism achieves comparable or better model performance compared to unconditional baselines.
- Score: 1.290382979353427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conditional computing processes an input using only part of the neural network's computational units. Learning to execute parts of a deep convolutional network by routing individual samples has several advantages: the most obvious is a reduced computational burden. Furthermore, if similar classes are routed to the same path, that part of the network learns to discriminate between finer differences, so better classification accuracy can be attained with fewer parameters. Recently, several papers have exploited this idea, either taking a particular child of a node in a tree-shaped network or skipping parts of a network. In this work, we follow a Trellis-based approach for generating specific execution paths in a deep convolutional neural network. We design routing mechanisms that use differentiable information gain-based cost functions to determine which subset of features in a convolutional layer will be executed. We call our method Conditional Information Gain Trellis (CIGT). We show that our conditional execution mechanism achieves comparable or better performance than unconditional baselines while using only a fraction of the computational resources.
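As a concrete illustration of the routing idea, the sketch below implements a soft router whose loss is the negative mutual information (information gain) between routing decisions and class labels, in the spirit of CIGT; the gating network, its sizes, and the loss form are illustrative assumptions rather than the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InformationGainRouter(nn.Module):
    """Soft routing over `num_routes` paths with an information-gain loss
    that rewards routes that are both balanced and class-pure."""

    def __init__(self, in_channels: int, num_routes: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),   # global context per sample
            nn.Flatten(),
            nn.Linear(in_channels, num_routes),
        )

    def forward(self, feats: torch.Tensor, labels: torch.Tensor, num_classes: int):
        # p(route | x): soft, differentiable routing decision per sample.
        p_route = F.softmax(self.gate(feats), dim=1)      # (N, R)
        y = F.one_hot(labels, num_classes).float()        # (N, C)
        # Batch estimate of the joint distribution p(class, route).
        p_joint = y.t() @ p_route / feats.size(0)         # (C, R)
        p_c, p_r = p_joint.sum(dim=1), p_joint.sum(dim=0)
        eps = 1e-9
        entropy = lambda p: -(p * (p + eps).log()).sum()
        # Mutual information I(class; route) = H(c) + H(r) - H(c, r).
        info_gain = entropy(p_c) + entropy(p_r) - entropy(p_joint)
        return p_route, -info_gain   # minimizing the loss maximizes IG
```

In a full CIGT-style model this loss would be weighted against the classification loss, and at inference each sample would follow its argmax route so that only the selected subset of convolutional units executes.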
Related papers
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without significant computational overhead.
We evaluate our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
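A minimal sketch of the memory-token idea, assuming a transformer-style token interface; the token count, width, and residual placement are illustrative, not the paper's design:

```python
import torch
import torch.nn as nn

class MemoryAugmentedAttention(nn.Module):
    def __init__(self, dim: int = 64, num_memory: int = 16, num_heads: int = 4):
        super().__init__()
        # Learnable memory bank shared across all inputs.
        self.memory = nn.Parameter(torch.randn(num_memory, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim). Queries come from the input; keys/values are
        # the input tokens concatenated with the memory tokens.
        mem = self.memory.unsqueeze(0).expand(x.size(0), -1, -1)
        kv = torch.cat([x, mem], dim=1)
        out, _ = self.attn(x, kv, kv)
        return out + x  # residual connection

tokens = torch.randn(8, 20, 64)
print(MemoryAugmentedAttention()(tokens).shape)  # torch.Size([8, 20, 64])
```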
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
- Neural Network Pruning Through Constrained Reinforcement Learning [3.2880869992413246]
We propose a general methodology for pruning neural networks.
Our proposed methodology can prune neural networks to respect pre-defined computational budgets.
We demonstrate the effectiveness of our approach via comparison with state-of-the-art methods on standard image classification datasets.
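One common way to express such a budget constraint is a penalized reward; the penalty form and coefficient below are illustrative assumptions, not necessarily the paper's formulation:

```python
def constrained_reward(accuracy: float, flops: float,
                       budget: float, penalty: float = 10.0) -> float:
    """Reward accuracy, but subtract a penalty proportional to how far the
    pruned network's FLOPs exceed the pre-defined budget."""
    violation = max(0.0, flops / budget - 1.0)
    return accuracy - penalty * violation

# An RL agent choosing per-layer pruning ratios would maximize this reward:
print(constrained_reward(accuracy=0.91, flops=1.2e9, budget=1.0e9))  # -1.09
```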
arXiv Detail & Related papers (2021-10-16T11:57:38Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, showing better performance in terms of both accuracy and computational cost.
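A minimal sketch of the instance-wise (dynamic) filter gating that underlies such methods; the manifold-regularization term itself is not reproduced, and the gating network is an illustrative assumption:

```python
import torch
import torch.nn as nn

class DynamicChannelGate(nn.Module):
    """Keeps only the top-k channels per instance; pruned filters are zeroed."""

    def __init__(self, channels: int, keep_ratio: float = 0.5):
        super().__init__()
        self.scorer = nn.Linear(channels, channels)
        self.k = max(1, int(channels * keep_ratio))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W). Score channels per instance from global context.
        ctx = x.mean(dim=(2, 3))                       # (N, C)
        scores = self.scorer(ctx)                      # (N, C)
        topk = scores.topk(self.k, dim=1).indices
        mask = torch.zeros_like(scores).scatter_(1, topk, 1.0)
        # Note: hard top-k is non-differentiable w.r.t. the scorer; training
        # would use a soft or Gumbel-style relaxation of this mask.
        return x * mask.unsqueeze(-1).unsqueeze(-1)
```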
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- The Connection Between Approximation, Depth Separation and Learnability in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z)
- Inductive Graph Embeddings through Locality Encodings [0.42970700836450487]
We look at the problem of finding inductive network embeddings in large networks without domain-dependent node/edge attributes.
We propose to use a set of basic predefined local encodings as the basis of a learning algorithm.
This method achieves state-of-the-art performance in tasks such as role detection, link prediction and node classification.
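As one hedged example of a "basic predefined local encoding", the sketch below uses the log-degree histogram of a node's 1-hop neighborhood, a purely structural feature that transfers to unseen graphs; the exact encodings in the paper may differ:

```python
import networkx as nx
import numpy as np

def local_encoding(g: nx.Graph, node, num_bins: int = 8) -> np.ndarray:
    """Encode a node by the log-degree histogram of its neighbors,
    an attribute-free structural feature usable inductively."""
    degs = [g.degree(n) for n in g.neighbors(node)]
    logd = np.log1p(degs) if degs else np.zeros(1)
    hist, _ = np.histogram(logd, bins=num_bins, range=(0.0, 10.0))
    return hist / max(1, len(degs))  # normalize by neighborhood size

g = nx.karate_club_graph()
print(local_encoding(g, 0))
```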
arXiv Detail & Related papers (2020-09-26T13:09:11Z)
- ReMarNet: Conjoint Relation and Margin Learning for Small-Sample Image Classification [49.87503122462432]
We introduce a novel neural network termed Relation-and-Margin learning Network (ReMarNet).
Our method assembles two networks with different backbones to learn features that perform well under both classification mechanisms (relation-based and margin-based classification).
Experiments on four image datasets demonstrate that our approach is effective in learning discriminative features from a small set of labeled samples.
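A minimal sketch of the two-backbone assembly, with toy backbones and heads standing in for the paper's architecture:

```python
import torch
import torch.nn as nn

class DualBackboneNet(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        self.backbone_a = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.backbone_b = nn.Sequential(nn.Conv2d(3, 16, 5, padding=2), nn.ReLU(),
                                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.classifier = nn.Linear(32, num_classes)   # margin-based head
        self.embed = nn.Linear(32, 32)                 # relation/metric head

    def forward(self, x: torch.Tensor):
        # Concatenate features from both backbones; each head consumes the
        # shared representation under its own classification mechanism.
        f = torch.cat([self.backbone_a(x), self.backbone_b(x)], dim=1)
        return self.classifier(f), self.embed(f)
```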
arXiv Detail & Related papers (2020-06-27T13:50:20Z)
- Online Sequential Extreme Learning Machines: Features Combined From Hundreds of Midlayers [0.0]
In this paper, we develop an algorithm called the hierarchical online sequential learning algorithm (H-OS-ELM).
The algorithm can learn chunk by chunk with fixed or varying block size.
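For reference, the classic OS-ELM chunk update is a recursive least-squares step on the output weights; the sketch below shows that base update (the hierarchical, many-midlayer structure of H-OS-ELM is not reproduced):

```python
import numpy as np

class OSELM:
    def __init__(self, n_in: int, n_hidden: int, n_out: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(n_in, n_hidden))  # fixed random input weights
        self.b = rng.normal(size=n_hidden)
        self.beta = np.zeros((n_hidden, n_out))     # learned output weights
        # Large initial P approximates the usual (H0^T H0)^-1 initialization.
        self.P = np.eye(n_hidden) * 1e3

    def _hidden(self, x: np.ndarray) -> np.ndarray:
        return np.tanh(x @ self.w + self.b)

    def partial_fit(self, x: np.ndarray, t: np.ndarray) -> None:
        """Update output weights with one chunk (fixed or varying block size)."""
        h = self._hidden(x)
        k = np.eye(len(x)) + h @ self.P @ h.T
        self.P -= self.P @ h.T @ np.linalg.solve(k, h @ self.P)
        self.beta += self.P @ h.T @ (t - h @ self.beta)

    def predict(self, x: np.ndarray) -> np.ndarray:
        return self._hidden(x) @ self.beta
```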
arXiv Detail & Related papers (2020-06-12T00:50:04Z)
- Finite Difference Neural Networks: Fast Prediction of Partial Differential Equations [5.575293536755126]
We propose a novel neural network framework, finite difference neural networks (FDNet), in which a finite-difference-inspired network learns the underlying governing partial differential equations from trajectory data.
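The core idea can be sketched as a convolution acting as a learnable difference stencil inside an explicit time step; the architecture below is an illustrative assumption, not the paper's FDNet:

```python
import torch
import torch.nn as nn

class FiniteDifferenceStep(nn.Module):
    def __init__(self, stencil_size: int = 5):
        super().__init__()
        # Learnable stencil: analogous to e.g. [1, -2, 1] for u_xx.
        self.stencil = nn.Conv1d(1, 1, stencil_size,
                                 padding=stencil_size // 2, bias=False)

    def forward(self, u: torch.Tensor, dt: float = 0.01) -> torch.Tensor:
        # u: (batch, 1, grid). Explicit Euler step u_{t+1} = u_t + dt * F(u_t),
        # where F is the learned spatial operator fit to trajectory data.
        return u + dt * self.stencil(u)

u0 = torch.randn(4, 1, 128)
print(FiniteDifferenceStep()(u0).shape)  # torch.Size([4, 1, 128])
```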
arXiv Detail & Related papers (2020-06-02T19:17:58Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
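A minimal sketch of such a GCN performance predictor, with illustrative feature sizes and readout:

```python
import torch
import torch.nn as nn

class GCNPredictor(nn.Module):
    def __init__(self, op_feat: int = 8, hidden: int = 32):
        super().__init__()
        self.w1 = nn.Linear(op_feat, hidden)
        self.w2 = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, adj: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # adj: (nodes, nodes) normalized adjacency of the architecture DAG;
        # x: (nodes, op_feat) one-hot operation encodings.
        h = torch.relu(adj @ self.w1(x))   # one round of neighborhood mixing
        h = torch.relu(adj @ self.w2(h))
        return self.out(h.mean(dim=0))     # graph readout -> predicted accuracy
```

Rank correlation (e.g. Kendall tau) between predicted and measured accuracy of held-out sub-networks is the usual way such predictors are evaluated.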
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
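A hedged sketch of the kind of FLOPs bookkeeping involved; the exact definition of the FLOPs utilization ratio in the paper may differ from the illustrative one below:

```python
def conv_flops(c_in: int, c_out: int, k: int, h: int, w: int) -> float:
    """Multiply-accumulate count of a k x k convolution on an h x w map."""
    return c_in * c_out * k * k * h * w

def flops_utilization(accuracy_gain: float, layer_flops: float,
                      total_flops: float) -> float:
    # Accuracy contribution per unit of the FLOPs budget a layer consumes:
    # low-utilization layers are candidates for channel reduction.
    return accuracy_gain / (layer_flops / total_flops)

total = conv_flops(64, 128, 3, 56, 56) + conv_flops(128, 256, 3, 28, 28)
print(flops_utilization(0.004, conv_flops(64, 128, 3, 56, 56), total))
```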
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.