A Closer Look at Branch Classifiers of Multi-exit Architectures
- URL: http://arxiv.org/abs/2204.13347v1
- Date: Thu, 28 Apr 2022 08:37:25 GMT
- Title: A Closer Look at Branch Classifiers of Multi-exit Architectures
- Authors: Shaohui Lin, Bo Ji, Rongrong Ji, Angela Yao
- Abstract summary: Constant-complexity branching keeps all branches the same, while complexity-increasing and complexity-decreasing branching place more complex branches later or earlier in the backbone respectively.
We investigate a cause by using knowledge consistency to probe the effect of adding branches onto a backbone.
Our findings show that complexity-decreasing branching yields the least disruption to the feature abstraction hierarchy of the backbone.
- Score: 103.27533521196817
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-exit architectures consist of a backbone and branch classifiers that
offer shortened inference pathways to reduce the run-time of deep neural
networks. In this paper, we analyze different branching patterns that vary in
their allocation of computational complexity for the branch classifiers.
Constant-complexity branching keeps all branches the same, while
complexity-increasing and complexity-decreasing branching place more complex
branches later or earlier in the backbone respectively. Through extensive
experimentation on multiple backbones and datasets, we find that
complexity-decreasing branches are more effective than constant-complexity or
complexity-increasing branches, achieving the best accuracy-cost trade-off.
We investigate a cause by using knowledge consistency to probe the effect of
adding branches onto a backbone. Our findings show that complexity-decreasing
branching yields the least disruption to the feature abstraction hierarchy of
the backbone, which explains the effectiveness of the branching patterns.
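The complexity-decreasing pattern and the early-exit mechanism described in the abstract can be illustrated with a toy sketch: each backbone stage feeds a branch classifier, branch depth shrinks with stage depth (more complex branches earlier, per the complexity-decreasing pattern), and inference stops at the first branch whose confidence clears a threshold. All names, the dummy stages, and the confidence-threshold exit rule are illustrative assumptions, not the paper's actual implementation.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

class MultiExitNet:
    """Toy multi-exit model: a backbone of stages, each followed by a
    branch classifier (hypothetical sketch, not the paper's code)."""

    def __init__(self, stages, branches):
        self.stages = stages      # backbone feature extractors
        self.branches = branches  # per-stage classifier heads

    def infer(self, x, threshold=0.8):
        """Run stage by stage; exit at the first branch whose top
        confidence reaches `threshold`. Returns (probs, exit_index)."""
        feat = x
        for i, (stage, branch) in enumerate(zip(self.stages, self.branches)):
            feat = stage(feat)
            probs = softmax(branch(feat))
            if max(probs) >= threshold or i == len(self.stages) - 1:
                return probs, i

# Toy instantiation: branch "complexity" is just the number of chained
# transforms, and it decreases with backbone depth (3, 2, 1), mirroring
# the complexity-decreasing branching pattern.
stages = [lambda v: [u * 2 for u in v]] * 3  # dummy backbone stages

def make_branch(depth):
    def head(feat):
        out = feat
        for _ in range(depth):   # deeper head = more "complex" branch
            out = [u + 1 for u in out]
        return out[:2]           # two-class logits
    return head

branches = [make_branch(d) for d in (3, 2, 1)]

net = MultiExitNet(stages, branches)
probs, exit_idx = net.infer([2.0, 0.0], threshold=0.8)
# A confident input exits at the first (most complex) branch,
# skipping the rest of the backbone and saving compute.
```

The run-time saving comes from the control flow in `infer`: later stages and branches are simply never evaluated when an early branch is already confident, which is the shortened inference pathway the abstract refers to.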
Related papers
- Low-Cost Self-Ensembles Based on Multi-Branch Transformation and Grouped Convolution [20.103367702014474]
We propose a new low-cost ensemble learning to achieve high efficiency and classification performance.
For training, we employ knowledge distillation using the ensemble of the outputs as the teacher signal.
Experimental results show that our method achieves state-of-the-art classification accuracy and higher uncertainty estimation performance.
arXiv Detail & Related papers (2024-08-05T08:36:13Z) - TreeDQN: Learning to minimize Branch-and-Bound tree [78.52895577861327]
Branch-and-Bound is a convenient approach to solving optimization tasks in the form of Mixed Integer Linear Programs.
The efficiency of the solver depends on the branching rule used to select a variable for splitting.
We propose a reinforcement learning method that can efficiently learn the branching.
arXiv Detail & Related papers (2023-06-09T14:01:26Z) - Improving Out-of-Distribution Generalization of Neural Rerankers with
Contextualized Late Interaction [52.63663547523033]
Late interaction, the simplest form of multi-vector, is also helpful to neural rerankers that only use the [] vector to compute the similarity score.
We show that the finding is consistent across different model sizes and first-stage retrievers of diverse natures.
arXiv Detail & Related papers (2023-02-13T18:42:17Z) - Reinforcement Learning for Branch-and-Bound Optimisation using
Retrospective Trajectories [72.15369769265398]
Machine learning has emerged as a promising paradigm for branching.
We propose retro branching; a simple yet effective approach to RL for branching.
We outperform the current state-of-the-art RL branching algorithm by 3-5x and come within 20% of the best IL method's performance on MILPs with 500 constraints and 1000 variables.
arXiv Detail & Related papers (2022-05-28T06:08:07Z) - Learning to branch with Tree MDPs [6.754135838894833]
We propose to learn branching rules from scratch via Reinforcement Learning (RL).
We propose tree Markov Decision Processes, or tree MDPs, a generalization of temporal MDPs that provides a more suitable framework for learning to branch.
We demonstrate through computational experiments that tree MDPs improve the learning convergence, and offer a promising framework for tackling the learning-to-branch problem in MILPs.
arXiv Detail & Related papers (2022-05-23T07:57:32Z) - Structural Analysis of Branch-and-Cut and the Learnability of Gomory
Mixed Integer Cuts [88.94020638263467]
The incorporation of cutting planes within the branch-and-bound algorithm, known as branch-and-cut, forms the backbone of modern integer programming solvers.
We conduct a novel structural analysis of branch-and-cut that pins down how every step of the algorithm is affected by changes in the parameters defining the cutting planes added to the input integer program.
Our main application of this analysis is to derive sample complexity guarantees for using machine learning to determine which cutting planes to apply during branch-and-cut.
arXiv Detail & Related papers (2022-04-15T03:32:40Z) - Towards Bi-directional Skip Connections in Encoder-Decoder Architectures
and Beyond [95.46272735589648]
We propose backward skip connections that bring decoded features back to the encoder.
Our design can be jointly adopted with forward skip connections in any encoder-decoder architecture.
We propose a novel two-phase Neural Architecture Search (NAS) algorithm, namely BiX-NAS, to search for the best multi-scale skip connections.
arXiv Detail & Related papers (2022-03-11T01:38:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.