A Closer Look at Branch Classifiers of Multi-exit Architectures
- URL: http://arxiv.org/abs/2204.13347v1
- Date: Thu, 28 Apr 2022 08:37:25 GMT
- Title: A Closer Look at Branch Classifiers of Multi-exit Architectures
- Authors: Shaohui Lin, Bo Ji, Rongrong Ji, Angela Yao
- Abstract summary: Constant-complexity branching keeps all branches the same, while complexity-increasing and complexity-decreasing branching place more complex branches later or earlier in the backbone respectively.
We investigate a cause by using knowledge consistency to probe the effect of adding branches onto a backbone.
Our findings show that complexity-decreasing branching yields the least disruption to the feature abstraction hierarchy of the backbone.
- Score: 103.27533521196817
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-exit architectures consist of a backbone and branch classifiers that
offer shortened inference pathways to reduce the run-time of deep neural
networks. In this paper, we analyze different branching patterns that vary in
their allocation of computational complexity for the branch classifiers.
Constant-complexity branching keeps all branches the same, while
complexity-increasing and complexity-decreasing branching place more complex
branches later or earlier in the backbone respectively. Through extensive
experimentation on multiple backbones and datasets, we find that
complexity-decreasing branches are more effective than constant-complexity or
complexity-increasing branches, achieving the best accuracy-cost trade-off.
We investigate a cause by using knowledge consistency to probe the effect of
adding branches onto a backbone. Our findings show that complexity-decreasing
branching yields the least disruption to the feature abstraction hierarchy of
the backbone, which explains the effectiveness of the branching patterns.
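The complexity-decreasing pattern and the early-exit mechanism described in the abstract can be illustrated with a toy sketch: each backbone stage feeds a branch classifier, branch depth shrinks with stage depth (more complex branches earlier, per the complexity-decreasing pattern), and inference stops at the first branch whose confidence clears a threshold. All names, the dummy stages, and the confidence-threshold exit rule are illustrative assumptions, not the paper's actual implementation.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

class MultiExitNet:
    """Toy multi-exit model: a backbone of stages, each followed by a
    branch classifier (hypothetical sketch, not the paper's code)."""

    def __init__(self, stages, branches):
        self.stages = stages      # backbone feature extractors
        self.branches = branches  # per-stage classifier heads

    def infer(self, x, threshold=0.8):
        """Run stage by stage; exit at the first branch whose top
        confidence reaches `threshold`. Returns (probs, exit_index)."""
        feat = x
        for i, (stage, branch) in enumerate(zip(self.stages, self.branches)):
            feat = stage(feat)
            probs = softmax(branch(feat))
            if max(probs) >= threshold or i == len(self.stages) - 1:
                return probs, i

# Toy instantiation: branch "complexity" is just the number of chained
# transforms, and it decreases with backbone depth (3, 2, 1), mirroring
# the complexity-decreasing branching pattern.
stages = [lambda v: [u * 2 for u in v]] * 3  # dummy backbone stages

def make_branch(depth):
    def head(feat):
        out = feat
        for _ in range(depth):   # deeper head = more "complex" branch
            out = [u + 1 for u in out]
        return out[:2]           # two-class logits
    return head

branches = [make_branch(d) for d in (3, 2, 1)]

net = MultiExitNet(stages, branches)
probs, exit_idx = net.infer([2.0, 0.0], threshold=0.8)
# A confident input exits at the first (most complex) branch,
# skipping the rest of the backbone and saving compute.
```

The run-time saving comes from the control flow in `infer`: later stages and branches are simply never evaluated when an early branch is already confident, which is the shortened inference pathway the abstract refers to.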
Related papers
- Low-Cost Self-Ensembles Based on Multi-Branch Transformation and Grouped Convolution [20.103367702014474]
We propose a new low-cost ensemble learning to achieve high efficiency and classification performance.
For training, we employ knowledge distillation using the ensemble of the outputs as the teacher signal.
Experimental results show that our method achieves state-of-the-art classification accuracy and higher uncertainty estimation performance.
arXiv Detail & Related papers (2024-08-05T08:36:13Z) - TreeDQN: Learning to minimize Branch-and-Bound tree [78.52895577861327]
Branch-and-Bound is a convenient approach to solving optimization tasks in the form of Mixed Integer Linear Programs.
The efficiency of the solver depends on the branching rule used to select a variable for splitting.
We propose a reinforcement learning method that can efficiently learn the branching.
arXiv Detail & Related papers (2023-06-09T14:01:26Z) - Improving Out-of-Distribution Generalization of Neural Rerankers with
Contextualized Late Interaction [52.63663547523033]
Late interaction, the simplest form of multi-vector, is also helpful to neural rerankers that only use the [] vector to compute the similarity score.
We show that the finding is consistent across different model sizes and first-stage retrievers of diverse natures.
arXiv Detail & Related papers (2023-02-13T18:42:17Z) - Reinforcement Learning for Branch-and-Bound Optimisation using
Retrospective Trajectories [72.15369769265398]
Machine learning has emerged as a promising paradigm for branching.
We propose retro branching; a simple yet effective approach to RL for branching.
We outperform the current state-of-the-art RL branching algorithm by 3-5x and come within 20% of the best IL method's performance on MILPs with 500 constraints and 1000 variables.
arXiv Detail & Related papers (2022-05-28T06:08:07Z) - Learning to branch with Tree MDPs [6.754135838894833]
We propose to learn branching rules from scratch via Reinforcement Learning (RL).
We propose tree Markov Decision Processes, or tree MDPs, a generalization of temporal MDPs that provides a more suitable framework for learning to branch.
We demonstrate through computational experiments that tree MDPs improve the learning convergence, and offer a promising framework for tackling the learning-to-branch problem in MILPs.
arXiv Detail & Related papers (2022-05-23T07:57:32Z) - Structural Analysis of Branch-and-Cut and the Learnability of Gomory
Mixed Integer Cuts [88.94020638263467]
The incorporation of cutting planes within the branch-and-bound algorithm, known as branch-and-cut, forms the backbone of modern integer programming solvers.
We conduct a novel structural analysis of branch-and-cut that pins down how every step of the algorithm is affected by changes in the parameters defining the cutting planes added to the input integer program.
Our main application of this analysis is to derive sample complexity guarantees for using machine learning to determine which cutting planes to apply during branch-and-cut.
arXiv Detail & Related papers (2022-04-15T03:32:40Z) - Towards Bi-directional Skip Connections in Encoder-Decoder Architectures
and Beyond [95.46272735589648]
We propose backward skip connections that bring decoded features back to the encoder.
Our design can be jointly adopted with forward skip connections in any encoder-decoder architecture.
We propose a novel two-phase Neural Architecture Search (NAS) algorithm, namely BiX-NAS, to search for the best multi-scale skip connections.
arXiv Detail & Related papers (2022-03-11T01:38:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.