Dynamic Perceiver for Efficient Visual Recognition
- URL: http://arxiv.org/abs/2306.11248v2
- Date: Sun, 13 Aug 2023 05:44:54 GMT
- Title: Dynamic Perceiver for Efficient Visual Recognition
- Authors: Yizeng Han, Dongchen Han, Zeyu Liu, Yulin Wang, Xuran Pan, Yifan Pu,
Chao Deng, Junlan Feng, Shiji Song, Gao Huang
- Abstract summary: We propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task.
A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks.
Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.
- Score: 87.08210214417309
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Early exiting has become a promising approach to improving the inference
efficiency of deep networks. By structuring models with multiple classifiers
(exits), predictions for "easy" samples can be generated at earlier exits,
negating the need for executing deeper layers. Current multi-exit networks
typically implement linear classifiers at intermediate layers, compelling
low-level features to encapsulate high-level semantics. This sub-optimal design
invariably undermines the performance of later exits. In this paper, we propose
Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure
and the early classification task with a novel dual-branch architecture. A
feature branch serves to extract image features, while a classification branch
processes a latent code assigned for classification tasks. Bi-directional
cross-attention layers are established to progressively fuse the information of
both branches. Early exits are placed exclusively within the classification
branch, thus eliminating the need for linear separability in low-level
features. Dyn-Perceiver constitutes a versatile and adaptable framework that
can be built upon various architectures. Experiments on image classification,
action recognition, and object detection demonstrate that our method
significantly improves the inference efficiency of different backbones,
outperforming numerous competitive approaches across a broad range of
computational budgets. Evaluations on both CPU and GPU platforms substantiate
the superior practical efficiency of Dyn-Perceiver. Code is available at
https://www.github.com/LeapLabTHU/Dynamic_Perceiver.
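
The early-exit inference described in the abstract can be illustrated with a minimal sketch. It assumes a max-softmax confidence criterion with a single fixed threshold, which the abstract does not specify; the names `softmax`, `early_exit_predict`, and `threshold` are hypothetical and are not taken from the Dyn-Perceiver code base. For simplicity each exit is an independent callable, whereas in a real multi-exit network later exits would reuse the computation of earlier stages.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    exits: Sequence[Callable[[object], Sequence[float]]],
    x: object,
    threshold: float = 0.9,
) -> Tuple[int, int, float]:
    """Return (predicted_label, exit_index, confidence).

    `exits` is ordered from shallowest to deepest. Inference stops at the
    first exit whose top softmax probability reaches `threshold` (an assumed
    criterion); the last exit always produces an answer.
    """
    if not exits:
        raise ValueError("at least one exit is required")
    for i, exit_fn in enumerate(exits):
        probs = softmax(exit_fn(x))
        confidence = max(probs)
        label = probs.index(confidence)
        if confidence >= threshold or i == len(exits) - 1:
            return label, i, confidence
    raise AssertionError("unreachable: the last exit always returns")


# Toy exits: the first is uncertain, the second is confident about class 0.
toy_exits = [lambda x: [0.0, 0.1], lambda x: [5.0, 0.0]]
label, exit_index, confidence = early_exit_predict(toy_exits, None, threshold=0.9)
```

With the toy exits above, the first exit falls below the 0.9 threshold, so the sample is passed on and answered by the second exit. Lowering the threshold trades accuracy for computation by letting more samples leave at shallow exits.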
Related papers
- Automated Sizing and Training of Efficient Deep Autoencoders using
Second Order Algorithms [0.46040036610482665]
We propose a multi-step training method for generalized linear classifiers.
Validation error is minimized by pruning unnecessary inputs.
Desired outputs are improved via a method similar to the Ho-Kashyap rule.
arXiv Detail & Related papers (2023-08-11T16:48:31Z)
- Correlation-Aware Deep Tracking [83.51092789908677]
We propose a novel target-dependent feature network inspired by the self-/cross-attention scheme.
Our network deeply embeds cross-image feature correlation in multiple layers of the feature network.
Our model can be flexibly pre-trained on abundant unpaired images, leading to notably faster convergence than the existing methods.
arXiv Detail & Related papers (2022-03-03T11:53:54Z)
- Towards Disentangling Information Paths with Coded ResNeXt [11.884259630414515]
We take a novel approach to enhance the transparency of the function of the whole network.
We propose a neural network architecture for classification, in which the information that is relevant to each class flows through specific paths.
arXiv Detail & Related papers (2022-02-10T21:45:49Z)
- An evidential classifier based on Dempster-Shafer theory and deep
learning [6.230751621285322]
We propose a new classification system based on Dempster-Shafer (DS) theory and a convolutional neural network (CNN) architecture for set-valued classification.
Experiments on image recognition, signal processing, and semantic-relationship classification tasks demonstrate that the proposed combination of deep CNN, DS layer, and expected utility layer makes it possible to improve classification accuracy.
arXiv Detail & Related papers (2021-03-25T01:29:05Z)
- Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization
Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)