NOAH: Learning Pairwise Object Category Attentions for Image
Classification
- URL: http://arxiv.org/abs/2402.02377v1
- Date: Sun, 4 Feb 2024 07:19:40 GMT
- Title: NOAH: Learning Pairwise Object Category Attentions for Image
Classification
- Authors: Chao Li, Aojun Zhou, Anbang Yao
- Abstract summary: Non-glObal Attentive Head (NOAH) is a new head design built on a form of dot-product attention called pairwise object category attention (POCA)
As a drop-in design, NOAH can be easily used to replace existing heads of various types of DNNs.
- Score: 26.077836657775403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A modern deep neural network (DNN) for image classification tasks typically
consists of two parts: a backbone for feature extraction, and a head for
feature encoding and class prediction. We observe that the head structures of
mainstream DNNs adopt a similar feature encoding pipeline, exploiting global
feature dependencies while disregarding local ones. In this paper, we revisit
the feature encoding problem, and propose Non-glObal Attentive Head (NOAH) that
relies on a new form of dot-product attention called pairwise object category
attention (POCA), efficiently exploiting spatially dense category-specific
attentions to augment classification performance. NOAH introduces a neat
combination of feature split, transform and merge operations to learn POCAs at
local to global scales. As a drop-in design, NOAH can be easily used to replace
existing heads of various types of DNNs, improving classification performance
while maintaining similar model efficiency. We validate the effectiveness of
NOAH on ImageNet classification benchmark with 25 DNN architectures spanning
convolutional neural networks, vision transformers and multi-layer perceptrons.
In general, NOAH is able to significantly improve the performance of
lightweight DNNs, e.g., showing 3.14%|5.3%|1.9% top-1 accuracy improvements
for MobileNetV2 (0.5x)|DeiT-Tiny (0.5x)|gMLP-Tiny (0.5x). NOAH also generalizes
well when applied to medium-size and large-size DNNs. We further show that NOAH
retains its efficacy on other popular multi-class and multi-label image
classification benchmarks as well as in different training regimes, e.g.,
showing 3.6%|1.1% mAP improvements for the large ResNet101|ViT-Large models on the MS-COCO
dataset. Project page: https://github.com/OSVAI/NOAH.
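The paper defines POCA precisely and releases reference code at the project page; the PyTorch sketch below is only a rough illustration of the head-level idea: split the backbone feature map into channel groups, transform each group into spatially dense, category-specific attention maps that reweight per-category evidence, and merge the per-group predictions into the final logits. All module and variable names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class POCAHeadSketch(nn.Module):
    """Illustrative sketch of a NOAH-style head (not the official implementation).

    The backbone feature map is split into channel groups; each group produces
    per-category attention logits and per-category evidence, and their spatially
    weighted combination gives group-wise class scores that are merged by averaging.
    """

    def __init__(self, in_channels: int, num_classes: int, num_groups: int = 4):
        super().__init__()
        assert in_channels % num_groups == 0
        self.num_groups = num_groups
        group_channels = in_channels // num_groups
        # One pair of 1x1 convs per group: one branch for attention logits,
        # one branch for class evidence, both with num_classes output maps.
        self.att_convs = nn.ModuleList(
            [nn.Conv2d(group_channels, num_classes, kernel_size=1) for _ in range(num_groups)]
        )
        self.val_convs = nn.ModuleList(
            [nn.Conv2d(group_channels, num_classes, kernel_size=1) for _ in range(num_groups)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) backbone feature map.
        groups = torch.chunk(x, self.num_groups, dim=1)           # split
        logits = []
        for g, att_conv, val_conv in zip(groups, self.att_convs, self.val_convs):
            att = att_conv(g).flatten(2).softmax(dim=-1)           # (B, K, H*W) spatial attention
            val = val_conv(g).flatten(2)                           # (B, K, H*W) class evidence
            logits.append((att * val).sum(dim=-1))                 # (B, K)  transform
        return torch.stack(logits, dim=0).mean(dim=0)              # merge

# Example: replace a global-average-pooling + linear head of a CNN.
head = POCAHeadSketch(in_channels=1280, num_classes=1000, num_groups=4)
scores = head(torch.randn(2, 1280, 7, 7))    # (2, 1000)
```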
Related papers
- DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects [48.65846477275723]
This study proposes novel dual cross-current neural networks (DCNN) to improve the accuracy of fine-grained image classification.
The main design features of the weakly supervised DCNN backbone are (a) extracting heterogeneous data, (b) keeping the feature map resolution unchanged, (c) expanding the receptive field, and (d) fusing global representations with local features.
arXiv Detail & Related papers (2024-05-07T07:51:28Z)
- Deep Neural Network Models Trained With A Fixed Random Classifier Transfer Better Across Domains [23.10912424714101]
The recently discovered neural collapse (NC) phenomenon states that the last-layer weights of deep neural networks converge to the so-called equiangular tight frame (ETF) simplex at the terminal phase of training.
Inspired by NC properties, we explore in this paper the transferability of DNN models trained with their last-layer weights fixed according to an ETF.
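As a rough illustration of what "last-layer weights fixed according to an ETF" can mean in code, the sketch below builds a simplex equiangular tight frame and registers it as a frozen classifier; the construction follows the standard simplex-ETF definition, while the class and function names are assumptions rather than the paper's recipe.

```python
import torch
import torch.nn as nn

def simplex_etf(num_classes: int, feat_dim: int) -> torch.Tensor:
    """Return a (num_classes, feat_dim) simplex-ETF weight matrix.

    Requires feat_dim >= num_classes; rows are equal-norm vectors with
    identical pairwise angles.
    """
    assert feat_dim >= num_classes
    K = num_classes
    U = torch.linalg.qr(torch.randn(feat_dim, K))[0]          # (d, K) orthonormal columns
    M = U @ (torch.eye(K) - torch.ones(K, K) / K)             # center the frame
    return ((K / (K - 1)) ** 0.5 * M).t()                     # standard scaling, (K, d)

class FixedETFClassifier(nn.Module):
    """Linear classifier whose weights are a frozen simplex ETF (never trained)."""

    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        # register_buffer keeps the weights out of the optimizer entirely.
        self.register_buffer("weight", simplex_etf(num_classes, feat_dim))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return features @ self.weight.t()                     # (B, K) logits

# Only the backbone producing the features would be trained; the classifier stays fixed.
clf = FixedETFClassifier(feat_dim=512, num_classes=100)
logits = clf(torch.randn(4, 512))
```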
arXiv Detail & Related papers (2024-02-28T15:52:30Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
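For context, gradient flow is the continuous-time limit of gradient descent, and a linear rate means exponential decay of the loss over time; written out, with a generic constant $c$ standing in for the paper's problem-dependent rate:

```latex
% Gradient flow (GF) on the training loss L(\theta):
\dot{\theta}(t) = -\nabla_{\theta} L\bigl(\theta(t)\bigr),
\qquad
L\bigl(\theta(t)\bigr) \le e^{-c\,t}\, L\bigl(\theta(0)\bigr)
\quad \text{for some constant } c > 0.
```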
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between a test image and its top-k neighbors in the training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
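A minimal sketch of this two-step recipe, with an ImageNet ResNet-50 standing in for the supervised or self-supervised encoders studied in the paper (the backbone choice, the value of k, and the synthetic data are assumptions for illustration):

```python
import torch
import torchvision
from sklearn.neighbors import KNeighborsClassifier

# Step 1: extract features with a frozen pre-trained backbone
# (classification layer removed; the paper's exact encoders may differ).
backbone = torchvision.models.resnet50(weights="IMAGENET1K_V2")
backbone.fc = torch.nn.Identity()
backbone.eval()

@torch.no_grad()
def extract(images: torch.Tensor) -> torch.Tensor:
    feats = backbone(images)                                   # (N, 2048)
    return torch.nn.functional.normalize(feats, dim=1)

# Step 2: classify a test image by aggregating its top-k training neighbors.
train_images = torch.randn(100, 3, 224, 224)                  # synthetic stand-in data
train_labels = torch.randint(0, 10, (100,))
test_images = torch.randn(5, 3, 224, 224)

knn = KNeighborsClassifier(n_neighbors=20, metric="cosine")
knn.fit(extract(train_images).numpy(), train_labels.numpy())
pred = knn.predict(extract(test_images).numpy())
```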
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- Hybrid Graph Neural Networks for Few-Shot Learning [85.93495480949079]
Graph neural networks (GNNs) have been used to tackle the few-shot learning problem.
Under the inductive setting, existing GNN-based methods are less competitive.
We propose a novel hybrid GNN model consisting of two GNNs, an instance GNN and a prototype GNN.
arXiv Detail & Related papers (2021-12-13T10:20:15Z)
- From Stars to Subgraphs: Uplifting Any GNN with Local Structure Awareness [23.279464786779787]
We introduce a general framework for uplifting any message-passing neural network (MPNN) to be more expressive.
The framework is strictly more powerful than the 1- and 2-WL tests, and no less powerful than 3-WL.
Our method sets new state-of-the-art performance by large margins for several well-known graph ML tasks.
arXiv Detail & Related papers (2021-10-07T19:08:08Z)
- Calibrating Class Activation Maps for Long-Tailed Visual Recognition [60.77124328049557]
We present two effective modifications of CNNs to improve network learning from long-tailed distributions.
First, we present a Class Activation Map Calibration (CAMC) module to improve the learning and prediction of network classifiers.
Second, we investigate the use of normalized classifiers for representation learning in long-tailed problems.
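As an illustration of the normalized-classifier idea (a common cosine-classifier formulation; the paper's exact variant may differ), logits are computed from L2-normalized features and L2-normalized class weights, so head-class weight norms cannot dominate tail classes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedClassifier(nn.Module):
    """Cosine-style classifier: both features and class weights are L2-normalized."""

    def __init__(self, feat_dim: int, num_classes: int, scale: float = 16.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim) * 0.01)
        self.scale = scale    # temperature on the cosine similarities; a tunable assumption

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        w = F.normalize(self.weight, dim=1)
        f = F.normalize(features, dim=1)
        return self.scale * f @ w.t()    # logits depend only on angles, not weight norms

logits = NormalizedClassifier(feat_dim=512, num_classes=1000)(torch.randn(8, 512))
```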
arXiv Detail & Related papers (2021-08-29T05:45:03Z)
- Strengthening the Training of Convolutional Neural Networks By Using Walsh Matrix [0.0]
We modify the training and structure of DNNs to increase classification performance.
A minimum distance network (MDN) following the last layer of the convolutional neural network (CNN) is used as the classifier.
Across different application areas, higher classification performance was obtained using the DivFE with fewer nodes.
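One plausible reading of this design, stated here as an assumption since the summary does not spell out the details, is that each class is assigned a row of a Walsh/Hadamard matrix as its target vector, and the minimum distance network picks the class whose target code is closest to the network output:

```python
import numpy as np
from scipy.linalg import hadamard

num_classes, code_dim = 10, 16                    # code_dim must be a power of two
walsh = hadamard(code_dim).astype(np.float32)     # rows are orthogonal +/-1 codes
class_codes = walsh[:num_classes]                 # one target code per class

def min_distance_predict(outputs: np.ndarray) -> np.ndarray:
    """Classify each network output vector by its nearest class code (Euclidean)."""
    # outputs: (N, code_dim) activations of the layer preceding the classifier.
    dists = np.linalg.norm(outputs[:, None, :] - class_codes[None, :, :], axis=-1)
    return dists.argmin(axis=1)

preds = min_distance_predict(np.random.randn(4, code_dim).astype(np.float32))
```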
arXiv Detail & Related papers (2021-03-31T18:06:11Z)
- Patch Based Classification of Remote Sensing Data: A Comparison of 2D-CNN, SVM and NN Classifiers [0.0]
We compare the performance of patch-based SVM and NN classifiers with that of a deep learning algorithm comprising a 2D-CNN and fully connected layers.
Results on both datasets suggest the effectiveness of the patch-based SVM and NN.
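A minimal sketch of a patch-based SVM pipeline of this kind (patch size, band count, and the synthetic scene are assumptions for illustration):

```python
import numpy as np
from sklearn.svm import SVC

def extract_patches(image: np.ndarray, labels: np.ndarray, size: int = 5):
    """Cut size x size patches around each labelled pixel of a (H, W, bands) image."""
    r = size // 2
    X, y = [], []
    for i in range(r, image.shape[0] - r):
        for j in range(r, image.shape[1] - r):
            X.append(image[i - r:i + r + 1, j - r:j + r + 1].ravel())
            y.append(labels[i, j])
    return np.array(X), np.array(y)

# Synthetic stand-in for a remote-sensing scene with 4 spectral bands.
scene = np.random.rand(64, 64, 4).astype(np.float32)
ground_truth = np.random.randint(0, 3, size=(64, 64))

X, y = extract_patches(scene, ground_truth)
svm = SVC(kernel="rbf").fit(X[:2000], y[:2000])     # train on a subset of patches
accuracy = svm.score(X[2000:3000], y[2000:3000])    # evaluate on held-out patches
```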
arXiv Detail & Related papers (2020-06-21T11:07:37Z)
- Towards Deeper Graph Neural Networks with Differentiable Group Normalization [61.20639338417576]
Graph neural networks (GNNs) learn the representation of a node by aggregating its neighbors.
Over-smoothing is one of the key issues which limit the performance of GNNs as the number of layers increases.
We introduce two over-smoothing metrics and a novel technique, differentiable group normalization (DGN).
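A hedged sketch of the DGN idea, simplified relative to the paper: nodes are softly assigned to groups by a learned linear map, each group's embeddings are normalized separately, and the normalized terms are added back to the input with a small skip weight (all names and hyperparameters below are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentiableGroupNormSketch(nn.Module):
    """Softly cluster node embeddings into groups, then normalize group by group."""

    def __init__(self, hidden_dim: int, num_groups: int, skip_weight: float = 0.005):
        super().__init__()
        self.assign = nn.Linear(hidden_dim, num_groups)          # soft cluster assignment
        self.norms = nn.ModuleList(
            [nn.BatchNorm1d(hidden_dim) for _ in range(num_groups)]
        )
        self.skip_weight = skip_weight

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (num_nodes, hidden_dim) node embeddings from a GNN layer.
        s = F.softmax(self.assign(h), dim=1)                      # (N, G) memberships
        out = h.clone()
        for g, norm in enumerate(self.norms):
            # Weight each node's embedding by its membership in group g,
            # normalize within the group, and add back as a residual term.
            out = out + self.skip_weight * norm(s[:, g:g + 1] * h)
        return out

normed = DifferentiableGroupNormSketch(hidden_dim=64, num_groups=4)(torch.randn(100, 64))
```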
arXiv Detail & Related papers (2020-06-12T07:18:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.