Global Meets Local: Effective Multi-Label Image Classification via
Category-Aware Weak Supervision
- URL: http://arxiv.org/abs/2211.12716v1
- Date: Wed, 23 Nov 2022 05:39:17 GMT
- Authors: Jiawei Zhan, Jun Liu, Wei Tang, Guannan Jiang, Xi Wang, Bin-Bin Gao,
Tianliang Zhang, Wenlong Wu, Wei Zhang, Chengjie Wang, Yuan Xie
- Abstract summary: This paper builds a unified framework to perform effective noisy-proposal suppression.
We develop a cross-granularity attention module to explore the complementary information between global and local features.
Our framework achieves superior performance over state-of-the-art methods.
- Score: 37.761378069277676
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-label image classification, which can be categorized into
label-dependency and region-based methods, is a challenging problem due to the
complex underlying object layouts. Although region-based methods are less
likely to encounter issues with model generalizability than label-dependency
methods, they often generate hundreds of meaningless or noisy proposals with
non-discriminative information, and the contextual dependency among the
localized regions is often ignored or over-simplified. This paper builds a
unified framework to perform effective noisy-proposal suppression and to
interact between global and local features for robust feature learning.
Specifically, we propose category-aware weak supervision to concentrate on
non-existent categories so as to provide deterministic information for local
feature learning, restricting the local branch to focus on more high-quality
regions of interest. Moreover, we develop a cross-granularity attention module
to explore the complementary information between global and local features,
which can build the high-order feature correlation containing not only
global-to-local, but also local-to-local relations. Both advantages guarantee a
boost in the performance of the whole network. Extensive experiments on two
large-scale datasets (MS-COCO and VOC 2007) demonstrate that our framework
achieves superior performance over state-of-the-art methods.
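The abstract describes attention that relates the global feature to local region features and the local features to one another. The following is a minimal sketch of that general idea, not the paper's module: it applies plain scaled dot-product self-attention over the concatenation of one global token and several local tokens, so the attention matrix contains both global-to-local and local-to-local weights. The function name `joint_attention`, the feature sizes, and the absence of learned projections are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def joint_attention(global_feat, local_feats):
    """Hypothetical helper: self-attention over [global; locals] tokens.

    global_feat: array of shape (d,), one image-level feature.
    local_feats: array of shape (n, d), one feature per region proposal.
    Returns the attended tokens (n + 1, d) and the weights (n + 1, n + 1).
    Row 0 of the weights holds global-to-local relations; the remaining
    rows hold local-to-global and local-to-local relations.
    """
    tokens = np.vstack([global_feat[None, :], local_feats])  # (n + 1, d)
    d = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(d)  # pairwise similarities
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ tokens, weights


# Toy inputs (assumed sizes): 5 region features of dimension 8.
rng = np.random.default_rng(0)
global_feat = rng.normal(size=8)
local_feats = rng.normal(size=(5, 8))

out, weights = joint_attention(global_feat, local_feats)
print(out.shape, weights.shape)  # prints (6, 8) (6, 6)
```

A real model would add learned query, key, and value projections and would typically use multiple heads; they are left out here to keep the relation structure visible.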
Related papers
- Adaptive Global-Local Representation Learning and Selection for
Cross-Domain Facial Expression Recognition [54.334773598942775]
Domain shift poses a significant challenge in Cross-Domain Facial Expression Recognition (CD-FER).
We propose an Adaptive Global-Local Representation Learning and Selection framework.
arXiv Detail & Related papers (2024-01-20T02:21:41Z)
- A Task-aware Dual Similarity Network for Fine-grained Few-shot Learning [19.90385022248391]
A Task-aware Dual Similarity Network (TDSNet) is proposed to explore global invariant features and discriminative local details.
TDSNet achieves competitive performance compared with other state-of-the-art algorithms.
arXiv Detail & Related papers (2022-10-22T04:24:55Z)
- S2RL: Do We Really Need to Perceive All States in Deep Multi-Agent
Reinforcement Learning? [26.265100805551764]
Collaborative multi-agent reinforcement learning (MARL) has been widely used in many practical applications.
We propose a sparse state based MARL framework, which utilizes a sparse attention mechanism to discard irrelevant information in local observations.
arXiv Detail & Related papers (2022-06-20T07:33:40Z)
- Local-Global Associative Frame Assemble in Video Re-ID [57.7470971197962]
Noisy and unrepresentative frames in automatically generated object bounding boxes from video sequences cause challenges in learning discriminative representations in video re-identification (Re-ID).
Most existing methods tackle this problem by assessing the importance of video frames according to either their local part alignments or global appearance correlations separately.
In this work, we jointly explore both local alignments and global correlations, with further consideration of their mutual promotion/reinforcement.
arXiv Detail & Related papers (2021-10-22T19:07:39Z)
- Discriminative Region-based Multi-Label Zero-Shot Learning [145.0952336375342]
Multi-label zero-shot learning (ZSL) is a more realistic counterpart of standard single-label ZSL.
We propose an alternate approach towards region-based discriminability-preserving ZSL.
arXiv Detail & Related papers (2021-08-20T17:56:47Z)
- Re-rank Coarse Classification with Local Region Enhanced Features for
Fine-Grained Image Recognition [22.83821575990778]
We re-rank the TopN classification results by using the local region enhanced embedding features to improve the Top1 accuracy.
To learn more effective semantic global features, we design a multi-level loss over an automatically constructed hierarchical category structure.
Our method achieves state-of-the-art performance on three benchmarks: CUB-200-2011, Stanford Cars, and FGVC Aircraft.
arXiv Detail & Related papers (2021-02-19T11:30:25Z)
- Gait Recognition via Effective Global-Local Feature Representation and
Local Temporal Aggregation [28.721376937882958]
Gait recognition is one of the most important biometric technologies and has been applied in many fields.
Recent gait recognition frameworks represent each gait frame by descriptors extracted from either global appearances or local regions of humans.
We propose a novel feature extraction and fusion framework to achieve discriminative feature representations for gait recognition.
arXiv Detail & Related papers (2020-11-03T04:07:13Z)
- Inter-Image Communication for Weakly Supervised Localization [77.2171924626778]
Weakly supervised localization aims at finding target object regions using only image-level supervision.
We propose to leverage pixel-level similarities across different objects for learning more accurate object locations.
Our method achieves the Top-1 localization error rate of 45.17% on the ILSVRC validation set.
arXiv Detail & Related papers (2020-08-12T04:14:11Z)
- Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.