Disentangled Non-Local Neural Networks
- URL: http://arxiv.org/abs/2006.06668v2
- Date: Tue, 8 Sep 2020 14:12:09 GMT
- Title: Disentangled Non-Local Neural Networks
- Authors: Minghao Yin, Zhuliang Yao, Yue Cao, Xiu Li, Zheng Zhang, Stephen Lin, Han Hu
- Abstract summary: We study the non-local block in depth, where we find that its attention can be split into two terms.
We present the disentangled non-local block, where the two terms are decoupled to facilitate learning for both terms.
- Score: 68.92293183542131
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The non-local block is a popular module for strengthening the context
modeling ability of a regular convolutional neural network. This paper first
studies the non-local block in depth, where we find that its attention
computation can be split into two terms, a whitened pairwise term accounting
for the relationship between two pixels and a unary term representing the
saliency of every pixel. We also observe that the two terms trained alone tend
to model different visual clues, e.g. the whitened pairwise term learns
within-region relationships while the unary term learns salient boundaries.
However, the two terms are tightly coupled in the non-local block, which
hinders the learning of each. Based on these findings, we present the
disentangled non-local block, where the two terms are decoupled to facilitate
learning for both terms. We demonstrate the effectiveness of the decoupled
design on various tasks, such as semantic segmentation on Cityscapes, ADE20K
and PASCAL Context, object detection on COCO, and action recognition on
Kinetics.
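The split described in the abstract can be illustrated with a small NumPy sketch. It assumes the non-local block uses softmax-normalized dot-product attention between query and key embeddings; the abstract does not give the formula, so the function names, shapes, and the exact form of the decoupled variant below are illustrative assumptions rather than the authors' implementation. Subtracting the mean query and mean key leaves a whitened pairwise term plus a unary term that depends only on the key pixel; the remaining terms are constant for each query and cancel inside the softmax.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def nonlocal_attention(q, k):
    # Assumed standard form: attention[i, j] = softmax_j(q_i . k_j).
    return softmax(q @ k.T, axis=-1)

def split_attention(q, k):
    # Same attention, rewritten as pairwise + unary logits.
    mu_q = q.mean(axis=0, keepdims=True)   # (1, C) mean query
    mu_k = k.mean(axis=0, keepdims=True)   # (1, C) mean key
    pairwise = (q - mu_q) @ (k - mu_k).T   # (N, N) whitened pairwise term
    unary = mu_q @ k.T                     # (1, N) depends only on pixel j
    # q_i . mu_k and mu_q . mu_k are constant over j, so softmax drops them.
    return softmax(pairwise + unary, axis=-1)

def disentangled_attention(q, k, unary_logits):
    # Hypothetical decoupled variant: each term gets its own softmax and
    # the unary logits come from a separate source (assumed shape (N,)).
    mu_q = q.mean(axis=0, keepdims=True)
    mu_k = k.mean(axis=0, keepdims=True)
    pairwise = (q - mu_q) @ (k - mu_k).T
    return softmax(pairwise, axis=-1) + softmax(unary_logits, axis=-1)

rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))   # 6 pixels, 4 channels (toy sizes)
k = rng.normal(size=(6, 4))
same = np.allclose(nonlocal_attention(q, k), split_attention(q, k))
```

In the coupled form the two terms share one softmax, so a change in either set of logits rescales the other's contribution; in the decoupled sketch each term is normalized on its own, which is one way to read "decoupled to facilitate learning for both terms".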
Related papers
- N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields [112.02885337510716]
Nested Neural Feature Fields (N2F2) is a novel approach that employs hierarchical supervision to learn a single feature field.
We leverage a 2D class-agnostic segmentation model to provide semantically meaningful pixel groupings at arbitrary scales in the image space.
Our approach outperforms the state-of-the-art feature field distillation methods on tasks such as open-vocabulary 3D segmentation and localization.
arXiv Detail & Related papers (2024-03-16T18:50:44Z)
- Dcl-Net: Dual Contrastive Learning Network for Semi-Supervised Multi-Organ Segmentation [12.798684146496754]
We propose a two-stage Dual Contrastive Learning Network for semi-supervised MoS.
In Stage 1, we develop a similarity-guided global contrastive learning to explore the implicit continuity and similarity among images.
In Stage 2, we present an organ-aware local contrastive learning to further attract the class representations.
arXiv Detail & Related papers (2024-03-06T07:39:33Z)
- BCLNet: Bilateral Consensus Learning for Two-View Correspondence Pruning [26.400567961735234]
Correspondence pruning aims to establish reliable correspondences between two related images.
Existing approaches often employ a progressive strategy to handle the local and global contexts.
We propose a parallel context learning strategy that involves acquiring bilateral consensus for the two-view correspondence pruning task.
arXiv Detail & Related papers (2024-01-07T11:38:15Z)
- Semantic Connectivity-Driven Pseudo-labeling for Cross-domain Segmentation [89.41179071022121]
Self-training is a prevailing approach in cross-domain semantic segmentation.
We propose a novel approach called Semantic Connectivity-driven pseudo-labeling.
This approach formulates pseudo-labels at the connectivity level and thus can facilitate learning structured and low-noise semantics.
arXiv Detail & Related papers (2023-12-11T12:29:51Z)
- Associating Spatially-Consistent Grouping with Text-supervised Semantic Segmentation [117.36746226803993]
We introduce self-supervised spatially-consistent grouping with text-supervised semantic segmentation.
Considering the part-like grouped results, we further adapt a text-supervised model from image-level to region-level recognition.
Our method achieves 59.2% and 32.4% mIoU on the Pascal VOC and Pascal Context benchmarks, respectively.
arXiv Detail & Related papers (2023-04-03T16:24:39Z)
- Denoised Non-Local Neural Network for Semantic Segmentation [18.84185406522064]
We propose a Denoised Non-Local Network (Denoised NL) to eliminate the inter-class and intra-class noises respectively.
Our proposed NL can achieve the state-of-the-art performance of 83.5% and 46.69% mIoU on Cityscapes and ADE20K, respectively.
arXiv Detail & Related papers (2021-10-27T06:16:31Z)
- Kinship Verification Based on Cross-Generation Feature Interaction Learning [53.62256887837659]
Kinship verification from facial images has been recognized as an emerging yet challenging technique in computer vision applications.
We propose a novel cross-generation feature interaction learning (CFIL) framework for robust kinship verification.
arXiv Detail & Related papers (2021-09-07T01:50:50Z)
- Unifying Nonlocal Blocks for Neural Networks [43.107708207022526]
Nonlocal-based blocks are designed for capturing long-range spatial-temporal dependencies in computer vision tasks.
We provide a new perspective to interpret them, where we view them as a set of graph filters generated on a fully-connected graph.
We propose an efficient spectral nonlocal block that is more robust and flexible in capturing long-range dependencies.
arXiv Detail & Related papers (2021-08-05T08:34:12Z)
- Weakly-Supervised Semantic Segmentation by Iterative Affinity Learning [86.45526827323954]
Weakly-supervised semantic segmentation is a challenging task as no pixel-wise label information is provided for training.
We propose an iterative algorithm to learn such pairwise relations.
We show that the proposed algorithm performs favorably against the state-of-the-art methods.
arXiv Detail & Related papers (2020-02-19T10:32:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.