INSIDE: Steering Spatial Attention with Non-Imaging Information in CNNs
- URL: http://arxiv.org/abs/2008.10418v1
- Date: Fri, 21 Aug 2020 13:32:05 GMT
- Title: INSIDE: Steering Spatial Attention with Non-Imaging Information in CNNs
- Authors: Grzegorz Jacenków, Alison Q. O'Neil, Brian Mohr, Sotirios A. Tsaftaris
- Abstract summary: We consider the problem of integrating non-imaging information into segmentation networks to improve performance.
We propose a mechanism to allow for spatial localisation conditioned on non-imaging information.
Our method can be trained end-to-end and does not require additional supervision.
- Score: 14.095546881696311
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of integrating non-imaging information into
segmentation networks to improve performance. Conditioning layers such as FiLM
provide the means to selectively amplify or suppress the contribution of
different feature maps in a linear fashion. However, spatial dependency is
difficult to learn within a convolutional paradigm. In this paper, we propose a
mechanism to allow for spatial localisation conditioned on non-imaging
information, using a feature-wise attention mechanism comprising a
differentiable parametrised function (e.g. Gaussian), prior to applying the
feature-wise modulation. We name our method INstance modulation with SpatIal
DEpendency (INSIDE). The conditioning information might comprise any factors
that relate to spatial or spatio-temporal information such as lesion location,
size, and cardiac cycle phase. Our method can be trained end-to-end and does
not require additional supervision. We evaluate the method on two datasets: a
new CLEVR-Seg dataset where we segment objects based on location, and the ACDC
dataset conditioned on cardiac phase and slice location within the volume. Code
and the CLEVR-Seg dataset are available at https://github.com/jacenkow/inside.
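The mechanism described in the abstract (a Gaussian spatial attention applied per feature map, followed by FiLM-style scale and shift) can be sketched roughly as below. This is a minimal NumPy illustration, not the authors' implementation: the function names, the single linear layer standing in for the conditioning network, the normalised [0, 1] coordinate grid, and the sigmoid/softplus parametrisation are all assumptions made for this example.

```python
import numpy as np


def gaussian_attention(height, width, mu, sigma):
    """Build one separable 2D Gaussian mask per channel.

    mu, sigma: arrays of shape (C, 2) holding (row, col) centres and widths
    in normalised [0, 1] image coordinates (an assumption of this sketch).
    Returns an array of shape (C, H, W) with values in (0, 1].
    """
    ys = np.linspace(0.0, 1.0, height)[None, :, None]  # (1, H, 1)
    xs = np.linspace(0.0, 1.0, width)[None, None, :]   # (1, 1, W)
    mu_y, mu_x = mu[:, 0, None, None], mu[:, 1, None, None]
    s_y, s_x = sigma[:, 0, None, None], sigma[:, 1, None, None]
    return np.exp(-0.5 * (((ys - mu_y) / s_y) ** 2 + ((xs - mu_x) / s_x) ** 2))


def inside_modulate(features, gamma, beta, mu, sigma):
    """Apply spatial attention first, then feature-wise (FiLM) modulation.

    features: (C, H, W); gamma, beta: (C,); mu, sigma: (C, 2).
    """
    _, h, w = features.shape
    attended = features * gaussian_attention(h, w, mu, sigma)
    return gamma[:, None, None] * attended + beta[:, None, None]


def predict_params(cond, weights, channels):
    """Hypothetical conditioning network: one linear map from the
    non-imaging vector `cond` (D,) to all 6*C modulation parameters."""
    raw = cond @ weights  # (6 * C,)
    gamma, beta, mu_raw, sigma_raw = np.split(
        raw, [channels, 2 * channels, 4 * channels]
    )
    mu = (1.0 / (1.0 + np.exp(-mu_raw))).reshape(channels, 2)    # sigmoid -> [0, 1]
    sigma = np.log1p(np.exp(sigma_raw)).reshape(channels, 2) + 1e-3  # softplus > 0
    return gamma, beta, mu, sigma


# Usage with random stand-in values (no trained weights are implied).
rng = np.random.default_rng(0)
channels, cond_dim = 8, 4
features = rng.normal(size=(channels, 16, 16))
cond = np.array([1.0, 0.0, 0.0, 0.0])  # e.g. a one-hot encoded non-imaging factor
weights = rng.normal(size=(cond_dim, 6 * channels))
out = inside_modulate(features, *predict_params(cond, weights, channels))
print(out.shape)
```

Because every step is differentiable with respect to the predicted centres and widths, a layer of this kind could in principle be trained end-to-end from the segmentation loss alone, which matches the abstract's claim that no additional supervision is needed.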
Related papers
- Kriformer: A Novel Spatiotemporal Kriging Approach Based on Graph Transformers [5.4381914710364665]
This study addresses the challenges posed by sparse sensor deployment and unreliable data by framing the problem as an environmental challenge.
A graph transformer model, Kriformer, estimates data at locations without sensors by mining spatial and temporal correlations, even with limited resources.
arXiv Detail & Related papers (2024-09-23T11:01:18Z) - Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z) - A$^{2}$-MAE: A spatial-temporal-spectral unified remote sensing pre-training method based on anchor-aware masked autoencoder [26.81539884309151]
Remote sensing (RS) data provide Earth observations across multiple dimensions, encompassing critical spatial, temporal, and spectral information.
Despite various pre-training methods tailored to the characteristics of RS data, a key limitation persists: the inability to effectively integrate spatial, temporal, and spectral information within a single unified model.
We propose an Anchor-Aware Masked AutoEncoder method (A$^{2}$-MAE), leveraging intrinsic complementary information from the different kinds of images and geo-information to reconstruct the masked patches during the pre-training phase.
arXiv Detail & Related papers (2024-06-12T11:02:15Z) - FATE: Feature-Agnostic Transformer-based Encoder for learning generalized embedding spaces in flow cytometry data [4.550634499956126]
We aim at effectively leveraging data with varying features, without the need to constrain the input space to the intersection of potential feature sets.
We propose a novel architecture that can directly process data without the necessity of aligned feature modalities.
The advantages of the model are demonstrated for automatic cancer cell detection in acute myeloid leukemia in flow cytometry data.
arXiv Detail & Related papers (2023-11-06T18:06:38Z) - Feature Selection using Sparse Adaptive Bottleneck Centroid-Encoder [1.2487990897680423]
We introduce a novel nonlinear model, Sparse Adaptive Bottleneck Centroid-Encoder (SABCE), for determining the features that discriminate between two or more classes.
The algorithm is applied to various real-world data sets, including high-dimensional biological, image, speech, and accelerometer sensor data.
arXiv Detail & Related papers (2023-06-07T21:37:21Z) - Neural FIM for learning Fisher Information Metrics from point cloud data [71.07939200676199]
We propose neural FIM, a method for computing the Fisher information metric (FIM) from point cloud data.
We demonstrate its utility in selecting parameters for the PHATE visualization method, as well as its ability to obtain information about local volume, which illuminates branching points and cluster centers in embeddings of a toy dataset and two single-cell datasets of IPSC reprogramming and PBMCs (immune cells).
arXiv Detail & Related papers (2023-06-01T17:36:13Z) - Fuzzy Attention Neural Network to Tackle Discontinuity in Airway Segmentation [67.19443246236048]
Airway segmentation is crucial for the examination, diagnosis, and prognosis of lung diseases.
Some small-sized airway branches (e.g., bronchus and terminal bronchioles) significantly aggravate the difficulty of automatic segmentation.
This paper presents an efficient method for airway segmentation, comprising a novel fuzzy attention neural network and a comprehensive loss function.
arXiv Detail & Related papers (2022-09-05T16:38:13Z) - Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions.
Our method can be combined with various architectures to improve them, and it achieves a state-of-the-art computation-accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z) - TC-Net: Triple Context Network for Automated Stroke Lesion Segmentation [0.5482532589225552]
We propose a new network, Triple Context Network (TC-Net), with the capture of spatial contextual information as the core.
Our network is evaluated on the open dataset ATLAS, achieving the highest score of 0.594, Hausdorff distance of 27.005 mm, and average symmetry surface distance of 7.137 mm.
arXiv Detail & Related papers (2022-02-28T11:12:16Z) - Spatial Information Guided Convolution for Real-Time RGBD Semantic Segmentation [79.78416804260668]
We propose Spatial information guided Convolution (S-Conv), which allows efficient RGB feature and 3D spatial information integration.
S-Conv is competent to infer the sampling offset of the convolution kernel guided by the 3D spatial information.
We further embed S-Conv into a semantic segmentation network, called Spatial information Guided convolutional Network (SGNet).
arXiv Detail & Related papers (2020-04-09T13:38:05Z) - High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)