Multiview Transformer: Rethinking Spatial Information in Hyperspectral
Image Classification
- URL: http://arxiv.org/abs/2310.07186v1
- Date: Wed, 11 Oct 2023 04:25:24 GMT
- Title: Multiview Transformer: Rethinking Spatial Information in Hyperspectral
Image Classification
- Authors: Jie Zhang, Yongshan Zhang, Yicong Zhou
- Abstract summary: Identifying the land cover category for each pixel in a hyperspectral image relies on spectral and spatial information.
In this article, we investigate that scene-specific but not essential correlations may be recorded in an HSI cuboid.
We propose a multiview transformer for HSI classification, which consists of multiview principal component analysis (MPCA), spectral encoder-decoder (SED), and spatial-pooling tokenization transformer (SPTT).
- Score: 43.17196501332728
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Identifying the land cover category for each pixel in a hyperspectral image
(HSI) relies on spectral and spatial information. An HSI cuboid with a specific
patch size is utilized to extract spatial-spectral feature representation for
the central pixel. In this article, we investigate that scene-specific but not
essential correlations may be recorded in an HSI cuboid. This additional
information improves the model performance on existing HSI datasets and makes
it hard to properly evaluate the ability of a model. We refer to this problem
as the spatial overfitting issue and utilize strict experimental settings to
avoid it. We further propose a multiview transformer for HSI classification,
which consists of multiview principal component analysis (MPCA), spectral
encoder-decoder (SED), and spatial-pooling tokenization transformer (SPTT).
MPCA performs dimension reduction on an HSI via constructing spectral multiview
observations and applying PCA on each view data to extract low-dimensional view
representation. The combination of view representations, named multiview
representation, is the dimension reduction output of the MPCA. To aggregate the
multiview information, a fully-convolutional SED with a U-shape in spectral
dimension is introduced to extract a multiview feature map. SPTT transforms the
multiview features into tokens using the spatial-pooling tokenization strategy
and learns robust and discriminative spatial-spectral features for land cover
identification. Classification is conducted with a linear classifier.
Experiments on three HSI datasets with rigid settings demonstrate the
superiority of the proposed multiview transformer over the state-of-the-art
methods.
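The MPCA step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the spectral views are constructed, so the contiguous band grouping below is an assumption, as are the function names `pca_reduce` and `multiview_pca` and all parameter values. This is not the authors' implementation.

```python
import numpy as np


def pca_reduce(x, n_components):
    """Project the rows of x (samples x features) onto the top principal components."""
    centered = x - x.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def multiview_pca(hsi, n_views, n_components):
    """Reduce an HSI of shape (H, W, B) to (H, W, n_views * n_components).

    ASSUMPTION: each spectral view is a contiguous group of bands. The
    abstract only states that spectral multiview observations are built
    and that PCA is applied to each view.
    """
    h, w, b = hsi.shape
    if b // n_views < n_components or h * w < n_components:
        raise ValueError("each view needs at least n_components bands and pixels")
    pixels = hsi.reshape(h * w, b)
    band_groups = np.array_split(np.arange(b), n_views)
    # One low-dimensional view representation per band group.
    views = [pca_reduce(pixels[:, idx], n_components) for idx in band_groups]
    # The multiview representation is the combination of all view representations.
    return np.concatenate(views, axis=1).reshape(h, w, n_views * n_components)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hsi = rng.random((8, 8, 30))  # synthetic stand-in for a real HSI scene
    reduced = multiview_pca(hsi, n_views=3, n_components=4)
    print(reduced.shape)  # (8, 8, 12)
```

In this sketch the views are concatenated along the channel axis, which is one plausible reading of "the combination of view representations". The SED and SPTT stages would then operate on this output.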
Related papers
- ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers [9.271932084757646]
3D occupancy represents the entire scene without distinguishing between foreground and background by quantifying the physical space into a grid map.
We propose our learning-first view attention mechanism for effective multi-view feature aggregation.
We present FlowOcc3D, a benchmark built on top of existing high-quality datasets.
(2024-05-07T13:15:07Z)
- TOP-ReID: Multi-spectral Object Re-Identification with Token Permutation [64.65950381870742]
We propose a cyclic token permutation framework for multi-spectral object ReID, dubbed TOP-ReID.
We also propose a Token Permutation Module (TPM) for cyclic multi-spectral feature aggregation.
Our proposed framework can generate more discriminative multi-spectral features for robust object ReID.
(2023-12-15T08:54:15Z)
- Multi-Spectral Image Stitching via Spatial Graph Reasoning [52.27796682972484]
We propose a spatial graph reasoning based multi-spectral image stitching method.
We embed multi-scale complementary features from the same view position into a set of nodes.
By introducing long-range coherence along spatial and channel dimensions, the complementarity of pixel relations and channel interdependencies aids in the reconstruction of aligned multi-view features.
(2023-07-31T15:04:52Z)
- Multi-Scale U-Shape MLP for Hyperspectral Image Classification [13.85573689689951]
Two challenges in classifying the pixels of a hyperspectral image are representing the correlation between local and global information, and the large number of model parameters.
We propose a Multi-Scale U-shape Multi-Layer Perceptron (MUMLP) model consisting of the designed MSC (Multi-Scale Channel) block and the U-shape Multi-Layer Perceptron structure.
Our model can outperform state-of-the-art methods across the board on three widely adopted public datasets.
(2023-07-05T08:52:27Z)
- Object Detection in Hyperspectral Image via Unified Spectral-Spatial Feature Aggregation [55.9217962930169]
We present S2ADet, an object detector that harnesses the rich spectral and spatial complementary information inherent in hyperspectral images.
S2ADet surpasses existing state-of-the-art methods, achieving robust and reliable results.
(2023-06-14T09:01:50Z)
- DCN-T: Dual Context Network with Transformer for Hyperspectral Image Classification [109.09061514799413]
Hyperspectral image (HSI) classification is challenging due to spatial variability caused by complex imaging conditions.
We propose a tri-spectral image generation pipeline that transforms HSI into high-quality tri-spectral images.
Our proposed method outperforms state-of-the-art methods for HSI classification.
(2023-04-19T18:32:52Z)
- Diagnose Like a Pathologist: Transformer-Enabled Hierarchical Attention-Guided Multiple Instance Learning for Whole Slide Image Classification [39.41442041007595]
Multiple Instance Learning and transformers are increasingly popular in histopathology Whole Slide Image (WSI) classification.
We propose a Hierarchical Attention-Guided Multiple Instance Learning framework to fully exploit the WSIs.
Within this framework, an Integrated Attention Transformer is proposed to further enhance the performance of the transformer.
(2023-01-19T15:38:43Z)
- Sketched Multi-view Subspace Learning for Hyperspectral Anomalous Change Detection [12.719327447589345]
A sketched multi-view subspace learning model is proposed for anomalous change detection.
The proposed model preserves major information from the image pairs and improves computational complexity.
Experiments are conducted on a benchmark hyperspectral remote sensing dataset and a natural hyperspectral dataset.
(2022-10-09T14:08:17Z)
- Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image Reconstruction [127.20208645280438]
Hyperspectral image (HSI) reconstruction aims to recover the 3D spatial-spectral signal from a 2D measurement.
Modeling the inter-spectra interactions is beneficial for HSI reconstruction.
We propose a novel framework, the Mask-guided Spectral-wise Transformer (MST), for HSI reconstruction.
(2021-11-15T16:59:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.