Related papers: SpectralZoom: Efficient Segmentation with an Adaptive Hyperspectral Camera

URL: http://arxiv.org/abs/2406.04287v1
Date: Thu, 6 Jun 2024 17:33:23 GMT
Title: SpectralZoom: Efficient Segmentation with an Adaptive Hyperspectral Camera
Authors: Jackson Arnold, Sophia Rossi, Chloe Petrosino, Ethan Mitchell, Sanjeev J. Koppal
Abstract summary: We propose a vision transformer-based (ViT) algorithm that alleviates both the captured data footprint and the computational load for hyperspectral segmentation. Our camera is able to adaptively sample image regions or patches at different resolutions, instead of capturing the entire hyperspectral cube at one high resolution.
Score: 3.0175628677371935
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Hyperspectral image segmentation is crucial for many fields such as agriculture, remote sensing, biomedical imaging, battlefield sensing and astronomy. However, the challenge of hyper and multi spectral imaging is its large data footprint. We propose both a novel camera design and a vision transformer-based (ViT) algorithm that alleviate both the captured data footprint and the computational load for hyperspectral segmentation. Our camera is able to adaptively sample image regions or patches at different resolutions, instead of capturing the entire hyperspectral cube at one high resolution. Our segmentation algorithm works in concert with the camera, applying ViT-based segmentation only to adaptively selected patches. We show results both in simulation and on a real hardware platform demonstrating both accurate segmentation results and reduced computational burden.
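The adaptive sampling idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' camera or algorithm: the patch importance score (plain variance), the function name `adaptive_sample`, and the parameters `patch`, `keep_frac`, and `factor` are all assumptions chosen for illustration. The sketch keeps the highest-scoring patches at full resolution, average-pools the rest, and reports the fraction of the cube that is retained.

```python
import numpy as np

def adaptive_sample(cube, patch=8, keep_frac=0.25, factor=4):
    """Sample a hyperspectral cube (H, W, B) patch by patch.

    Hypothetical sketch: high-scoring patches stay at full resolution,
    the others are average-pooled spatially by `factor`.
    Returns (samples, high_mask, retained_fraction).
    """
    H, W, B = cube.shape
    assert H % patch == 0 and W % patch == 0 and patch % factor == 0
    gh, gw = H // patch, W // patch

    # View the cube as a (gh, gw, patch, patch, B) grid of patches.
    patches = cube.reshape(gh, patch, gw, patch, B).swapaxes(1, 2)

    # Assumed importance score: variance of all values in each patch.
    scores = patches.reshape(gh, gw, -1).var(axis=-1)

    # Mark the top `keep_frac` of patches for full-resolution capture.
    n_keep = max(1, int(round(keep_frac * gh * gw)))
    order = np.argsort(scores.ravel())[::-1]
    high = np.zeros(gh * gw, dtype=bool)
    high[order[:n_keep]] = True
    high = high.reshape(gh, gw)

    samples, kept = {}, 0
    s = patch // factor
    for i in range(gh):
        for j in range(gw):
            p = patches[i, j]
            if not high[i, j]:
                # Low-resolution capture: average over factor x factor blocks.
                p = p.reshape(s, factor, s, factor, B).mean(axis=(1, 3))
            samples[(i, j)] = p
            kept += p.size
    return samples, high, kept / cube.size

# Usage: a synthetic cube that is flat except for a textured corner.
rng = np.random.default_rng(0)
cube = np.zeros((32, 32, 16))
cube[:16, :16] = rng.normal(size=(16, 16, 16))
samples, high, retained = adaptive_sample(cube)
print(high.sum(), retained)
```

With these settings, 4 of the 16 patches are kept at full resolution and the remaining 12 shrink from 8 x 8 to 2 x 2 pixels per band, so a downstream segmentation model would only see a fraction of the original data volume.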

Related papers

CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis [75.25966323298003]
Spectral imaging offers promising applications across diverse domains, including medicine and urban scene understanding. Variability in channel dimensionality and captured wavelengths among spectral cameras impedes the development of AI-driven methodologies. We introduce CARL, a model for Camera-Agnostic Representation Learning across RGB, multispectral, and hyperspectral imaging modalities.
arXiv Detail & Related papers (2025-04-27T13:06:40Z)
Your ViT is Secretly an Image Segmentation Model [50.71238842539735]
Vision Transformers (ViTs) have shown remarkable performance and scalability across various computer vision tasks. We show that inductive biases introduced by task-specific components can instead be learned by the ViT itself. We introduce the Encoder-only Mask Transformer (EoMT), which repurposes the plain ViT architecture to conduct image segmentation.
arXiv Detail & Related papers (2025-03-24T19:56:02Z)
Redundancy-Aware Camera Selection for Indoor Scene Neural Rendering [54.468355408388675]
We build a similarity matrix that incorporates both the spatial diversity of the cameras and the semantic variation of the images. We apply a diversity-based sampling algorithm to optimize the camera selection. We also develop a new dataset, IndoorTraj, which includes long and complex camera movements captured by humans in virtual indoor environments.
arXiv Detail & Related papers (2024-09-11T08:36:49Z)
View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields [52.08335264414515]
We learn a novel feature field within a Neural Radiance Field (NeRF) representing a 3D scene. Our method takes view-inconsistent multi-granularity 2D segmentations as input and produces a hierarchy of 3D-consistent segmentations as output. We evaluate our method and several baselines on synthetic datasets with multi-view images and multi-granular segmentation, showcasing improved accuracy and viewpoint-consistency.
arXiv Detail & Related papers (2024-05-30T04:14:58Z)
Real-Time Image Segmentation via Hybrid Convolutional-Transformer Architecture Search [51.89707241449435]
In this paper, we address the challenge of integrating multi-head self-attention into high-resolution representation CNNs efficiently. We develop a multi-target multi-branch supernet method, which fully utilizes the advantages of high-resolution features. We present a series of models via the Hybrid Convolutional-Transformer Architecture Search (HyCTAS) method that searches for the best hybrid combination of light-weight convolution layers and memory-efficient self-attention layers.
arXiv Detail & Related papers (2024-03-15T15:47:54Z)
Memory-Constrained Semantic Segmentation for Ultra-High Resolution UAV Imagery [35.96063342025938]
This paper explores the intricate problem of achieving efficient and effective segmentation of ultra-high resolution UAV imagery. We propose a GPU memory-efficient and effective framework for local inference without accessing the context beyond local patches. We present an efficient memory-based interaction scheme to correct potential semantic bias of the underlying high-resolution information.
arXiv Detail & Related papers (2023-10-07T07:44:59Z)
Vision Transformers: From Semantic Segmentation to Dense Prediction [139.15562023284187]
We explore the global context learning potential of vision transformers (ViTs) for dense visual prediction. Our motivation is that through learning global context at full receptive field layer by layer, ViTs may capture stronger long-range dependency information. We formulate a family of Hierarchical Local-Global (HLG) Transformers, characterized by local attention within windows and global attention across windows in a pyramidal architecture.
arXiv Detail & Related papers (2022-07-19T15:49:35Z)
Unsupervised Segmentation of Hyperspectral Remote Sensing Images with Superpixels [22.92045376407794]
We propose an unsupervised method for hyperspectral remote sensing image segmentation. The method exploits the mean-shift clustering algorithm, which takes as input a preliminary hyperspectral superpixel segmentation together with the spectral pixel information. Results demonstrate the validity of the proposed method in comparison with the state of the art.
arXiv Detail & Related papers (2022-04-26T13:20:33Z)
Real-time Hyperspectral Imaging in Hardware via Trained Metasurface Encoders [55.16861072631285]
Hyperspectral imaging has attracted significant attention to identify spectral signatures for image classification and automated pattern recognition in computer vision. This work introduces Hyplex, a new integrated architecture addressing the limitations discussed above. Hyplex is a CMOS-compatible, fast hyperspectral camera that replaces bulk optics with nanoscale metasurfaces inversely designed through artificial intelligence.
arXiv Detail & Related papers (2022-04-05T09:52:51Z)
Panoptic SwiftNet: Pyramidal Fusion for Real-time Panoptic Segmentation [0.0]
Many applications require fast inference over large input resolutions on affordable or even embedded hardware. We propose to achieve this goal by trading off backbone capacity for multi-scale feature extraction. We present panoptic experiments on Cityscapes, Vistas, COCO and the BSB-Aerial dataset.
arXiv Detail & Related papers (2022-03-15T13:47:40Z)
Toward Efficient Hyperspectral Image Processing inside Camera Pixels [1.6449390849183356]
Hyperspectral cameras generate a large amount of data due to the presence of hundreds of spectral bands. To mitigate this problem, we propose a form of processing-in-pixel (PIP). Our PIP-optimized custom CNN layers effectively compress the input data, significantly reducing the bandwidth required to transmit the data downstream to the HSI processing unit.
arXiv Detail & Related papers (2022-03-11T01:06:02Z)
XCiT: Cross-Covariance Image Transformers [73.33400159139708]
We propose a "transposed" version of self-attention that operates across feature channels rather than tokens. The resulting cross-covariance attention (XCA) has linear complexity in the number of tokens, and allows efficient processing of high-resolution images.
arXiv Detail & Related papers (2021-06-17T17:33:35Z)
SASSI -- Super-Pixelated Adaptive Spatio-Spectral Imaging [17.2152544145501]
We introduce a novel video-rate hyperspectral imager with high spatial and temporal resolutions. A scene-adaptive spatial sampling of a hyperspectral scene, guided by its super-pixel segmented image, is capable of obtaining high-quality reconstructions. We validate the proposed technique with extensive simulations as well as a lab prototype that measures hyperspectral video at a spatial resolution of $600 \times 900$ pixels.
arXiv Detail & Related papers (2020-12-28T21:34:18Z)
Efficient Video Semantic Segmentation with Labels Propagation and Refinement [138.55845680523908]
This paper tackles the problem of real-time semantic segmentation of high definition videos using a hybrid GPU / CPU approach. We propose an Efficient Video Segmentation (EVS) pipeline that combines: (i) on the CPU, a very fast optical flow method that is used to exploit the temporal aspect of the video and propagate semantic information from one frame to the next. On the popular Cityscapes dataset with high resolution frames (2048 x 1024), the proposed operating points range from 80 to 1000 Hz on a single GPU and CPU.
arXiv Detail & Related papers (2019-12-26T11:45:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.