OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic
Segmentation
- URL: http://arxiv.org/abs/2307.15588v2
- Date: Thu, 21 Dec 2023 09:47:19 GMT
- Title: OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic
Segmentation
- Authors: Fei Teng, Jiaming Zhang, Kunyu Peng, Yaonan Wang, Rainer Stiefelhagen,
Kailun Yang
- Abstract summary: We propose a new paradigm, the Omni-Aperture Fusion model (OAFuser), for light field cameras.
OAFuser discovers the angular information from sub-aperture images to generate a semantically consistent result.
Our proposed OAFuser achieves state-of-the-art performance on the UrbanLF-Real and -Syn datasets.
- Score: 51.739401680890325
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Light field cameras, by harnessing the power of a micro-lens array, are capable
of capturing intricate angular and spatial details. This allows for acquiring
complex light patterns and details from multiple angles, significantly
enhancing the precision of image semantic segmentation, a critical aspect of
scene interpretation in vision intelligence. However, the extensive angular
information of light field cameras contains a large amount of redundant data,
which is overwhelming for the limited hardware resources of intelligent
vehicles. Moreover, inappropriate compression leads to information corruption
and data loss. To excavate representative information, we propose a new
paradigm, Omni-Aperture Fusion model (OAFuser), which leverages dense context
from the central view and discovers the angular information from sub-aperture
images to generate a semantically consistent result. To avoid feature loss
during network propagation and simultaneously streamline the redundant
information from the light field camera, we present a simple yet very effective
Sub-Aperture Fusion Module (SAFM) to embed sub-aperture images into angular
features without any additional memory cost. Furthermore, to address the
mismatched spatial information across viewpoints, we present a Center Angular
Rectification Module (CARM) to realize feature resorting and prevent feature
occlusion caused by asymmetric information. Our proposed OAFuser achieves
state-of-the-art performance on the UrbanLF-Real and -Syn datasets and sets a
new record of 84.93% in mIoU on the UrbanLF-Real Extended dataset, with a gain
of +4.53%. The source code of OAFuser will be available at
https://github.com/FeiBryantkit/OAFuser.
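For a concrete picture of the fusion idea, the following is a minimal PyTorch sketch, not the authors' released SAFM/CARM implementation: it only illustrates folding features from many sub-aperture views into a single central-view-sized tensor, so the fused representation does not grow with the number of views. The module name, the channel-gating rule, and all tensor shapes are assumptions made for this illustration.
```python
# Illustrative sketch only (not the official OAFuser code): fuse N sub-aperture
# view features into one central-view-sized tensor via a running, gated sum.
import torch
import torch.nn as nn


class SubApertureFusionSketch(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Channel-wise gate predicted from each sub-aperture feature map
        # (shared across views, so parameters do not grow with N).
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, center: torch.Tensor, sub_views: torch.Tensor) -> torch.Tensor:
        # center:    (B, C, H, W)    features of the central view
        # sub_views: (B, N, C, H, W) features of N surrounding sub-aperture views
        fused = center
        for i in range(sub_views.shape[1]):
            view = sub_views[:, i]         # (B, C, H, W)
            weight = self.gate(view)       # (B, C, 1, 1) per-channel importance
            fused = fused + weight * view  # running sum: output stays (B, C, H, W)
        return fused


if __name__ == "__main__":
    module = SubApertureFusionSketch(channels=64)
    center = torch.randn(2, 64, 32, 32)
    sub_views = torch.randn(2, 8, 64, 32, 32)   # e.g. 8 surrounding views
    print(module(center, sub_views).shape)      # torch.Size([2, 64, 32, 32])
```
The running sum holds at most one extra view-sized tensor at any time, which loosely mirrors the abstract's claim of embedding sub-aperture images into angular features without additional memory cost; the authors' actual fusion rule may differ.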
Related papers
- LFSamba: Marry SAM with Mamba for Light Field Salient Object Detection [9.787855464038673]
A light field camera can reconstruct 3D scenes using captured multi-focus images that contain rich spatial geometric information.
In this work, a state-of-the-art salient object detection model for multi-focus light field images, called LFSamba, is introduced.
arXiv Detail & Related papers (2024-11-11T01:37:32Z)
- FusionMamba: Efficient Remote Sensing Image Fusion with State Space Model [35.57157248152558]
Current deep learning (DL) methods typically employ convolutional neural networks (CNNs) or Transformers for feature extraction and information integration.
We propose FusionMamba, an innovative method for efficient remote sensing image fusion.
arXiv Detail & Related papers (2024-04-11T17:29:56Z)
- LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras [21.224449211575646]
We have identified two overlooked issues for the LF salient object detection (SOD) task.
Previous approaches predominantly employ a customized two-stream design to discover the spatial and depth features within light field images.
The network struggles to learn the implicit angular information between different images due to a lack of intra-network data connectivity.
We propose an efficient paradigm (LF Tracy) to address those issues.
arXiv Detail & Related papers (2024-01-30T03:17:02Z)
- Beyond Subspace Isolation: Many-to-Many Transformer for Light Field Image Super-resolution [5.277207972856879]
We introduce a novel Many-to-Many Transformer (M2MT) for light field image super-resolution tasks.
M2MT aggregates angular information in the spatial subspace before performing the self-attention mechanism.
It enables complete access to all information across all sub-aperture images in a light field image.
arXiv Detail & Related papers (2024-01-01T12:48:23Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- Enhancing Low-light Light Field Images with A Deep Compensation Unfolding Network [52.77569396659629]
This paper presents the deep compensation unfolding network (DCUNet) for restoring light field (LF) images captured under low-light conditions.
The framework uses the intermediate enhanced result to estimate the illumination map, which is then employed in the unfolding process to produce a new enhanced result.
To properly leverage the unique characteristics of LF images, this paper proposes a pseudo-explicit feature interaction module.
arXiv Detail & Related papers (2023-08-10T07:53:06Z)
- LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and Semantic-Aware Alignment [63.83894701779067]
We propose LCPS, the first LiDAR-Camera Panoptic network.
In our approach, we conduct LiDAR-Camera fusion in three stages.
Our fusion strategy improves PQ by about 6.9% over the LiDAR-only baseline on the NuScenes dataset.
arXiv Detail & Related papers (2023-08-03T10:57:58Z)
- Searching a Compact Architecture for Robust Multi-Exposure Image Fusion [55.37210629454589]
Two major stumbling blocks hinder development: pixel misalignment and inefficient inference.
This study introduces an architecture search-based paradigm incorporating self-alignment and detail repletion modules for robust multi-exposure image fusion.
The proposed method outperforms various competitive schemes, achieving a noteworthy 3.19% improvement in PSNR for general scenarios and an impressive 23.5% enhancement in misaligned scenarios.
arXiv Detail & Related papers (2023-05-20T17:01:52Z)
- EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation in applications such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF).
We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules (a minimal sketch of this pattern appears after this list).
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
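To make the two-stream, residual-based fusion pattern summarized in the EPMF/PMF entry above more concrete, here is a minimal sketch. It is not the paper's architecture: the encoders, channel counts, LiDAR input channels, and the exact fusion rule are placeholders chosen for illustration.
```python
# Illustrative sketch only (not the official PMF/EPMF code): each modality is
# encoded separately, and the camera stream is refined by a residual-style
# fusion block built from the concatenated features.
import torch
import torch.nn as nn


class ResidualFusionSketch(nn.Module):
    """Fuse two same-sized feature maps as: cam + f(concat(cam, lidar))."""

    def __init__(self, channels: int):
        super().__init__()
        self.mix = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, cam_feat: torch.Tensor, lidar_feat: torch.Tensor) -> torch.Tensor:
        residual = self.mix(torch.cat([cam_feat, lidar_feat], dim=1))
        return cam_feat + residual  # residual update keeps the camera stream primary


class TwoStreamSketch(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        # One lightweight encoder per modality (RGB image, projected LiDAR channels).
        self.cam_enc = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        self.lidar_enc = nn.Conv2d(5, channels, kernel_size=3, padding=1)
        self.fuse = ResidualFusionSketch(channels)

    def forward(self, rgb: torch.Tensor, lidar: torch.Tensor) -> torch.Tensor:
        return self.fuse(self.cam_enc(rgb), self.lidar_enc(lidar))


if __name__ == "__main__":
    model = TwoStreamSketch()
    rgb = torch.randn(1, 3, 64, 128)     # camera image
    lidar = torch.randn(1, 5, 64, 128)   # assumed projected LiDAR channels (x, y, z, intensity, depth)
    print(model(rgb, lidar).shape)       # torch.Size([1, 64, 64, 128])
```
The residual update keeps the camera stream as the primary signal and learns the LiDAR-derived correction as an offset, which is one common reading of "residual-based fusion modules"; the paper's actual modules may be structured differently.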