3D Multimodal Image Registration for Plant Phenotyping
- URL: http://arxiv.org/abs/2407.02946v1
- Date: Wed, 3 Jul 2024 09:29:46 GMT
- Title: 3D Multimodal Image Registration for Plant Phenotyping
- Authors: Eric Stumpe, Gernot Bodner, Francesco Flagiello, Matthias Zeppelzauer,
- Abstract summary: The use of multiple camera technologies in a combined multimodal monitoring system for plant phenotyping offers promising benefits.
The effective utilization of cross-modal patterns is dependent on precise image registration to achieve pixel-accurate alignment.
We propose a novel multimodal 3D image registration method that addresses these challenges by integrating depth information from a time-of-flight camera into the registration process.
- Score: 0.6697966247860049
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of multiple camera technologies in a combined multimodal monitoring system for plant phenotyping offers promising benefits. Compared to configurations that only utilize a single camera technology, cross-modal patterns can be recorded that allow a more comprehensive assessment of plant phenotypes. However, the effective utilization of cross-modal patterns is dependent on precise image registration to achieve pixel-accurate alignment, a challenge often complicated by parallax and occlusion effects inherent in plant canopy imaging. In this study, we propose a novel multimodal 3D image registration method that addresses these challenges by integrating depth information from a time-of-flight camera into the registration process. By leveraging depth data, our method mitigates parallax effects and thus facilitates more accurate pixel alignment across camera modalities. Additionally, we introduce an automated mechanism to identify and differentiate different types of occlusions, thereby minimizing the introduction of registration errors. To evaluate the efficacy of our approach, we conduct experiments on a diverse image dataset comprising six distinct plant species with varying leaf geometries. Our results demonstrate the robustness of the proposed registration algorithm, showcasing its ability to achieve accurate alignment across different plant types and camera compositions. Compared to previous methods, it is not reliant on detecting plant-specific image features and can thereby be utilized for a wide variety of applications in plant sciences. The registration approach scales in principle to arbitrary numbers of cameras with different resolutions and wavelengths. Overall, our study contributes to advancing the field of plant phenotyping by offering a robust and reliable solution for multimodal image registration.
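To make the core idea concrete, here is a minimal sketch of depth-based cross-modal reprojection, assuming pinhole camera models with known intrinsics and a calibrated rigid transform between the time-of-flight (ToF) camera and one target camera. All names (K_tof, K_rgb, R, t) and the simple z-buffer occlusion test are illustrative assumptions, not the authors' actual implementation, whose occlusion-differentiation mechanism the abstract does not specify in detail.

```python
# Hypothetical sketch: reproject ToF pixels into a target camera's image plane
# using the ToF depth map, then flag occlusions with a simple z-buffer test.
import numpy as np

def register_depth_to_target(depth, K_tof, K_rgb, R, t, target_shape):
    """Map every ToF pixel with valid depth into a target camera's image.

    depth        : (H, W) depth map in meters from the time-of-flight camera
    K_tof, K_rgb : (3, 3) intrinsic matrices of the ToF and target cameras
    R, t         : rotation (3, 3) and translation (3,) from ToF to target frame
    target_shape : (H2, W2) resolution of the target camera

    Returns per-pixel target coordinates (H, W, 2) and an occlusion mask (H, W).
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # (3, N)

    # Back-project ToF pixels to 3-D points in the ToF camera frame.
    rays = np.linalg.inv(K_tof) @ pix
    pts = rays * depth.reshape(1, -1)          # scale unit-plane rays by depth

    # Transform into the target camera frame and project with its intrinsics.
    pts_tgt = R @ pts + t.reshape(3, 1)
    proj = K_rgb @ pts_tgt
    z = proj[2]
    valid = (depth.reshape(-1) > 0) & (z > 0)  # drop missing depth / behind camera
    uv_tgt = np.full((2, H * W), -1.0)
    uv_tgt[:, valid] = proj[:2, valid] / z[valid]

    # Simple z-buffer occlusion test: if several ToF points land on the same
    # target pixel, only the nearest one is visible; the rest are occluded.
    H2, W2 = target_shape
    zbuf = np.full((H2, W2), np.inf)
    ui = np.round(uv_tgt[0]).astype(int)
    vi = np.round(uv_tgt[1]).astype(int)
    inside = valid & (ui >= 0) & (ui < W2) & (vi >= 0) & (vi < H2)
    idx = np.flatnonzero(inside)
    # Process points sorted by depth so the closest point claims each pixel.
    order = idx[np.argsort(z[idx])]
    occluded = np.zeros(H * W, dtype=bool)
    for i in order:
        if z[i] < zbuf[vi[i], ui[i]] - 1e-6:
            zbuf[vi[i], ui[i]] = z[i]
        else:
            occluded[i] = True
    return uv_tgt.T.reshape(H, W, 2), occluded.reshape(H, W)
```

In a multi-camera rig, the same routine would be run once per target camera with that camera's intrinsics and extrinsics, which is consistent with the abstract's claim that the approach scales in principle to arbitrary numbers of cameras with different resolutions.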
Related papers
- Design and Identification of Keypoint Patches in Unstructured Environments [7.940068522906917]
Keypoint identification in an image allows direct mapping from raw images to 2D coordinates.
We propose four simple yet distinct designs that account for variations in scale, rotation, and camera projection.
We customize the SuperPoint network to ensure robust detection under various types of image degradation.
arXiv Detail & Related papers (2024-10-01T09:05:50Z)
- Learning Robust Multi-Scale Representation for Neural Radiance Fields from Unposed Images [65.41966114373373]
We present an improved solution to the neural image-based rendering problem in computer vision.
The proposed approach could synthesize a realistic image of the scene from a novel viewpoint at test time.
arXiv Detail & Related papers (2023-11-08T08:18:23Z)
- Breaking Modality Disparity: Harmonized Representation for Infrared and Visible Image Registration [66.33746403815283]
We propose a scene-adaptive infrared and visible image registration method.
We employ homography to simulate the deformation between different planes.
We propose the first misaligned infrared and visible image dataset with available ground truth.
arXiv Detail & Related papers (2023-04-12T06:49:56Z)
- Cross-Camera Deep Colorization [10.254243409261898]
We propose an end-to-end convolutional neural network to align and fuse images from a color-plus-mono dual-camera system.
Our method consistently achieves substantial improvements, i.e., a PSNR gain of around 10 dB.
arXiv Detail & Related papers (2022-08-26T11:02:14Z)
- Single Stage Virtual Try-on via Deformable Attention Flows [51.70606454288168]
Virtual try-on aims to generate a photo-realistic fitting result given an in-shop garment and a reference person image.
We develop a novel Deformable Attention Flow (DAFlow) which applies the deformable attention scheme to multi-flow estimation.
Our proposed method achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-07-19T10:01:31Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose SurroundDepth, a method that incorporates information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves state-of-the-art performance on challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- DeepMultiCap: Performance Capture of Multiple Characters Using Sparse Multiview Cameras [63.186486240525554]
DeepMultiCap is a novel method for multi-person performance capture using sparse multi-view cameras.
Our method can capture time-varying surface details without the need for pre-scanned template models.
arXiv Detail & Related papers (2021-05-01T14:32:13Z)
- M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z)
- Translate to Adapt: RGB-D Scene Recognition across Domains [18.40373730109694]
In this work, we put the spotlight on a possibly severe domain-shift issue within multi-modality scene recognition datasets.
We present a method based on self-supervised inter-modality translation able to adapt across different camera domains.
arXiv Detail & Related papers (2021-03-26T18:20:29Z)
- PlenoptiCam v1.0: A light-field imaging framework [8.467466998915018]
Light-field cameras play a vital role in rich 3-D information retrieval for narrow-range depth sensing applications.
A key obstacle in composing light-fields from exposures taken by a plenoptic camera is to computationally calibrate, align, and rearrange the four-dimensional image data.
Several approaches have been proposed to enhance the overall image quality by tailoring pipelines dedicated to particular plenoptic cameras.
arXiv Detail & Related papers (2020-10-14T09:23:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.