A Multi-modal Garden Dataset and Hybrid 3D Dense Reconstruction
Framework Based on Panoramic Stereo Images for a Trimming Robot
- URL: http://arxiv.org/abs/2305.06278v1
- Date: Wed, 10 May 2023 16:15:16 GMT
- Title: A Multi-modal Garden Dataset and Hybrid 3D Dense Reconstruction
Framework Based on Panoramic Stereo Images for a Trimming Robot
- Authors: Can Pu, Chuanyu Yang, Jinnian Pu, Radim Tylecek, Robert B. Fisher
- Abstract summary: Our proposed solution is based on a newly-designed panoramic stereo camera along with a novel hybrid software framework that consists of three fusion modules.
In the disparity fusion module, rectified stereo images produce the initial disparity maps using multiple stereo vision algorithms.
The pose fusion module adopts a two-stage global-coarse-to-local-fine strategy.
In the volumetric fusion module, the global poses of all the nodes are used to integrate the single-view point clouds into the volume.
- Score: 7.248231584821008
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recovering an outdoor environment's surface mesh is vital for an agricultural
robot during task planning and remote visualization. Our proposed solution is
based on a newly-designed panoramic stereo camera along with a novel hybrid
software framework that consists of three fusion modules. The pentagon-shaped
panoramic stereo camera consists of 5 stereo vision camera pairs that stream
synchronized panoramic stereo images to the following three fusion modules. In
the disparity fusion module, rectified stereo images produce the initial
disparity maps using multiple stereo vision algorithms. Then, these initial
disparity maps, along with the intensity images, are fed into a disparity
fusion network to produce refined disparity maps. Next, the refined disparity
maps are converted into full-view or single-view point clouds for
the pose fusion module.
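As a rough illustration of this stage, the sketch below computes two initial disparity maps with classical OpenCV matchers, fuses them, and reprojects the result to a point cloud. It assumes an 8-bit rectified grayscale pair `left`/`right` and a calibration reprojection matrix `Q`; the per-pixel median is only a stand-in for the paper's learned disparity fusion network.
```python
# Hypothetical illustration of the disparity fusion stage (not the
# authors' code): two classical matchers produce initial disparity maps,
# a per-pixel median stands in for the learned fusion network, and the
# fused map is reprojected to a 3D point cloud.
import cv2
import numpy as np

def initial_disparities(left, right, num_disp=128):
    """Initial disparity maps from two stereo algorithms
    (8-bit rectified grayscale inputs assumed)."""
    sgbm = cv2.StereoSGBM_create(minDisparity=0,
                                 numDisparities=num_disp, blockSize=5)
    bm = cv2.StereoBM_create(numDisparities=num_disp, blockSize=15)
    # OpenCV returns fixed-point disparities scaled by 16.
    return [m.compute(left, right).astype(np.float32) / 16.0
            for m in (sgbm, bm)]

def fuse_disparities(disp_maps):
    """Naive stand-in for the disparity fusion network: per-pixel
    median, ignoring invalid (negative) disparities."""
    stack = np.stack(disp_maps)
    stack[stack < 0] = np.nan
    return np.nanmedian(stack, axis=0)

def disparity_to_points(disparity, Q):
    """Reproject a disparity map to camera-frame 3D points using the
    calibration reprojection matrix Q."""
    pts = cv2.reprojectImageTo3D(disparity.astype(np.float32), Q)
    mask = np.isfinite(disparity) & (disparity > 0)
    return pts[mask]
```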
The pose fusion module adopts a two-stage global-coarse-to-local-fine strategy.
In the first stage, each pair of full-view point clouds is registered by a
global point cloud matching algorithm to estimate the transformation for a
global pose graph's edge, which effectively implements loop closure. In the
second stage, a local point cloud matching algorithm is used to match
single-view point clouds in different nodes. Next, we locally refine the poses
of all corresponding edges in the global pose graph using three proposed
rules, thus constructing a refined pose graph. The refined pose graph is
optimized to produce a global pose trajectory for volumetric fusion.
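The following sketch approximates this pose-fusion stage with Open3D's multiway-registration machinery: plain point-to-point ICP stands in for the paper's global and local matching algorithms and its three edge-refinement rules. `pcds` (a list of downsampled single-view point clouds) and `loop_pairs` (index pairs for loop-closure edges) are assumed inputs.
```python
# Hypothetical pose-graph sketch using Open3D (not the authors' code):
# sequential ICP edges plus loop-closure edges, globally optimized.
import numpy as np
import open3d as o3d

reg = o3d.pipelines.registration

def pairwise_icp(source, target, dist=0.05):
    """Pairwise registration; returns transform and information matrix."""
    result = reg.registration_icp(
        source, target, dist, np.eye(4),
        reg.TransformationEstimationPointToPoint())
    info = reg.get_information_matrix_from_point_clouds(
        source, target, dist, result.transformation)
    return result.transformation, info

def build_pose_graph(pcds, loop_pairs=()):
    """Odometry edges between consecutive nodes, plus user-supplied
    loop-closure pairs marked as uncertain edges."""
    graph = reg.PoseGraph()
    odometry = np.eye(4)
    graph.nodes.append(reg.PoseGraphNode(odometry))
    for i in range(len(pcds) - 1):
        trans, info = pairwise_icp(pcds[i], pcds[i + 1])
        odometry = trans @ odometry
        graph.nodes.append(reg.PoseGraphNode(np.linalg.inv(odometry)))
        graph.edges.append(
            reg.PoseGraphEdge(i, i + 1, trans, info, uncertain=False))
    for i, j in loop_pairs:
        trans, info = pairwise_icp(pcds[i], pcds[j])
        graph.edges.append(
            reg.PoseGraphEdge(i, j, trans, info, uncertain=True))
    return graph

def optimize(graph, dist=0.05):
    """Optimize the pose graph and return the global pose per node."""
    reg.global_optimization(
        graph,
        reg.GlobalOptimizationLevenbergMarquardt(),
        reg.GlobalOptimizationConvergenceCriteria(),
        reg.GlobalOptimizationOption(max_correspondence_distance=dist,
                                     edge_prune_threshold=0.25,
                                     reference_node=0))
    return [node.pose for node in graph.nodes]
```
With these pieces, `optimize(build_pose_graph(pcds, loop_pairs))` returns the global pose trajectory that the volumetric fusion module consumes.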
In the volumetric fusion module, the global poses of all the nodes are used
to integrate the single-view point clouds into the volume, producing the mesh
of the whole garden. The proposed framework and its three fusion modules are
tested on a real outdoor garden dataset, demonstrating their superior
performance.
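A minimal sketch of the volumetric-fusion stage, assuming per-view color and depth images (e.g. depth rendered from the refined disparity maps, in meters), a shared pinhole `intrinsic`, and the optimized camera-to-world `poses` from the pose graph; Open3D's `ScalableTSDFVolume` stands in for the paper's volumetric integration.
```python
# Hypothetical TSDF fusion sketch using Open3D (not the authors' code):
# integrate per-view RGB-D data under the optimized global poses and
# extract the final triangle mesh.
import numpy as np
import open3d as o3d

def fuse_to_mesh(colors, depths, intrinsic, poses,
                 voxel=0.01, sdf_trunc=0.04):
    """colors/depths: lists of o3d.geometry.Image; depth in meters;
    poses[i]: 4x4 camera-to-world pose of node i."""
    volume = o3d.pipelines.integration.ScalableTSDFVolume(
        voxel_length=voxel, sdf_trunc=sdf_trunc,
        color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)
    for color, depth, pose in zip(colors, depths, poses):
        rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
            color, depth, depth_scale=1.0, depth_trunc=10.0,
            convert_rgb_to_intensity=False)
        # integrate() expects the world-to-camera extrinsic.
        volume.integrate(rgbd, intrinsic, np.linalg.inv(pose))
    mesh = volume.extract_triangle_mesh()
    mesh.compute_vertex_normals()
    return mesh
```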
Related papers
- Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation [64.07560335451723]
CoSER is a novel consistent dense Multiview Text-to-Image Generator for Text-to-3D.
It achieves both efficiency and quality by meticulously learning neighbor-view coherence.
It aggregates information along motion paths explicitly defined by physical principles to refine details.
arXiv Detail & Related papers (2024-08-23T15:16:01Z)
- Multiway Point Cloud Mosaicking with Diffusion and Global Optimization [74.3802812773891]
We introduce a novel framework for multiway point cloud mosaicking (named Wednesday).
At the core of our approach is ODIN, a learned pairwise registration algorithm that identifies overlaps and refines attention scores.
Tested on four diverse, large-scale datasets, our method achieves state-of-the-art pairwise and rotation registration results by a large margin on all benchmarks.
arXiv Detail & Related papers (2024-03-30T17:29:13Z)
- DiffPoint: Single and Multi-view Point Cloud Reconstruction with ViT Based Diffusion Model [10.253402444122084]
We propose a neat and powerful architecture called DiffPoint that combines ViT and diffusion models for the task of point cloud reconstruction.
We evaluate DiffPoint on both single-view and multi-view reconstruction tasks and achieve state-of-the-art results.
arXiv Detail & Related papers (2024-02-17T10:18:40Z)
- LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion [40.44084541717407]
LoGoNet is a novel Local-to-Global cross-modal fusion network for 3D object detection.
LoGoNet ranks 1st on the Waymo 3D object detection leaderboard.
For the first time, the detection performance on all three classes surpasses 80 APH (L2) simultaneously.
arXiv Detail & Related papers (2023-03-07T02:00:34Z)
- Cross-View Panorama Image Synthesis [68.35351563852335]
PanoGAN is a novel adversarial feedback GAN framework for cross-view panorama image synthesis.
PanoGAN enables high-quality panorama image generation with more convincing details than state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-22T15:59:44Z)
- A Graph-Matching Approach for Cross-view Registration of Over-view and Street-view based Point Clouds [4.742825811314168]
We propose a fully automated geo-registration method for cross-view data, which utilizes semantically segmented object boundaries as view-invariant features.
The proposed method models building segments, detected in both the satellite-based and street-view based point clouds, as nodes of graphs.
The matched nodes are subject to further optimization for precise registration, followed by a constrained bundle adjustment on the street-view images to keep 2D-3D consistencies.
arXiv Detail & Related papers (2022-02-14T16:43:28Z)
- VoRTX: Volumetric 3D Reconstruction With Transformers for Voxelwise View Selection and Fusion [68.68537312256144]
VoRTX is an end-to-end volumetric 3D reconstruction network using transformers for wide-baseline, multi-view feature fusion.
We train our model on ScanNet and show that it produces better reconstructions than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-01T02:18:11Z)
- UniFuse: Unidirectional Fusion for 360$^{\circ}$ Panorama Depth Estimation [11.680475784102308]
This paper introduces a new framework to fuse features from the two projections, unidirectionally feeding the cubemap features to the equirectangular features only at the decoding stage.
Experiments verify the effectiveness of our proposed fusion strategy and module, and our model achieves state-of-the-art performance on four popular datasets.
arXiv Detail & Related papers (2021-02-06T10:01:09Z)
- Wide-Area Crowd Counting: Multi-View Fusion Networks for Counting in Large Scenes [50.744452135300115]
We propose a deep neural network framework for multi-view crowd counting.
Our methods achieve state-of-the-art results compared to other multi-view counting baselines.
arXiv Detail & Related papers (2020-12-02T03:20:30Z)
- AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild [77.43884383743872]
We present AdaFuse, an adaptive multiview fusion method to enhance the features in occluded views.
We extensively evaluate the approach on three public datasets including Human3.6M, Total Capture and CMU Panoptic.
We also create a large scale synthetic dataset Occlusion-Person, which allows us to perform numerical evaluation on the occluded joints.
arXiv Detail & Related papers (2020-10-26T03:19:46Z)