Infinite Leagues Under the Sea: Photorealistic 3D Underwater Terrain Generation by Latent Fractal Diffusion Models
- URL: http://arxiv.org/abs/2503.06784v1
- Date: Sun, 09 Mar 2025 21:43:37 GMT
- Title: Infinite Leagues Under the Sea: Photorealistic 3D Underwater Terrain Generation by Latent Fractal Diffusion Models
- Authors: Tianyi Zhang, Weiming Zhi, Joshua Mangelson, Matthew Johnson-Roberson
- Abstract summary: We introduce DreamSea, a generative model that produces hyper-realistic underwater scenes. DreamSea is trained on real-world image databases collected from underwater robot surveys.
- Score: 13.58353565350936
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper tackles the problem of generating representations of underwater 3D terrain. Off-the-shelf generative models, trained on Internet-scale data but not on specialized underwater images, exhibit degraded realism, as images of the seafloor are relatively uncommon. To this end, we introduce DreamSea, a generative model that produces hyper-realistic underwater scenes. DreamSea is trained on real-world image databases collected from underwater robot surveys. Images from these surveys contain massive numbers of real seafloor observations covering large areas, but are prone to real-world noise and artifacts. We extract 3D geometry and semantics from the data with visual foundation models, and train a diffusion model that generates realistic seafloor images in RGBD channels, conditioned on novel fractal distribution-based latent embeddings. We then fuse the generated images into a 3D map, building a 3D Gaussian Splatting (3DGS) model supervised by 2D diffusion priors, which allows photorealistic novel-view rendering. DreamSea is rigorously evaluated, demonstrating the ability to robustly generate large-scale underwater scenes that are consistent, diverse, and photorealistic. Our work has impact across multiple domains, spanning filming, gaming, and robot simulation.
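The fractal conditioning is the distinctive ingredient here, and the summary gives no implementation details. As a minimal sketch of the general idea only, assuming a fractional-Brownian-motion-style 1/f^beta noise field used as a spatially correlated latent grid (the function and parameter names below are illustrative, not DreamSea's actual code):

```python
import numpy as np

def fbm_latent(h, w, beta=3.0, channels=4, seed=0):
    """Sample a (channels, h, w) fractal noise grid by spectral synthesis.

    A 1/f^beta power spectrum yields self-similar, fBm-like structure;
    larger beta gives smoother, more terrain-like fields.
    """
    rng = np.random.default_rng(seed)
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    freq = np.sqrt(fx**2 + fy**2)
    freq[0, 0] = 1.0                  # avoid divide-by-zero at DC
    amp = freq ** (-beta / 2)         # amplitude = sqrt(power)
    amp[0, 0] = 0.0                   # zero-mean field
    fields = []
    for _ in range(channels):
        phase = rng.uniform(0.0, 2.0 * np.pi, size=(h, w))
        field = np.fft.ifft2(amp * np.exp(1j * phase)).real
        fields.append((field - field.mean()) / (field.std() + 1e-8))
    return np.stack(fields)

z = fbm_latent(64, 64)  # e.g. extra conditioning channels for one RGBD tile
```

A self-similar spectrum keeps newly sampled tiles statistically consistent with existing terrain at every scale, which is the property one would want when growing a scene indefinitely.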
Related papers
- Aquatic-GS: A Hybrid 3D Representation for Underwater Scenes [6.549998173302729]
We propose Aquatic-GS, a hybrid 3D representation approach for underwater scenes that effectively represents both the objects and the water medium.
Specifically, we construct a Neural Water Field (NWF) to implicitly model the water parameters, while extending the latest 3D Gaussian Splatting (3DGS) to model the objects explicitly.
Both components are integrated through a physics-based underwater image formation model to represent complex underwater scenes.
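The summary does not spell out the formation model. A commonly used form that such a hybrid might instantiate, in the spirit of the Akkaynak-Treibitz revised model rather than necessarily Aquatic-GS's exact equations, is:

```latex
I_c(x) = J_c(x)\, e^{-\beta_c^{D} z(x)} + B_c^{\infty}\!\left(1 - e^{-\beta_c^{B} z(x)}\right), \qquad c \in \{R, G, B\}
```

where J_c is the unattenuated scene radiance, z(x) the camera-to-scene range, beta_c^D and beta_c^B the per-channel attenuation and backscatter coefficients, and B_c^infty the veiling light. On this reading, the Neural Water Field would predict the water terms while 3DGS supplies J_c and z.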
arXiv Detail & Related papers (2024-10-31T22:24:56Z)
- SeaSplat: Representing Underwater Scenes with 3D Gaussian Splatting and a Physically Grounded Image Formation Model [11.57677379828992]
We introduce SeaSplat, a method to enable real-time rendering of underwater scenes leveraging recent advances in 3D radiance fields.
We apply SeaSplat to real-world scenes from the SeaThru-NeRF dataset, collected by an underwater vehicle in the US Virgin Islands.
We show that modeling underwater image formation helps the method learn scene structure, yielding better depth maps, and that our additions preserve the significant computational advantages of a 3D Gaussian representation.
arXiv Detail & Related papers (2024-09-25T20:45:19Z)
- Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text [61.9973218744157]
We introduce Director3D, a robust open-world text-to-3D generation framework, designed to generate both real-world 3D scenes and adaptive camera trajectories.
Experiments demonstrate that Director3D outperforms existing methods, offering superior performance in real-world 3D generation.
arXiv Detail & Related papers (2024-06-25T14:42:51Z)
- Ghost on the Shell: An Expressive Representation of General 3D Shapes [97.76840585617907]
Meshes are appealing since they enable fast physics-based rendering with realistic material and lighting.
Recent work on reconstructing and statistically modeling 3D shapes has critiqued meshes as being topologically inflexible.
We parameterize open surfaces by defining a manifold signed distance field on watertight surfaces.
G-Shell achieves state-of-the-art performance on non-watertight mesh reconstruction and generation tasks.
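The manifold signed distance field idea can be pictured concretely: an open surface is carved out of a watertight template by the sign of a scalar field defined on that template. The hard-thresholded extraction below is a deliberately crude sketch (G-Shell instead extracts the zero level set differentiably, so gradients reach the field); all names are illustrative:

```python
import numpy as np

def extract_open_surface(vertices, faces, msdf):
    """Keep the sub-surface of a watertight mesh where the manifold
    signed distance field is negative.

    vertices: (V, 3) floats; faces: (F, 3) vertex indices;
    msdf: (V,) one field value per vertex of the watertight template.
    """
    keep = (msdf[faces] < 0).all(axis=1)      # faces fully inside
    kept = faces[keep]
    used = np.unique(kept)
    remap = -np.ones(len(vertices), dtype=int)
    remap[used] = np.arange(len(used))
    return vertices[used], remap[kept]

# e.g. a sphere template with msdf(v) = v_z yields an open hemisphere
```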
arXiv Detail & Related papers (2023-10-23T17:59:52Z)
- ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections [71.46546520120162]
Estimating 3D articulated shapes like animal bodies from monocular images is inherently challenging.
We propose ARTIC3D, a self-supervised framework to reconstruct per-instance 3D shapes from a sparse image collection in-the-wild.
We produce realistic animations by fine-tuning the rendered shape and texture under rigid part transformations.
arXiv Detail & Related papers (2023-06-07T17:47:50Z)
- Underwater Light Field Retention: Neural Rendering for Underwater Imaging [6.22867695581195]
Underwater Image Rendering aims to generate a true-to-life underwater image from a given clean one.
We propose a neural rendering method for underwater imaging, dubbed UWNR (Underwater Neural Rendering).
arXiv Detail & Related papers (2022-03-21T14:22:05Z)
- 3D-Aware Semantic-Guided Generative Model for Human Synthesis [67.86621343494998]
This paper proposes a 3D-aware Semantic-Guided Generative Model (3D-SGAN) for human image synthesis.
Our experiments on the DeepFashion dataset show that 3D-SGAN significantly outperforms the most recent baselines.
arXiv Detail & Related papers (2021-12-02T17:10:53Z)
- Generating Physically-Consistent Satellite Imagery for Climate Visualizations [53.61991820941501]
We train a generative adversarial network to create synthetic satellite imagery of future flooding and reforestation events.
A pure deep learning-based model can generate flood visualizations but hallucinates floods at locations that were not susceptible to flooding.
We publish our code and dataset for segmentation-guided image-to-image translation in Earth observation.
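The role of the segmentation guidance can be pictured as a compositing constraint, sketched below under our own assumption (not necessarily the paper's architecture) that the generator may only alter pixels inside a physically derived flood-susceptibility mask:

```python
import numpy as np

def masked_flood_composite(image, generated, flood_mask):
    """Restrict generated flooding to physically plausible pixels.

    image, generated: (H, W, 3) arrays in [0, 1];
    flood_mask: (H, W) values in [0, 1], e.g. from a hydrology-based
    segmentation of flood-susceptible terrain (illustrative input).
    """
    m = flood_mask[..., None]                # broadcast over RGB
    return m * generated + (1.0 - m) * image
```

Pixels outside the mask pass through unchanged, which directly suppresses the hallucinated floods noted above.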
arXiv Detail & Related papers (2021-04-10T15:00:15Z)
- SMPLpix: Neural Avatars from 3D Human Models [56.85115800735619]
We bridge the gap between classic rendering and the latest generative networks operating in pixel space.
We train a network that directly converts a sparse set of 3D mesh vertices into photorealistic images.
We show the advantage over conventional differentiable renderers in terms of both photorealism and rendering efficiency.
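The "sparse vertices to photorealistic image" step can be sketched as a projection followed by learned image translation. The splatting below is an illustrative stand-in for the paper's input representation; the UNet-style translation network that would consume its output is omitted:

```python
import numpy as np

def splat_vertices(verts_cam, colors, K, hw=(256, 256)):
    """Project colored 3D mesh vertices into a sparse RGB-D image.

    verts_cam: (N, 3) vertices in camera coordinates (z > 0);
    colors: (N, 3) RGB in [0, 1]; K: 3x3 camera intrinsics.
    A z-buffer keeps the nearest vertex per pixel; a neural renderer
    (not shown) would translate the sparse result into a photo.
    """
    h, w = hw
    uvw = verts_cam @ K.T                     # pinhole projection
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    rgbd = np.zeros((h, w, 4))
    depth = np.full((h, w), np.inf)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    for i in np.flatnonzero(inside):
        if verts_cam[i, 2] < depth[v[i], u[i]]:
            depth[v[i], u[i]] = verts_cam[i, 2]
            rgbd[v[i], u[i]] = (*colors[i], verts_cam[i, 2])
    return rgbd
```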
arXiv Detail & Related papers (2020-08-16T10:22:00Z)
- Deep Sea Robotic Imaging Simulator [6.2122699483618]
The largest portion of the ocean - the deep sea - still remains mostly unexplored.
Deep-sea images are very different from images taken in shallow water, and this area has not received much attention from the community.
This paper presents a physical-model-based image simulation solution that uses an in-air texture and depth information as inputs.
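A minimal forward version of such a simulator, assuming the attenuation-plus-backscatter model sketched under the Aquatic-GS entry above and illustrative, uncalibrated water coefficients, could look like:

```python
import numpy as np

# Per-channel coefficients roughly mimicking deep water (red absorbed
# fastest); these values are illustrative assumptions, not calibrated.
BETA = np.array([0.40, 0.10, 0.07])   # attenuation, 1/m, RGB
VEIL = np.array([0.02, 0.08, 0.12])   # backscatter veiling light, RGB

def simulate_underwater(texture, depth):
    """Apply attenuation and backscatter to an in-air texture.

    texture: (H, W, 3) in [0, 1]; depth: (H, W) range in meters.
    Computes I = J * exp(-beta * z) + B * (1 - exp(-beta * z)).
    """
    t = np.exp(-BETA * depth[..., None])  # per-channel transmission
    return texture * t + VEIL * (1.0 - t)
```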
arXiv Detail & Related papers (2020-06-27T16:18:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.