NViSII: A Scriptable Tool for Photorealistic Image Generation
- URL: http://arxiv.org/abs/2105.13962v1
- Date: Fri, 28 May 2021 16:35:32 GMT
- Title: NViSII: A Scriptable Tool for Photorealistic Image Generation
- Authors: Nathan Morrical, Jonathan Tremblay, Yunzhi Lin, Stephen Tyree, Stan
Birchfield, Valerio Pascucci, Ingo Wald
- Abstract summary: We present a Python-based renderer built on NVIDIA's OptiX ray tracing engine and the OptiX AI denoiser, designed to generate high-quality synthetic images.
Our tool enables the description and manipulation of complex dynamic 3D scenes.
- Score: 21.453677837017462
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a Python-based renderer built on NVIDIA's OptiX ray tracing engine
and the OptiX AI denoiser, designed to generate high-quality synthetic images
for research in computer vision and deep learning. Our tool enables the
description and manipulation of complex dynamic 3D scenes containing object
meshes, materials, textures, lighting, volumetric data (e.g., smoke), and
backgrounds. Metadata, such as 2D/3D bounding boxes, segmentation masks, depth
maps, normal maps, material properties, and optical flow vectors, can also be
generated. In this work, we discuss design goals, architecture, and
performance. We demonstrate the use of data generated by path tracing for
training an object detector and pose estimator, showing improved performance in
sim-to-real transfer in situations that are difficult for traditional
raster-based renderers. We offer this tool as an easy-to-use, performant,
high-quality renderer for advancing research in synthetic data generation and
deep learning.
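The scripted workflow the abstract describes can be made concrete with a short sketch. The snippet below is a minimal, illustrative use of the nvisii Python package (scene setup, an RGB render, and a depth metadata pass); the calls follow the package's published examples, but exact names and signatures should be verified against the NViSII documentation before relying on them.

    # Illustrative NViSII sketch: build a one-object scene, render an RGB image,
    # and request a depth pass. Call names follow the published nvisii examples,
    # but may vary between releases.
    import nvisii

    nvisii.initialize(headless=True)
    nvisii.enable_denoiser()  # OptiX AI denoiser

    # Camera entity: transform + camera component, then make it the active camera.
    camera = nvisii.entity.create(
        name="camera",
        transform=nvisii.transform.create("camera_tf"),
        camera=nvisii.camera.create(name="camera_cam", aspect=1.0),
    )
    camera.get_transform().look_at(at=(0, 0, 0), up=(0, 0, 1), eye=(3, 3, 2))
    nvisii.set_camera_entity(camera)

    # A single sphere with a simple physically based material.
    sphere = nvisii.entity.create(
        name="sphere",
        mesh=nvisii.mesh.create_sphere("sphere_mesh"),
        transform=nvisii.transform.create("sphere_tf"),
        material=nvisii.material.create("sphere_mat"),
    )
    sphere.get_material().set_base_color((0.8, 0.2, 0.2))
    sphere.get_material().set_roughness(0.4)

    # Path-traced RGB render written to disk.
    nvisii.render_to_file(width=512, height=512, samples_per_pixel=256,
                          file_path="rgb.png")

    # Metadata pass (here: depth), returned as a flat float buffer; other
    # options such as normals or segmentation are requested the same way.
    depth = nvisii.render_data(width=512, height=512, start_frame=0,
                               frame_count=1, bounce=0, options="depth")

    nvisii.deinitialize()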
Related papers
- Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation [2.3213238782019316]
GIMDiffusion is a novel Text-to-3D model that utilizes geometry images to efficiently represent 3D shapes using 2D images.
We exploit the rich 2D priors of existing Text-to-Image models such as Stable Diffusion.
In short, GIMDiffusion enables the generation of 3D assets at speeds comparable to current Text-to-Image models.
arXiv Detail & Related papers (2024-09-05T17:21:54Z)
- AutoDecoding Latent 3D Diffusion Models [95.7279510847827]
We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.
The 3D autodecoder framework embeds properties learned from the target dataset in the latent space.
We then identify the appropriate intermediate volumetric latent space, and introduce robust normalization and de-normalization operations.
arXiv Detail & Related papers (2023-07-07T17:59:14Z)
- PixHt-Lab: Pixel Height Based Light Effect Generation for Image Compositing [34.76980642388534]
Lighting effects such as shadows or reflections are key in making synthetic images realistic and visually appealing.
To generate such effects, traditional computer graphics uses physically based rendering along with 3D geometry.
Recent deep learning-based approaches introduced a pixel height representation to generate soft shadows and reflections.
We introduce PixHt-Lab, a system leveraging an explicit mapping from pixel height representation to 3D space.
arXiv Detail & Related papers (2023-02-28T23:52:01Z)
- PhotoScene: Photorealistic Material and Lighting Transfer for Indoor Scenes [84.66946637534089]
PhotoScene is a framework that takes input image(s) of a scene and builds a photorealistic digital twin with high-quality materials and similar lighting.
We model scene materials using procedural material graphs; such graphs represent photorealistic and resolution-independent materials.
We evaluate our technique on objects and layout reconstructions from ScanNet, SUN RGB-D and stock photographs, and demonstrate that our method reconstructs high-quality, fully relightable 3D scenes.
arXiv Detail & Related papers (2022-07-02T06:52:44Z)
- AUV-Net: Learning Aligned UV Maps for Texture Transfer and Synthesis [78.17671694498185]
We propose AUV-Net which learns to embed 3D surfaces into a 2D aligned UV space.
As a result, textures are aligned across objects, and can thus be easily synthesized by generative models of images.
The learned UV mapping and aligned texture representations enable a variety of applications including texture transfer, texture synthesis, and textured single view 3D reconstruction.
arXiv Detail & Related papers (2022-04-06T21:39:24Z)
- Ground material classification for UAV-based photogrammetric 3D data: A 2D-3D Hybrid Approach [1.3359609092684614]
In recent years, photogrammetry has been widely used in many areas to create 3D virtual data representing the physical environment.
These cutting-edge technologies have caught the US Army and Navy's attention for the purpose of rapid 3D battlefield reconstruction, virtual training, and simulations.
arXiv Detail & Related papers (2021-09-24T22:29:26Z)
- Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting [149.1673041605155]
We address the problem of jointly estimating albedo, normals, depth and 3D spatially-varying lighting from a single image.
Most existing methods formulate the task as image-to-image translation, ignoring the 3D properties of the scene.
We propose a unified, learning-based inverse rendering framework that models 3D spatially-varying lighting.
arXiv Detail & Related papers (2021-09-13T15:29:03Z)
- Deep Direct Volume Rendering: Learning Visual Feature Mappings From Exemplary Images [57.253447453301796]
We introduce Deep Direct Volume Rendering (DeepDVR), a generalization of Direct Volume Rendering (DVR) that allows for the integration of deep neural networks into the DVR algorithm.
We conceptualize the rendering in a latent color space, thus enabling the use of deep architectures to learn implicit mappings for feature extraction and classification.
Our generalization serves to derive novel volume rendering architectures that can be trained end-to-end directly from examples in image space.
arXiv Detail & Related papers (2021-06-09T23:03:00Z)
- Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-06-29T12:53:58Z)