SAILenv: Learning in Virtual Visual Environments Made Simple
- URL: http://arxiv.org/abs/2007.08224v2
- Date: Mon, 20 Jul 2020 15:42:02 GMT
- Title: SAILenv: Learning in Virtual Visual Environments Made Simple
- Authors: Enrico Meloni, Luca Pasqualini, Matteo Tiezzi, Marco Gori, Stefano
Melacci
- Abstract summary: We present a novel platform that allows researchers to experiment with visual recognition in virtual 3D scenes.
A few lines of code are needed to interface every algorithm with the virtual world, and non-3D-graphics experts can easily customize the 3D environment itself.
Our framework yields pixel-level semantic and instance labeling, depth, and, to the best of our knowledge, it is the only one that provides motion-related information directly inherited from the 3D engine.
- Score: 16.979621213790015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, researchers in Machine Learning, Computer Vision
scientists, engineers and others have shown a growing interest in 3D simulators
as a means of artificially creating experimental settings that are very close to
those in the real world. However, most of the existing platforms for interfacing
algorithms with 3D environments are designed to set up navigation-related
experiments, to study physical interactions, or to handle ad-hoc cases that are
not meant to be customized, sometimes lacking a strong photorealistic
appearance and an easy-to-use software interface. In this paper, we present a
novel platform, SAILenv, that is specifically designed to be simple and
customizable, and that allows researchers to experiment with visual recognition in
virtual 3D scenes. A few lines of code are needed to interface every algorithm
with the virtual world, and non-3D-graphics experts can easily customize the 3D
environment itself, exploiting a collection of photorealistic objects. Our
framework yields pixel-level semantic and instance labeling, depth, and, to the
best of our knowledge, it is the only one that provides motion-related
information directly inherited from the 3D engine. The client-server
communication operates at a low level, avoiding the overhead of HTTP-based data
exchanges. We perform experiments using a state-of-the-art object detector
trained on real-world images, showing that it is able to recognize the
photorealistic 3D objects of our environment. The computational burden of the
optical flow compares favourably with the estimation performed using modern
GPU-based convolutional networks or more classic implementations. We believe
that the scientific community will benefit from the ease of use and high
quality of our framework when evaluating newly proposed algorithms in their own
customized realistic conditions.
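To make the "few lines of code" claim concrete, the sketch below shows how a vision algorithm might be hooked up to a running SAILenv server from Python. It is only a minimal illustration: the `Agent` class, its constructor flags, the scene identifiers and the frame-dictionary keys are assumptions based on the publicly released Python client and may differ in the version you install.

```python
# Minimal sketch of interfacing an algorithm with a running SAILenv server.
# NOTE: class name, constructor flags, scene identifiers and frame keys are
# assumptions based on the publicly released Python client; exact names and
# signatures may differ across releases.
from sailenv.agent import Agent

agent = Agent(main_frame_active=True,       # RGB view
              category_frame_active=True,   # pixel-level semantic labels
              object_frame_active=True,     # pixel-level instance labels
              depth_frame_active=True,      # depth map
              flow_frame_active=True,       # motion field from the 3D engine
              width=256, height=192,
              host="127.0.0.1", port=8085)  # low-level socket, not HTTP
agent.register()                            # open the client-server connection
try:
    agent.change_scene(agent.scenes[0])     # load one of the available scenes
    for _ in range(100):
        frame = agent.get_frame()           # dict with one entry per active view
        rgb = frame["main"]
        semantics = frame["category"]
        instances = frame["object"]
        depth = frame["depth"]
        flow = frame["flow"]
        # ... feed the views to your recognition / motion algorithm here ...
finally:
    agent.delete()                          # unregister from the server
```

In this sketch, each requested view (semantic and instance labels, depth, engine-provided motion) is toggled by a constructor flag and returned as a separate buffer, reflecting the low-level, non-HTTP client-server exchange described in the abstract.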
Related papers
- OmniSCV: An Omnidirectional Synthetic Image Generator for Computer Vision [5.2178708158547025]
We present a tool for generating datasets of omnidirectional images with semantic and depth information.
These images are synthesized from a set of captures that are acquired in a realistic virtual environment for Unreal Engine 4.
We include in our tool photorealistic non-central-projection systems such as non-central panoramas and non-central catadioptric systems.
arXiv Detail & Related papers (2024-01-30T14:40:19Z)
- NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization [80.3424839706698]
We present NeurOCS, a framework that uses instance masks and 3D boxes as input to learn 3D object shapes by means of differentiable rendering.
Our approach rests on insights in learning a category-level shape prior directly from real driving scenes.
We make critical design choices to learn object coordinates more effectively from an object-centric view.
arXiv Detail & Related papers (2023-05-28T16:18:41Z)
- 3D-IntPhys: Towards More Generalized 3D-grounded Visual Intuitive Physics under Challenging Scenes [68.66237114509264]
We present a framework capable of learning 3D-grounded visual intuitive physics models from videos of complex scenes with fluids.
We show our model can make long-horizon future predictions by learning from raw images and significantly outperforms models that do not employ an explicit 3D representation space.
arXiv Detail & Related papers (2023-04-22T19:28:49Z)
- Object Scene Representation Transformer [56.40544849442227]
We introduce Object Scene Representation Transformer (OSRT), a 3D-centric model in which individual object representations naturally emerge through novel view synthesis.
OSRT scales to significantly more complex scenes with larger diversity of objects and backgrounds than existing methods.
It is multiple orders of magnitude faster at compositional rendering thanks to its light field parametrization and the novel Slot Mixer decoder.
arXiv Detail & Related papers (2022-06-14T15:40:47Z)
- De-rendering 3D Objects in the Wild [21.16153549406485]
We present a weakly supervised method that is able to decompose a single image of an object into its shape.
For training, the method only relies on a rough initial shape estimate of the training objects to bootstrap the learning process.
In our experiments, we show that the method can successfully de-render 2D images into a 3D representation and generalizes to unseen object categories.
arXiv Detail & Related papers (2022-01-06T23:50:09Z)
- Messing Up 3D Virtual Environments: Transferable Adversarial 3D Objects [21.86544028303682]
We study how to craft adversarial 3D objects by altering their textures, using a tool chain composed of easily accessible elements.
We show that it is possible, and indeed simple, to create adversarial objects using off-the-shelf limited surrogates.
We propose a saliency-based attack that intersects the two classes of adversarials in order to focus the alteration to those texture elements that are estimated to be effective in the target engine.
arXiv Detail & Related papers (2021-09-17T11:06:23Z)
- Evaluating Continual Learning Algorithms by Generating 3D Virtual Environments [66.83839051693695]
Continual learning refers to the ability of humans and animals to incrementally learn over time in a given environment.
We propose to leverage recent advances in 3D virtual environments in order to approach the automatic generation of potentially life-long dynamic scenes with photo-realistic appearance.
A novel element of this paper is that scenes are described in a parametric way, thus allowing the user to fully control the visual complexity of the input stream the agent perceives.
arXiv Detail & Related papers (2021-09-16T10:37:21Z)
- Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting [149.1673041605155]
We address the problem of jointly estimating albedo, normals, depth and 3D spatially-varying lighting from a single image.
Most existing methods formulate the task as image-to-image translation, ignoring the 3D properties of the scene.
We propose a unified, learning-based inverse framework that formulates 3D spatially-varying lighting.
arXiv Detail & Related papers (2021-09-13T15:29:03Z)
- Interactive Annotation of 3D Object Geometry using 2D Scribbles [84.51514043814066]
In this paper, we propose an interactive framework for annotating 3D object geometry from point cloud data and RGB imagery.
Our framework targets naive users without artistic or graphics expertise.
arXiv Detail & Related papers (2020-08-24T21:51:29Z)
- Learning Neural Light Transport [28.9247002210861]
We present an approach for learning light transport in static and dynamic 3D scenes using a neural network.
We find that our model is able to produce photorealistic renderings of static and dynamic scenes.
arXiv Detail & Related papers (2020-06-05T13:26:05Z)