A Real-time 3D Desktop Display
- URL: http://arxiv.org/abs/2506.08064v1
- Date: Mon, 09 Jun 2025 10:55:15 GMT
- Title: A Real-time 3D Desktop Display
- Authors: Livio Tenze, Enrique Canessa
- Abstract summary: altiro3D aims to deal with 3D video streams from either 2D webcam images or flat video files. The core function needed to recreate multiviews is the MiDaS Convolutional Neural Network (CNN). To simplify the acquisition of a Desktop screen area by the user, a multi-platform Graphical User Interface has also been implemented.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A new extended version of the altiro3D C++ Library -- initially developed to drive glass-free holographic displays starting from 2D images -- is introduced here, aiming to deal with 3D video streams from either 2D webcam images or flat video files. These streams are processed in real-time to synthesize light-fields (in Native format) and feed realistic 3D experiences. The core function needed to recreate multiviews is the MiDaS Convolutional Neural Network (CNN), which extracts a depth map from a single 2D image. Artificial Intelligence (AI) computing techniques are applied to improve the overall performance of the extended altiro3D Library. Thus, altiro3D can now treat standard images, video streams or screen portions of a Desktop where other apps may also be running (such as web browsers or video chats) and render them into 3D. To achieve the latter, a screen region needs to be selected so that the output can be fed directly into a light-field 3D device such as the Looking Glass (LG) Portrait. To simplify the acquisition of a Desktop screen area by the user, a multi-platform Graphical User Interface has also been implemented. Sources available at: https://github.com/canessae/altiro3D/releases/tag/2.0.0
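The paper itself does not include code, but the pipeline it describes (single-image MiDaS depth estimation, multiview synthesis, and tiling the views into a light-field quilt for a device like the LG Portrait) can be sketched. Below is a minimal, illustrative Python sketch: altiro3D is a C++ library, so this is not its API; the MiDaS calls follow the publicly documented torch.hub interface, while `synthesize_views` and `make_quilt` are hypothetical stand-ins (naive horizontal pixel shifting and a conventional 8x6 quilt layout) rather than the library's actual Native-format renderer.

```python
# Illustrative 2D -> multiview pipeline, assuming the standard torch.hub
# distribution of MiDaS. NOT the altiro3D C++ API.
import cv2
import numpy as np
import torch

# Load the small MiDaS model and its matching input transform.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.small_transform

def estimate_depth(bgr: np.ndarray) -> np.ndarray:
    """Return an inverse-depth map normalized to [0, 1] (1 = nearest)."""
    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
    with torch.no_grad():
        pred = midas(transform(rgb))
        pred = torch.nn.functional.interpolate(
            pred.unsqueeze(1), size=rgb.shape[:2],
            mode="bicubic", align_corners=False,
        ).squeeze()
    d = pred.cpu().numpy()
    return (d - d.min()) / (d.max() - d.min() + 1e-8)

def synthesize_views(bgr: np.ndarray, depth: np.ndarray,
                     n_views: int = 48, max_shift: int = 12) -> list[np.ndarray]:
    """Hypothetical naive depth-image-based rendering: shift pixels
    horizontally in proportion to depth for each virtual viewpoint."""
    h, w = depth.shape
    xs = np.tile(np.arange(w), (h, 1)).astype(np.float32)
    ys = np.tile(np.arange(h)[:, None], (1, w)).astype(np.float32)
    views = []
    for i in range(n_views):
        # Sweep the virtual camera from -max_shift to +max_shift pixels.
        offset = (i / (n_views - 1) - 0.5) * 2.0 * max_shift
        map_x = xs + offset * depth.astype(np.float32)
        views.append(cv2.remap(bgr, map_x, ys, cv2.INTER_LINEAR))
    return views

def make_quilt(views: list[np.ndarray], cols: int = 8, rows: int = 6) -> np.ndarray:
    """Tile views into a quilt image; 8x6 = 48 views is a common layout
    for the Looking Glass Portrait, ordered bottom-left to top-right."""
    h, w = views[0].shape[:2]
    quilt = np.zeros((rows * h, cols * w, 3), dtype=np.uint8)
    for i, v in enumerate(views):
        r, c = divmod(i, cols)
        quilt[(rows - 1 - r) * h:(rows - r) * h, c * w:(c + 1) * w] = v
    return quilt

frame = cv2.imread("input.jpg")  # a webcam frame or captured screen region
quilt = make_quilt(synthesize_views(frame, estimate_depth(frame)))
cv2.imwrite("quilt.png", quilt)
```

The naive shift leaves visible disocclusion gaps that a production renderer such as altiro3D would fill; the sketch is only meant to make the depth-to-multiview step concrete.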
Related papers
- Bolt3D: Generating 3D Scenes in Seconds [77.592919825037]
Given one or more images, our model Bolt3D directly samples a 3D scene representation in less than seven seconds on a single GPU. Compared to prior multiview generative models that require per-scene optimization for 3D reconstruction, Bolt3D reduces the inference cost by a factor of up to 300 times.
arXiv Detail & Related papers (2025-03-18T17:24:19Z)
- You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale [42.67300636733286]
We present See3D, a visual-conditional multi-view diffusion model trained on large-scale Internet videos for open-world 3D creation. The model aims to Get 3D knowledge by solely Seeing the visual contents from the vast and rapidly growing video data. Our numerical and visual comparisons on single and sparse reconstruction benchmarks show that See3D, trained on cost-effective and scalable video data, achieves notable zero-shot and open-world generation capabilities.
arXiv Detail & Related papers (2024-12-09T17:44:56Z)
- EmbodiedSAM: Online Segment Any 3D Thing in Real Time [61.2321497708998]
Embodied tasks require the agent to fully understand 3D scenes simultaneously with its exploration. An online, real-time, fine-grained and highly-generalized 3D perception model is desperately needed.
arXiv Detail & Related papers (2024-08-21T17:57:06Z)
- CAT3D: Create Anything in 3D with Multi-View Diffusion Models [87.80820708758317]
We present CAT3D, a method for creating anything in 3D by simulating this real-world capture process with a multi-view diffusion model.
CAT3D can create entire 3D scenes in as little as one minute, and outperforms existing methods for single image and few-view 3D scene creation.
arXiv Detail & Related papers (2024-05-16T17:59:05Z)
- WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space [77.92350895927922]
We propose WildFusion, a new approach to 3D-aware image synthesis based on latent diffusion models (LDMs).
Our 3D-aware LDM is trained without any direct supervision from multiview images or 3D geometry.
This opens up promising research avenues for scalable 3D-aware image synthesis and 3D content creation from in-the-wild image data.
arXiv Detail & Related papers (2023-11-22T18:25:51Z)
- PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm [111.16358607889609]
We introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representation. For the first time, PonderV2 achieves state-of-the-art performance on 11 indoor and outdoor benchmarks, implying its effectiveness.
arXiv Detail & Related papers (2023-10-12T17:59:57Z)
- Uni3D: Exploring Unified 3D Representation at Scale [66.26710717073372]
We present Uni3D, a 3D foundation model to explore the unified 3D representation at scale.
Uni3D uses a 2D ViT, pretrained end-to-end, to align the 3D point cloud features with the image-text aligned features.
We show that the strong Uni3D representation also enables applications such as 3D painting and retrieval in the wild.
arXiv Detail & Related papers (2023-10-10T16:49:21Z)
- altiro3D: Scene representation from single image and novel view synthesis [0.0]
altiro3D is a library developed to represent reality starting from a given original RGB image or flat video. It can generate a light-field (or Native) image or video and deliver a realistic 3D experience.
arXiv Detail & Related papers (2023-04-02T16:03:44Z)