3D Scene Inference from Transient Histograms
- URL: http://arxiv.org/abs/2211.05094v1
- Date: Wed, 9 Nov 2022 18:31:50 GMT
- Title: 3D Scene Inference from Transient Histograms
- Authors: Sacha Jungerman, Atul Ingle, Yin Li, and Mohit Gupta
- Abstract summary: Time-resolved image sensors that capture light at pico-to-nanosecond timescales were once limited to niche applications.
We propose low-cost and low-power imaging modalities that capture scene information from minimal time-resolved image sensors.
- Score: 17.916392079019175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Time-resolved image sensors that capture light at pico-to-nanosecond
timescales were once limited to niche applications but are now rapidly becoming
mainstream in consumer devices. We propose low-cost and low-power imaging
modalities that capture scene information from minimal time-resolved image
sensors with as few as one pixel. The key idea is to flood illuminate large
scene patches (or the entire scene) with a pulsed light source and measure the
time-resolved reflected light by integrating over the entire illuminated area.
The one-dimensional measured temporal waveform, called \emph{transient},
encodes both distances and albedos at all visible scene points and as such is
an aggregate proxy for the scene's 3D geometry. We explore the viability and
limitations of the transient waveforms by themselves for recovering scene
information, and also when combined with traditional RGB cameras. We show that
plane estimation can be performed from a single transient and that using only a
few more it is possible to recover a depth map of the whole scene. We also show
two proof-of-concept hardware prototypes that demonstrate the feasibility of
our approach for compact, mobile, and budget-limited applications.
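
To make the abstract's forward model concrete, below is a minimal, illustrative sketch (not the paper's implementation): each visible scene point returns the flood-illumination pulse after a round-trip delay t = 2d/c, and all returns within the illuminated patch are summed into a single 1-D transient histogram. The function name, bin width, and simplified inverse-square radiometric falloff are assumptions made only for this example.

```python
import numpy as np

C = 3e8               # speed of light, m/s
BIN_WIDTH = 100e-12   # 100 ps histogram bins (illustrative choice)

def simulate_transient(depths_m, albedos, num_bins=1024):
    """Aggregate a flood-illuminated scene patch into one 1-D transient.

    Each visible point returns the pulse after a round-trip delay
    t = 2*d/c; its contribution is weighted by albedo and a simplified
    inverse-square falloff, and all returns are summed into one histogram.
    """
    transient = np.zeros(num_bins)
    t = 2.0 * np.asarray(depths_m) / C                    # round-trip delays
    bins = np.clip((t / BIN_WIDTH).astype(int), 0, num_bins - 1)
    weights = np.asarray(albedos) / np.maximum(depths_m, 1e-3) ** 2
    np.add.at(transient, bins, weights)                   # integrate over the patch
    return transient

# Toy scene: a fronto-parallel surface at 2.0 m and a wall at 3.5 m.
rng = np.random.default_rng(0)
depths = np.concatenate([np.full(5000, 2.0), np.full(3000, 3.5)])
albedos = rng.uniform(0.3, 0.9, size=depths.shape)
h = simulate_transient(depths, albedos)

# The peaks of the transient act as an aggregate proxy for scene depth.
peak_bins = np.sort(np.argsort(h)[-2:])
print("recovered depths (m):", peak_bins * BIN_WIDTH * C / 2.0)
```

In this toy setting the two histogram peaks map back to roughly 2.0 m and 3.5 m, which is the sense in which a single transient aggregates the scene's distances and albedos.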
Related papers
- Real-time 3D-aware Portrait Video Relighting [89.41078798641732]
We present the first real-time 3D-aware method for relighting in-the-wild videos of talking faces based on Neural Radiance Fields (NeRF).
We infer an albedo tri-plane, as well as a shading tri-plane based on a desired lighting condition for each video frame with fast dual-encoders.
Our method runs at 32.98 fps on consumer-level hardware and achieves state-of-the-art results in terms of reconstruction quality, lighting error, lighting instability, temporal consistency and inference speed.
arXiv Detail & Related papers (2024-10-24T01:34:11Z) - EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting [76.02450110026747]
Event cameras, inspired by biological vision, record pixel-wise intensity changes asynchronously with high temporal resolution.
We propose Event-Aided Free-Trajectory 3DGS, which seamlessly integrates the advantages of event cameras into 3DGS.
We evaluate our method on the public Tanks and Temples benchmark and a newly collected real-world dataset, RealEv-DAVIS.
arXiv Detail & Related papers (2024-10-20T13:44:24Z) - Transientangelo: Few-Viewpoint Surface Reconstruction Using Single-Photon Lidar [8.464054039931245]
Lidar captures 3D scene geometry by emitting pulses of light to a target and recording the speed-of-light time delay of the reflected light (see the time-of-flight sketch after this list).
Conventional lidar systems, however, do not output the raw, captured waveforms of backscattered light.
We develop new regularization strategies that improve robustness to photon noise, enabling accurate surface reconstruction with as few as 10 photons per pixel.
arXiv Detail & Related papers (2024-08-22T08:12:09Z) - PlatoNeRF: 3D Reconstruction in Plato's Cave via Single-View Two-Bounce Lidar [25.332440946211236]
3D reconstruction from a single-view is challenging because of the ambiguity from monocular cues and lack of information about occluded regions.
We propose using time-of-flight data captured by a single-photon avalanche diode to overcome these limitations.
We demonstrate that we can reconstruct visible and occluded geometry without data priors or reliance on controlled ambient lighting or scene albedo.
arXiv Detail & Related papers (2023-12-21T18:59:53Z) - Event-based Motion-Robust Accurate Shape Estimation for Mixed Reflectance Scenes [17.446182782836747]
We present a novel event-based structured light system that enables fast 3D imaging of mixed reflectance scenes with high accuracy.
We use epipolar constraints that intrinsically enable decomposing the measured reflections into diffuse, two-bounce specular, and other multi-bounce reflections.
The resulting system achieves fast and motion-robust reconstructions of mixed reflectance scenes with 500 $\mu$m accuracy.
arXiv Detail & Related papers (2023-11-16T08:12:10Z) - Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography [54.36608424943729]
We show that in a ''long-burst'', forty-two 12-megapixel RAW frames captured in a two-second sequence, there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
arXiv Detail & Related papers (2022-12-22T18:54:34Z) - One-Shot Neural Fields for 3D Object Understanding [112.32255680399399]
We present a unified and compact scene representation for robotics.
Each object in the scene is depicted by a latent code capturing geometry and appearance.
This representation can be decoded for various tasks such as novel view rendering, 3D reconstruction, and stable grasp prediction.
arXiv Detail & Related papers (2022-10-21T17:33:14Z) - Event Guided Depth Sensing [50.997474285910734]
We present an efficient bio-inspired event-camera-driven depth estimation algorithm.
In our approach, we illuminate areas of interest densely, depending on the scene activity detected by the event camera.
We show the feasibility of our approach in simulated autonomous driving sequences and real indoor environments.
arXiv Detail & Related papers (2021-10-20T11:41:11Z) - Event-based Stereo Visual Odometry [42.77238738150496]
We present a solution to the problem of visual odometry from the data acquired by a stereo event-based camera rig.
We seek to maximize the temporal consistency of stereo event-based data while using a simple and efficient representation.
arXiv Detail & Related papers (2020-07-30T15:53:28Z) - Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
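
As background for the single-photon lidar entries above (e.g. Transientangelo and PlatoNeRF), here is a generic, illustrative sketch of per-pixel pulsed time-of-flight: depth follows from the peak of a photon-timing histogram via d = c·Δt/2. The bin width, noise model, and function name are assumptions made for this example and are not taken from any of the listed papers.

```python
import numpy as np

C = 3e8              # speed of light, m/s
BIN_WIDTH = 50e-12   # 50 ps timing bins (illustrative)

def depth_from_histogram(counts, background=0.0):
    """Naive per-pixel depth from a SPAD photon-timing histogram.

    Subtracts a constant background estimate, takes the peak bin as the
    round-trip delay, and converts it to distance with d = c * t / 2.
    """
    counts = np.asarray(counts, dtype=float) - background
    peak_bin = int(np.argmax(counts))
    return C * (peak_bin * BIN_WIDTH) / 2.0

# Toy pixel: ~10 signal photons from a surface at 4.0 m plus Poisson noise.
rng = np.random.default_rng(1)
num_bins = 2000
hist = rng.poisson(0.02, size=num_bins)            # ambient / dark counts
hist[int(round(2 * 4.0 / C / BIN_WIDTH))] += 10    # signal photons at 4 m
print("estimated depth (m):", depth_from_histogram(hist, background=0.02))
```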
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information above) and is not responsible for any consequences of its use.