3D Reconstruction from Transient Measurements with Time-Resolved Transformer
- URL: http://arxiv.org/abs/2510.09205v1
- Date: Fri, 10 Oct 2025 09:44:08 GMT
- Title: 3D Reconstruction from Transient Measurements with Time-Resolved Transformer
- Authors: Yue Li, Shida Sun, Yu Hong, Feihu Xu, Zhiwei Xiong
- Abstract summary: We propose a generic Time-Resolved Transformer (TRT) architecture to boost 3D reconstruction performance in photon-efficient imaging. In this paper, we develop two task-specific embodiments: TRT-LOS for LOS imaging and TRT-NLOS for NLOS imaging. In addition, we contribute a large-scale, high-resolution synthetic LOS dataset with various noise levels and capture a set of real-world NLOS imaging measurements.
- Score: 48.73999376279579
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Transient measurements, captured by time-resolved systems, are widely employed in photon-efficient reconstruction tasks, including line-of-sight (LOS) and non-line-of-sight (NLOS) imaging. However, challenges persist in their 3D reconstruction due to the low quantum efficiency of sensors and the high noise levels, particularly for long-range or complex scenes. To boost the 3D reconstruction performance in photon-efficient imaging, we propose a generic Time-Resolved Transformer (TRT) architecture. Different from existing transformers designed for high-dimensional data, TRT has two elaborate attention designs tailored for the spatio-temporal transient measurements. Specifically, the spatio-temporal self-attention encoders explore both local and global correlations within transient data by splitting or downsampling input features into different scales. Then, the spatio-temporal cross-attention decoders integrate the local and global features in the token space, resulting in deep features with high representation capabilities. Building on TRT, we develop two task-specific embodiments: TRT-LOS for LOS imaging and TRT-NLOS for NLOS imaging. Extensive experiments demonstrate that both embodiments significantly outperform existing methods on synthetic data and real-world data captured by different imaging systems. In addition, we contribute a large-scale, high-resolution synthetic LOS dataset with various noise levels and capture a set of real-world NLOS measurements using a custom-built imaging system, enhancing the data diversity in this field. Code and datasets are available at https://github.com/Depth2World/TRT.
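The two attention designs described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the patch size, downsampling stride, and random projection matrices are illustrative stand-ins for learned layers, and the toy cube is far smaller than real transient measurements.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: (n_q, d) queries over (n_k, d) keys/values.
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

rng = np.random.default_rng(0)
H, W, T, d = 4, 4, 8, 16
cube = rng.standard_normal((H, W, T))     # toy H x W x T transient measurement

# Local scale: split into 2x2 spatial patches, flatten each over space-time.
patches = [cube[i:i + 2, j:j + 2].reshape(-1)
           for i in range(0, H, 2) for j in range(0, W, 2)]
local = np.stack(patches)                 # (4, 32) local tokens

# Global scale: spatially downsample by stride 2, one token per coarse pixel.
glob = cube[::2, ::2].reshape(-1, T)      # (4, 8) global tokens

# Hypothetical learned projections into a shared d-dimensional token space.
P_loc = rng.standard_normal((local.shape[1], d)) / np.sqrt(local.shape[1])
P_glb = rng.standard_normal((glob.shape[1], d)) / np.sqrt(glob.shape[1])

# Self-attention encoders within each scale.
z_loc = attention(local @ P_loc, local @ P_loc, local @ P_loc)
z_glb = attention(glob @ P_glb, glob @ P_glb, glob @ P_glb)

# Cross-attention decoder: local queries attend to global keys/values,
# fusing fine detail with scene-level context in the token space.
fused = attention(z_loc, z_glb, z_glb)
print(fused.shape)                        # (4, 16) deep features
```

The essential idea is only that the two token streams come from different spatial scales of the same cube and are merged by cross-attention rather than concatenation.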
Related papers
- Multispectral-NeRF: a multispectral modeling approach based on neural radiance fields [3.606065291262699]
3D reconstruction techniques based on 2D images typically rely on RGB spectral information. Additional spectral bands beyond RGB have been increasingly incorporated into 3D reconstruction. Existing methods that integrate these expanded spectral data often suffer from high cost, low accuracy, and poor geometric features. We propose Multispectral-NeRF, an enhanced neural architecture derived from NeRF that can effectively integrate multispectral information.
arXiv Detail & Related papers (2025-09-14T09:04:35Z)
- Accelerating 3D Photoacoustic Computed Tomography with End-to-End Physics-Aware Neural Operators [74.65171736966131]
Photoacoustic computed tomography (PACT) combines optical contrast with ultrasonic resolution, achieving deep-tissue imaging beyond the optical diffusion limit. Current implementations require dense transducer arrays and prolonged acquisition times, limiting clinical translation. We introduce Pano, an end-to-end physics-aware model that directly learns the inverse acoustic mapping from sensor measurements to volumetric reconstructions.
arXiv Detail & Related papers (2025-09-11T23:12:55Z)
- Optimized Sampling for Non-Line-of-Sight Imaging Using Modified Fast Fourier Transforms [6.866110149269]
Non-line-of-sight (NLOS) imaging systems collect light at a diffuse relay surface and input this measurement into computational algorithms that output a 3D reconstruction. These algorithms utilize the Fast Fourier Transform (FFT) to accelerate the reconstruction process but require both input and output to be sampled spatially with uniform grids. In this work, we demonstrate that existing NLOS imaging setups typically oversample the relay surface spatially, explaining why the measurement can be compressed without sacrificing reconstruction quality.
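The oversampling claim can be sanity-checked with a 1-D toy example (my sketch, not the paper's method): a band-limited signal sampled on a dense grid survives 2x subsampling, and the dense grid is recovered exactly by zero-padding the subsampled spectrum, i.e. FFT interpolation.

```python
import numpy as np

# Dense "relay surface" sampling (1-D analogue): N points of a signal whose
# spectrum only occupies K low frequencies, with K well below the Nyquist bin.
N, K = 64, 6
rng = np.random.default_rng(1)
spec = np.zeros(N, dtype=complex)
spec[1:K + 1] = rng.standard_normal(K) + 1j * rng.standard_normal(K)
spec[-K:] = np.conj(spec[1:K + 1][::-1])   # Hermitian symmetry -> real signal
dense = np.fft.ifft(spec).real             # oversampled measurement

# Keep every 2nd sample: still above the Nyquist rate for this bandwidth.
coarse = dense[::2]                        # compressed measurement (N/2 samples)

# Recover the dense grid by zero-padding the coarse spectrum.
M = N // 2
cs = np.fft.fft(coarse)
padded = np.zeros(N, dtype=complex)
padded[:M // 2] = cs[:M // 2]
padded[-(M // 2):] = cs[-(M // 2):]
recovered = np.fft.ifft(padded).real * 2   # scale by the upsampling factor

print(np.max(np.abs(recovered - dense)))   # ~1e-16: no information was lost
```

If the subsampling stride exceeded the Nyquist limit for the signal's bandwidth, aliasing would make this recovery fail, which is exactly the regime the paper's optimized sampling is designed to stay out of.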
arXiv Detail & Related papers (2025-01-09T13:52:30Z)
- Learning Radiance Fields from a Single Snapshot Compressive Image [18.548244681485922]
Snapshot Compressive Imaging (SCI) is a technique for recovering the underlying 3D scene structure from a single temporally compressed image. We propose SCINeRF, in which we formulate the physical imaging process of SCI as part of the training of NeRF. We further integrate the popular 3D Gaussian Splatting (3DGS) framework and propose SCISplat to improve 3D scene reconstruction quality and training/rendering speed.
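The SCI forward model that this line of work embeds in training is, in essence, a masked temporal sum: each latent frame is modulated by a coding mask and all frames integrate on the sensor in one exposure. A toy version (shapes and variable names are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
H, W, T = 8, 8, 4                        # toy video: T frames of H x W
frames = rng.random((T, H, W))           # latent scene frames x_t
masks = rng.integers(0, 2, (T, H, W))    # per-frame binary coding masks C_t

# Single snapshot measurement: y = sum_t C_t * x_t (elementwise product,
# summed over the exposure), giving one H x W compressed image.
y = (masks * frames).sum(axis=0)

print(y.shape)                           # (8, 8)
```

Training then renders candidate frames, pushes them through this forward model, and penalizes the mismatch with the captured snapshot y.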
arXiv Detail & Related papers (2024-12-27T06:40:44Z)
- T-3DGS: Removing Transient Objects for 3D Scene Reconstruction [83.05271859398779]
Transient objects in video sequences can significantly degrade the quality of 3D scene reconstructions. We propose T-3DGS, a novel framework that robustly filters out transient distractors during 3D reconstruction using Gaussian Splatting.
arXiv Detail & Related papers (2024-11-29T07:45:24Z)
- Large Spatial Model: End-to-end Unposed Images to Semantic 3D [79.94479633598102]
Large Spatial Model (LSM) processes unposed RGB images directly into semantic radiance fields.
LSM simultaneously estimates geometry, appearance, and semantics in a single feed-forward operation.
It can generate versatile label maps by interacting with language at novel viewpoints.
arXiv Detail & Related papers (2024-10-24T17:54:42Z)
- Deep Learning-based Cross-modal Reconstruction of Vehicle Target from Sparse 3D SAR Image [6.499547636078961]
We introduce cross-modal learning and propose a Cross-Modal 3D-SAR Reconstruction Network (CMAR-Net) for enhancing sparse 3D SAR images of vehicle targets by fusing optical information. CMAR-Net achieves efficient training and reconstructs sparse 3D SAR images, which are derived from highly sparse-aspect observations, into visually structured 3D vehicle images.
arXiv Detail & Related papers (2024-06-06T15:18:59Z)
- SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z)
- FocDepthFormer: Transformer with latent LSTM for Depth Estimation from Focal Stack [11.433602615992516]
We present a novel Transformer-based network, FocDepthFormer, which integrates a Transformer with an LSTM module and a CNN decoder. By incorporating the LSTM, FocDepthFormer can be pre-trained on large-scale monocular RGB depth estimation datasets. Our model outperforms state-of-the-art approaches across multiple evaluation metrics.
arXiv Detail & Related papers (2023-10-17T11:53:32Z)
- A Deep Learning Approach for SAR Tomographic Imaging of Forested Areas [10.477070348391079]
We show that light-weight neural networks can be trained to perform the tomographic inversion with a single feed-forward pass.
We train our encoder-decoder network using simulated data and validate our technique on real L-band and P-band data.
arXiv Detail & Related papers (2023-01-20T14:34:03Z)
- Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while utilizing substantially fewer trainable parameters than comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.