3One2: One-step Regression Plus One-step Diffusion for One-hot Modulation in Dual-path Video Snapshot Compressive Imaging
- URL: http://arxiv.org/abs/2512.17578v1
- Date: Fri, 19 Dec 2025 13:44:36 GMT
- Title: 3One2: One-step Regression Plus One-step Diffusion for One-hot Modulation in Dual-path Video Snapshot Compressive Imaging
- Authors: Ge Wang, Xing Liu, Xin Yuan,
- Abstract summary: Video snapshot imaging (SCI) captures dynamic scene sequences through a two-dimensional (2D) snapshot.<n>One-hot modulation, activating only one sub-frame per pixel, provides a promising solution for achieving perfect temporal decoupling.<n>We propose an algorithm specifically designed for one-hot masks.
- Score: 15.082139132074294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Video snapshot compressive imaging (SCI) captures dynamic scene sequences through a two-dimensional (2D) snapshot, fundamentally relying on optical modulation for hardware compression and the corresponding software reconstruction. While mainstream video SCI using random binary modulation has demonstrated success, it inevitably results in temporal aliasing during compression. One-hot modulation, activating only one sub-frame per pixel, provides a promising solution for achieving perfect temporal decoupling, thereby alleviating issues associated with aliasing. However, no algorithms currently exist to fully exploit this potential. To bridge this gap, we propose an algorithm specifically designed for one-hot masks. First, leveraging the decoupling properties of one-hot modulation, we transform the reconstruction task into a generative video inpainting problem and introduce a stochastic differential equation (SDE) of the forward process that aligns with the hardware compression process. Next, we identify limitations of the pure diffusion method for video SCI and propose a novel framework that combines one-step regression initialization with one-step diffusion refinement. Furthermore, to mitigate the spatial degradation caused by one-hot modulation, we implement a dual optical path at the hardware level, utilizing complementary information from another path to enhance the inpainted video. To our knowledge, this is the first work integrating diffusion into video SCI reconstruction. Experiments conducted on synthetic datasets and real scenes demonstrate the effectiveness of our method.
Related papers
- ReLumix: Extending Image Relighting to Video via Video Diffusion Models [5.890782804843724]
Controlling illumination during video post-production is a crucial yet elusive goal in computational photography.<n>This paper introduces ReLumix, a novel framework that decouples the relighting from temporal synthesis.<n>Although trained on synthetic data, ReLumix shows competitive generalization to real-world videos.
arXiv Detail & Related papers (2025-09-28T09:35:33Z) - FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [92.4205087439928]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability.<n>We propose the Self-supervised Transfer (PST) and the FrequencyDe-coupled Fusion module (FreDF)<n>PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models, effectively mitigating data scarcity.<n>FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches.<n>This combined approach enables FUSE to construct a universal image-event that only requires lightweight decoder adaptation for target datasets.
arXiv Detail & Related papers (2025-03-25T15:04:53Z) - PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution [95.98801201266099]
Diffusion-based image super-resolution (SR) models have shown superior performance at the cost of multiple denoising steps.<n>We propose a novel post-training quantization approach with adaptive scale in one-step diffusion (OSD) image SR, PassionSR.<n>Our PassionSR achieves significant advantages over recent leading low-bit quantization methods for image SR.
arXiv Detail & Related papers (2024-11-26T04:49:42Z) - ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler [53.98558445900626]
Current image-to-video diffusion models, while powerful in generating videos from a single frame, need adaptation for two-frame conditioned generation.<n>We introduce a novel, bidirectional sampling strategy to address these off-manifold issues without requiring extensive re-noising or fine-tuning.<n>Our method employs sequential sampling along both forward and backward paths, conditioned on the start and end frames, respectively, ensuring more coherent and on-manifold generation of intermediate frames.
arXiv Detail & Related papers (2024-10-08T03:01:54Z) - Deep Optics for Video Snapshot Compressive Imaging [10.830072985735175]
Video snapshot imaging (SCI) aims to capture a sequence of video frames with only a single shot of a 2D detector.
This paper presents a framework to jointly optimize masks and a reconstruction network.
We believe this is a milestone for real-world video SCI.
arXiv Detail & Related papers (2024-04-08T08:04:44Z) - Unfolding Framework with Prior of Convolution-Transformer Mixture and
Uncertainty Estimation for Video Snapshot Compressive Imaging [7.601695814245209]
We consider the problem of video snapshot compressive imaging (SCI), where sequential high-speed frames are modulated by different masks and captured by a single measurement.
By combining optimization algorithms and neural networks, deep unfolding networks (DUNs) score tremendous achievements in solving inverse problems.
arXiv Detail & Related papers (2023-06-20T06:25:48Z) - Learned Video Compression via Heterogeneous Deformable Compensation
Network [78.72508633457392]
We propose a learned video compression framework via heterogeneous deformable compensation strategy (HDCVC) to tackle the problems of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-Neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC achieves superior performance than the recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z) - Degradation-Aware Unfolding Half-Shuffle Transformer for Spectral
Compressive Imaging [142.11622043078867]
We propose a principled Degradation-Aware Unfolding Framework (DAUF) that estimates parameters from the compressed image and physical mask, and then uses these parameters to control each iteration.
By plugging HST into DAUF, we establish the first Transformer-based deep unfolding method, Degradation-Aware Unfolding Half-Shuffle Transformer (DAUHST) for HSI reconstruction.
arXiv Detail & Related papers (2022-05-20T11:37:44Z) - Dual-view Snapshot Compressive Imaging via Optical Flow Aided Recurrent
Neural Network [14.796204921975733]
Dual-view snapshot compressive imaging (SCI) aims to capture videos from two field-of-views (FoVs) in a single snapshot.
It is challenging for existing model-based decoding algorithms to reconstruct each individual scene.
We propose an optical flow-aided recurrent neural network for dual video SCI systems, which provides high-quality decoding in seconds.
arXiv Detail & Related papers (2021-09-11T14:24:44Z) - Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
arXiv Detail & Related papers (2020-08-20T20:01:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.