Related papers: TIDE: Temporally Incremental Disparity Estimation via Pattern Flow in Structured Light System

TIDE: Temporally Incremental Disparity Estimation via Pattern Flow in Structured Light System

URL: http://arxiv.org/abs/2310.08932v1
Date: Fri, 13 Oct 2023 07:55:33 GMT
Title: TIDE: Temporally Incremental Disparity Estimation via Pattern Flow in Structured Light System
Authors: Rukun Qiao, Hiroshi Kawasaki, Hongbin Zha
Abstract summary: TIDE-Net is a learning-based technique for disparity computation in mono-camera structured light systems. We exploit the deformation of projected patterns (named pattern flow) on captured image sequences to model the temporal information. For each incoming frame, our model fuses correlation volumes (from current frame) and disparity (from former frame) warped by pattern flow.
Score: 17.53719804060679
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduced Temporally Incremental Disparity Estimation Network (TIDE-Net), a learning-based technique for disparity computation in mono-camera structured light systems. In our hardware setting, a static pattern is projected onto a dynamic scene and captured by a monocular camera. Different from most former disparity estimation methods that operate in a frame-wise manner, our network acquires disparity maps in a temporally incremental way. Specifically, We exploit the deformation of projected patterns (named pattern flow ) on captured image sequences, to model the temporal information. Notably, this newly proposed pattern flow formulation reflects the disparity changes along the epipolar line, which is a special form of optical flow. Tailored for pattern flow, the TIDE-Net, a recurrent architecture, is proposed and implemented. For each incoming frame, our model fuses correlation volumes (from current frame) and disparity (from former frame) warped by pattern flow. From fused features, the final stage of TIDE-Net estimates the residual disparity rather than the full disparity, as conducted by many previous methods. Interestingly, this design brings clear empirical advantages in terms of efficiency and generalization ability. Using only synthetic data for training, our extensitve evaluation results (w.r.t. both accuracy and efficienty metrics) show superior performance than several SOTA models on unseen real data. The code is available on https://github.com/CodePointer/TIDENet.

Related papers

Taming generative video models for zero-shot optical flow extraction [28.176290134216995]
Self-supervised video models trained only for future frame prediction can be prompted, without fine-tuning, to output flow.<n>Inspired by the Counterfactual World Model (CWM) paradigm, we extend this idea to generative video models.<n> KL-tracing is a novel test-time procedure that injects a localized perturbation into the first frame, rolls out the model one step, and computes the Kullback-Leibler divergence between perturbed and ungenerative predictive distributions.
arXiv Detail & Related papers (2025-07-11T23:59:38Z)
Flow to Learn: Flow Matching on Neural Network Parameters [0.0]
We introduce FLoWN, a flow matching model that learns to generate neural network parameters for different tasks. Experiments verify that FLoWN attains various desiderata for a meta-learning model.
arXiv Detail & Related papers (2025-03-25T05:57:50Z)
STLight: a Fully Convolutional Approach for Efficient Predictive Learning by Spatio-Temporal joint Processing [6.872340834265972]
We propose STLight, a novel method for S-temporal learning that relies solely on channel-wise and depth-wise convolutions as learnable layers. STLight overcomes the limitations of traditional convolutional approaches by rearranging spatial and temporal dimensions together. Our architecture achieves state-of-the-art performance on STL benchmarks across datasets and settings, while significantly improving computational efficiency in terms of parameters and computational FLOPs.
arXiv Detail & Related papers (2024-11-15T13:53:19Z)
TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture. To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer. In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
StreamFlow: Streamlined Multi-Frame Optical Flow Estimation for Video Sequences [31.210626775505407]
Occlusions between consecutive frames have long posed a significant challenge in optical flow estimation. We present a Streamlined In-batch Multi-frame (SIM) pipeline tailored to video input, attaining a similar level of time efficiency to two-frame networks. StreamFlow not only excels in terms of performance on challenging KITTI and Sintel datasets, with particular improvement in occluded areas.
arXiv Detail & Related papers (2023-11-28T07:53:51Z)
Training and Predicting Visual Error for Real-Time Applications [6.687091041822445]
We explore the abilities of convolutional neural networks to predict a variety of visual metrics without requiring either reference or rendered images. Our solution combines image-space information that is readily available in most state-of-the-art deferred shading pipelines with reprojection from previous frames to enable an adequate estimate of visual errors.
arXiv Detail & Related papers (2023-10-13T14:14:00Z)
Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components. CNNs are used to augment the local texture information of coarse priors. DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning [67.07363529640784]
We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models. We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectory, human motion, driving scenes, traffic flow and forecasting weather. We find that recurrent-free models achieve a good balance between efficiency and performance than recurrent models.
arXiv Detail & Related papers (2023-06-20T03:02:14Z)
Rethinking Optical Flow from Geometric Matching Consistent Perspective [38.014569953980754]
We propose a rethinking to previous optical flow estimation. We use GIM as a pre-training task for the optical flow estimation (MatchFlow) with better feature representations. Our method achieves 11.5% and 10.1% error reduction from GMA on Sintel clean pass and KITTI test set.
arXiv Detail & Related papers (2023-03-15T06:00:38Z)
AuxAdapt: Stable and Efficient Test-Time Adaptation for Temporally Consistent Video Semantic Segmentation [81.87943324048756]
In video segmentation, generating temporally consistent results across frames is as important as achieving frame-wise accuracy. Existing methods rely on optical flow regularization or fine-tuning with test data to attain temporal consistency. This paper presents an efficient, intuitive, and unsupervised online adaptation method, AuxAdapt, for improving the temporal consistency of most neural network models.
arXiv Detail & Related papers (2021-10-24T07:07:41Z)
Consistency Guided Scene Flow Estimation [159.24395181068218]
CGSF is a self-supervised framework for the joint reconstruction of 3D scene structure and motion from stereo video. We show that the proposed model can reliably predict disparity and scene flow in challenging imagery. It achieves better generalization than the state-of-the-art, and adapts quickly and robustly to unseen domains.
arXiv Detail & Related papers (2020-06-19T17:28:07Z)
Normalizing Flows with Multi-Scale Autoregressive Priors [131.895570212956]
We introduce channel-wise dependencies in their latent space through multi-scale autoregressive priors (mAR) Our mAR prior for models with split coupling flow layers (mAR-SCF) can better capture dependencies in complex multimodal data. We show that mAR-SCF allows for improved image generation quality, with gains in FID and Inception scores compared to state-of-the-art flow-based models.
arXiv Detail & Related papers (2020-04-08T09:07:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.