TIDE: Temporally Incremental Disparity Estimation via Pattern Flow in
Structured Light System
- URL: http://arxiv.org/abs/2310.08932v1
- Date: Fri, 13 Oct 2023 07:55:33 GMT
- Authors: Rukun Qiao, Hiroshi Kawasaki, Hongbin Zha
- Abstract summary: TIDE-Net is a learning-based technique for disparity computation in mono-camera structured light systems.
We exploit the deformation of projected patterns (named pattern flow) on captured image sequences to model the temporal information.
For each incoming frame, our model fuses correlation volumes (from current frame) and disparity (from former frame) warped by pattern flow.
- Score: 17.53719804060679
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce the Temporally Incremental Disparity Estimation Network (TIDE-Net),
a learning-based technique for disparity computation in mono-camera structured
light systems. In our hardware setting, a static pattern is projected onto a
dynamic scene and captured by a monocular camera. Unlike most previous
disparity estimation methods, which operate frame by frame, our network
acquires disparity maps in a temporally incremental way. Specifically, we
exploit the deformation of projected patterns (named pattern flow) on captured
image sequences to model the temporal information. Notably, the newly
proposed pattern flow formulation, a special form of optical flow, reflects
the disparity changes along the epipolar line. Tailored for pattern
flow, TIDE-Net, a recurrent architecture, is proposed and implemented. For
each incoming frame, our model fuses correlation volumes (from the current frame)
and disparity (from the previous frame) warped by pattern flow. From the fused features,
the final stage of TIDE-Net estimates the residual disparity rather than the
full disparity estimated by many previous methods. Interestingly, this
design brings clear empirical advantages in terms of efficiency and
generalization ability. Using only synthetic data for training, our extensive
evaluation results (w.r.t. both accuracy and efficiency metrics) show superior
performance over several SOTA models on unseen real data. The code is available
at https://github.com/CodePointer/TIDENet.
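The incremental update described in the abstract can be illustrated with a minimal sketch on a single rectified scanline. This is not the TIDE-Net implementation: the function names `warp_disparity_row` and `incremental_update`, the nearest-pixel forward warp, and the sign convention (a pattern point that shifts by `f` pixels along the epipolar line changes its disparity by `f`) are all assumptions made for illustration, and the residual, which the network would predict, is supplied by hand here.

```python
# Illustrative sketch only; names and conventions are assumptions, not the
# authors' code. Assumes a rectified setup where epipolar lines are image rows.

def warp_disparity_row(prev_disp, pattern_flow):
    """Forward-warp one scanline of disparity using a 1-D pattern flow.

    Assumed convention: the pattern is static, so a pattern point that moves
    by f pixels along the row between frames changes its disparity by f.
    Pixels that receive no value are left as None (holes).
    """
    width = len(prev_disp)
    warped = [None] * width
    for x, (d, f) in enumerate(zip(prev_disp, pattern_flow)):
        x_new = int(round(x + f))      # nearest-pixel target position
        if 0 <= x_new < width:
            warped[x_new] = d + f      # on collisions, the later pixel wins
    return warped


def incremental_update(warped, residual, fallback=0.0):
    """Add a residual disparity to the warped estimate; holes use `fallback`."""
    return [(w if w is not None else fallback) + r
            for w, r in zip(warped, residual)]


prev_disp = [10.0, 10.0, 10.0, 10.0]   # disparity from the previous frame
flow = [1.0, 1.0, 1.0, 1.0]            # pattern shifted one pixel to the right
warped = warp_disparity_row(prev_disp, flow)
current = incremental_update(warped, residual=[0.5, -0.25, 0.0, 0.0])
print(warped)   # [None, 11.0, 11.0, 11.0]
print(current)  # [0.5, 10.75, 11.0, 11.0]
```

The sketch only shows why estimating a residual can be cheaper than estimating full disparity: wherever the warp is valid, the residual only has to correct a small error. How TIDE-Net fills the holes and fuses the correlation volumes is not specified in the abstract.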
Related papers
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- StreamFlow: Streamlined Multi-Frame Optical Flow Estimation for Video Sequences [31.210626775505407]
Occlusions between consecutive frames have long posed a significant challenge in optical flow estimation.
We present a Streamlined In-batch Multi-frame (SIM) pipeline tailored to video input, attaining a similar level of time efficiency to two-frame networks.
StreamFlow excels in terms of performance on the challenging KITTI and Sintel datasets, with particular improvement in occluded areas.
arXiv Detail & Related papers (2023-11-28T07:53:51Z)
- Training and Predicting Visual Error for Real-Time Applications [6.687091041822445]
We explore the abilities of convolutional neural networks to predict a variety of visual metrics without requiring either reference or rendered images.
Our solution combines image-space information that is readily available in most state-of-the-art deferred shading pipelines with reprojection from previous frames to enable an adequate estimate of visual errors.
arXiv Detail & Related papers (2023-10-13T14:14:00Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning [67.07363529640784]
We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models.
We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectory, human motion, driving scenes, traffic flow, and weather forecasting.
We find that recurrent-free models achieve a better balance between efficiency and performance than recurrent models.
arXiv Detail & Related papers (2023-06-20T03:02:14Z)
- Rethinking Optical Flow from Geometric Matching Consistent Perspective [38.014569953980754]
We rethink previous approaches to optical flow estimation.
We use geometric image matching (GIM) as a pre-training task for optical flow estimation (MatchFlow) to obtain better feature representations.
Our method achieves 11.5% and 10.1% error reduction from GMA on the Sintel clean pass and the KITTI test set.
arXiv Detail & Related papers (2023-03-15T06:00:38Z)
- AuxAdapt: Stable and Efficient Test-Time Adaptation for Temporally Consistent Video Semantic Segmentation [81.87943324048756]
In video segmentation, generating temporally consistent results across frames is as important as achieving frame-wise accuracy.
Existing methods rely on optical flow regularization or fine-tuning with test data to attain temporal consistency.
This paper presents an efficient, intuitive, and unsupervised online adaptation method, AuxAdapt, for improving the temporal consistency of most neural network models.
arXiv Detail & Related papers (2021-10-24T07:07:41Z)
- PSEUDo: Interactive Pattern Search in Multivariate Time Series with Locality-Sensitive Hashing and Relevance Feedback [3.347485580830609]
PSEUDo is an adaptive feature learning technique for exploring visual patterns in multi-track sequential data.
Our algorithm features sub-linear training and inference time.
We demonstrate the superiority of PSEUDo in terms of efficiency, accuracy, and steerability.
arXiv Detail & Related papers (2021-04-30T13:00:44Z)
- Consistency Guided Scene Flow Estimation [159.24395181068218]
CGSF is a self-supervised framework for the joint reconstruction of 3D scene structure and motion from stereo video.
We show that the proposed model can reliably predict disparity and scene flow in challenging imagery.
It achieves better generalization than the state-of-the-art, and adapts quickly and robustly to unseen domains.
arXiv Detail & Related papers (2020-06-19T17:28:07Z)
- Normalizing Flows with Multi-Scale Autoregressive Priors [131.895570212956]
We introduce channel-wise dependencies in the latent space of flow-based models through multi-scale autoregressive priors (mAR).
Our mAR prior for models with split coupling flow layers (mAR-SCF) can better capture dependencies in complex multimodal data.
We show that mAR-SCF allows for improved image generation quality, with gains in FID and Inception scores compared to state-of-the-art flow-based models.
arXiv Detail & Related papers (2020-04-08T09:07:11Z)