Learning Omnidirectional Flow in 360-degree Video via Siamese
Representation
- URL: http://arxiv.org/abs/2208.03620v1
- Date: Sun, 7 Aug 2022 02:24:30 GMT
- Title: Learning Omnidirectional Flow in 360-degree Video via Siamese
Representation
- Authors: Keshav Bhandari, Bin Duan, Gaowen Liu, Hugo Latapie, Ziliang Zong, Yan
Yan
- Abstract summary: This paper proposes the first perceptually natural-synthetic omnidirectional benchmark dataset with a 360-degree field of view, FLOW360.
We present a novel Siamese representation Learning framework for Omnidirectional Flow (SLOF).
Experiments verify the proposed framework's effectiveness and show up to 40% performance improvement over the state-of-the-art approaches.
- Score: 11.421244426346389
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Optical flow estimation in omnidirectional videos faces two significant
issues: the lack of benchmark datasets and the challenge of adapting
perspective video-based methods to accommodate the omnidirectional nature. This
paper proposes the first perceptually natural-synthetic omnidirectional
benchmark dataset with a 360-degree field of view, FLOW360, with 40 different
videos and 4,000 video frames. We conduct comprehensive characteristic analysis
and comparisons between our dataset and existing optical flow datasets, which
manifest perceptual realism, uniqueness, and diversity. To accommodate the
omnidirectional nature, we present a novel Siamese representation Learning
framework for Omnidirectional Flow (SLOF). We train our network in a
contrastive manner with a hybrid loss function that combines contrastive loss
and optical flow loss. Extensive experiments verify the proposed framework's
effectiveness and show up to 40% performance improvement over the
state-of-the-art approaches. Our FLOW360 dataset and code are available at
https://siamlof.github.io/.
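A minimal sketch of the hybrid objective described in the abstract, assuming an NT-Xent-style contrastive term over embeddings of two augmented views and an end-point-error flow term; the function names and the weighting coefficient `alpha` are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature=0.1):
    """NT-Xent-style loss pulling together embeddings of two augmented
    views (z1, z2) of the same frame pair.  Illustrative; the paper's
    exact contrastive formulation may differ."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature            # (B, B) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

def flow_loss(pred_flow, gt_flow):
    """Average end-point error between predicted and ground-truth flow."""
    return torch.norm(pred_flow - gt_flow, p=2, dim=1).mean()

def hybrid_loss(pred_flow, gt_flow, z1, z2, alpha=0.1):
    """Hybrid objective: supervised flow loss plus a contrastive term.
    `alpha` is a hypothetical weighting coefficient."""
    return flow_loss(pred_flow, gt_flow) + alpha * contrastive_loss(z1, z2)
```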
Related papers
- Optical-Flow Guided Prompt Optimization for Coherent Video Generation [51.430833518070145]
We propose a framework called MotionPrompt that guides the video generation process via optical flow.
We optimize learnable token embeddings during reverse sampling steps by using gradients from a trained discriminator applied to random frame pairs.
This approach allows our method to generate visually coherent video sequences that closely reflect natural motion dynamics, without compromising the fidelity of the generated content.
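A hedged sketch of the optimization step this summary describes: nudging learnable token embeddings with gradients from a motion discriminator applied to a random frame pair. `decode_frames` and `discriminator` are placeholder callables, and plain gradient descent stands in for whatever optimizer the paper actually uses.

```python
import torch

def optimize_prompt_step(prompt_emb, latents, decode_frames, discriminator,
                         lr=1e-3):
    """One hypothetical prompt-optimization step in the spirit of
    MotionPrompt: update learnable token embeddings so that a trained
    discriminator judges a sampled frame pair as naturally moving."""
    prompt_emb = prompt_emb.detach().requires_grad_(True)
    frames = decode_frames(latents, prompt_emb)      # (T, C, H, W) video frames
    t = torch.randint(0, frames.size(0) - 1, (1,)).item()
    pair = torch.stack([frames[t], frames[t + 1]])   # adjacent pair (one reading
                                                     # of "random frame pairs")
    # Discriminator score: higher = more natural motion (assumed convention).
    loss = -discriminator(pair).mean()
    loss.backward()
    with torch.no_grad():
        prompt_emb -= lr * prompt_emb.grad           # simple gradient step
    return prompt_emb.detach()
```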
arXiv Detail & Related papers (2024-11-23T12:26:52Z)
- 360VFI: A Dataset and Benchmark for Omnidirectional Video Frame Interpolation [13.122586587748218]
This paper introduces the benchmark dataset, 360VFI, for Omnidirectional Video Frame Interpolation.
We present a practical implementation that introduces a distortion prior from omnidirectional video into the network to modulate distortions.
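The summary does not specify the form of the distortion prior; one plausible choice for equirectangular frames, sketched below, is the per-row horizontal stretch (1/cos of latitude) used as a feature modulation map. This is an assumption for illustration, not 360VFI's actual prior.

```python
import torch

def erp_distortion_prior(height, width):
    """One plausible distortion prior for equirectangular frames: the
    horizontal stretching of each row, which grows as 1/cos(latitude)
    toward the poles."""
    lat = (torch.arange(height) + 0.5) / height * torch.pi - torch.pi / 2
    stretch = 1.0 / torch.cos(lat).clamp(min=1e-3)   # (H,) per-row stretch
    return stretch.view(1, 1, height, 1).expand(1, 1, height, width)

# Example of modulating a feature map with the prior (illustrative):
feat = torch.randn(1, 64, 128, 256)
prior = erp_distortion_prior(128, 256)
modulated = feat * prior            # broadcasts over batch and channels
```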
arXiv Detail & Related papers (2024-07-19T06:50:24Z)
- BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement [56.97766265018334]
This paper introduces a low-light video dataset, consisting of 40 scenes with various motion scenarios under two distinct low-lighting conditions.
We provide fully registered ground truth data captured in normal light using a programmable motorized dolly and refine it via an image-based approach for pixel-wise frame alignment across different light levels.
Our experimental results demonstrate the significance of fully registered video pairs for low-light video enhancement (LLVE), and a comprehensive evaluation shows that models trained on our dataset outperform those trained on existing datasets.
arXiv Detail & Related papers (2024-07-03T22:41:49Z)
- Skin the sheep not only once: Reusing Various Depth Datasets to Drive the Learning of Optical Flow [25.23550076996421]
We propose to leverage the geometric connection between optical flow estimation and stereo matching.
We turn the monocular depth datasets into stereo ones via virtual disparity.
We also introduce virtual camera motion into stereo data to produce additional flows along the vertical direction.
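A small sketch of the virtual-disparity idea: a monocular depth map yields the horizontal flow that a virtual stereo partner would induce via disparity = focal * baseline / depth. The `focal` and `baseline` values are hypothetical virtual-camera parameters.

```python
import numpy as np

def depth_to_virtual_flow(depth, focal=720.0, baseline=0.2):
    """Convert a monocular depth map into the horizontal optical flow
    induced by a virtual stereo partner."""
    disparity = focal * baseline / np.maximum(depth, 1e-6)
    flow_u = -disparity                    # rightward virtual camera shift
    flow_v = np.zeros_like(flow_u)         # stereo alone gives no vertical flow
    return np.stack([flow_u, flow_v], axis=-1)

# Vertical flow can be produced analogously by translating the virtual
# camera along the y-axis, as the summary above describes.
```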
arXiv Detail & Related papers (2023-10-03T06:56:07Z)
- Spherical Vision Transformer for 360-degree Video Saliency Prediction [17.948179628551376]
We propose a vision-transformer-based model for omnidirectional videos named SalViT360.
We introduce a spherical geometry-aware self-attention mechanism that is capable of effective omnidirectional video understanding.
Our approach is the first to employ tangent images for omnidirectional saliency prediction, and our experimental results on three ODV saliency datasets demonstrate its effectiveness compared to the state-of-the-art.
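For intuition on tangent images, the sketch below computes the spherical coordinates sampled by one gnomonic (tangent-plane) view; sampling the equirectangular frame at these coordinates yields a low-distortion patch, and a set of such patches can serve as tokens for attention. The projection layout and field of view are illustrative, not necessarily SalViT360's.

```python
import numpy as np

def tangent_view_grid(lat0, lon0, fov=np.pi / 4, size=64):
    """Spherical coordinates sampled by one tangent (gnomonic) view
    centred at (lat0, lon0), via the standard inverse gnomonic
    projection."""
    half = np.tan(fov / 2)
    x, y = np.meshgrid(np.linspace(-half, half, size),
                       np.linspace(-half, half, size))
    rho = np.hypot(x, y) + 1e-12
    c = np.arctan(rho)                     # angular distance from centre
    lat = np.arcsin(np.cos(c) * np.sin(lat0) +
                    y * np.sin(c) * np.cos(lat0) / rho)
    lon = lon0 + np.arctan2(x * np.sin(c),
                            rho * np.cos(lat0) * np.cos(c) -
                            y * np.sin(lat0) * np.sin(c))
    return lat, lon   # map these to ERP pixel coords to sample the view
```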
arXiv Detail & Related papers (2023-08-24T18:07:37Z)
- Optical Flow Estimation in 360$^\circ$ Videos: Dataset, Model and Application [9.99133340779672]
We propose the first perceptually realistic 360$^\circ$ field-of-view video benchmark dataset, namely FLOW360.
We present a novel Siamese representation Learning framework for Omnidirectional Flow (SLOF) estimation, which is trained in a contrastive manner.
The learning scheme is further shown to be efficient by extending our siamese learning and omnidirectional optical flow estimation to the egocentric activity recognition task.
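One way to obtain siamese views of a panorama, assumed here for illustration, is rotation: for an equirectangular frame a pure yaw rotation is an exact circular shift along the longitude axis, as sketched below. Rotations about other axes require spherical resampling and are omitted.

```python
import torch

def yaw_rotate_erp(frame, degrees):
    """Horizontally rotate an equirectangular frame: a yaw rotation of
    the sphere is an exact circular shift along the width axis."""
    shift = int(frame.shape[-1] * degrees / 360.0)
    return torch.roll(frame, shifts=shift, dims=-1)

# Two "views" of the same frame for siamese training (illustrative):
frame = torch.randn(3, 256, 512)
view_a, view_b = frame, yaw_rotate_erp(frame, 90)
```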
arXiv Detail & Related papers (2023-01-27T17:50:09Z)
- Imposing Consistency for Optical Flow Estimation [73.53204596544472]
Imposing consistency through proxy tasks has been shown to enhance data-driven learning.
This paper introduces novel and effective consistency strategies for optical flow estimation.
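The summary leaves the strategies unspecified; a generic instance of a consistency proxy, shown below, is transformation consistency under horizontal flipping, where the flow of flipped inputs should equal the flipped flow with its horizontal component negated. `model` is a placeholder flow network, and this is not necessarily one of the paper's strategies.

```python
import torch

def flip_consistency_loss(model, img1, img2):
    """Generic consistency proxy: horizontally flipping the input pair
    should flip the predicted flow and negate its u-component."""
    flow = model(img1, img2)                              # (B, 2, H, W)
    flow_f = model(torch.flip(img1, dims=[-1]),
                   torch.flip(img2, dims=[-1]))
    # Map the flipped prediction back to the original frame of reference:
    # flip width, negate u, keep v.
    sign = torch.tensor([-1.0, 1.0], device=flow.device).view(1, 2, 1, 1)
    flow_back = torch.flip(flow_f, dims=[-1]) * sign
    return (flow - flow_back).abs().mean()
```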
arXiv Detail & Related papers (2022-04-14T22:58:30Z)
- Deep Video Prior for Video Consistency and Propagation [58.250209011891904]
We present a novel and general approach for blind video temporal consistency.
Our method is only trained on a pair of original and processed videos directly instead of a large dataset.
We show that temporal consistency can be achieved by training a convolutional neural network on a video with Deep Video Prior.
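A minimal sketch of the Deep Video Prior recipe as summarized above: overfit an image-to-image CNN on a single (original, processed) video pair, relying on the network's inductive bias to produce temporally consistent output before it fits the flicker. The architecture, loss, and hyperparameters are illustrative.

```python
import torch

def train_deep_video_prior(net, original, processed, steps=500, lr=1e-4):
    """Overfit `net` (any image-to-image CNN) on one video pair,
    one frame at a time; early stopping controls consistency."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for step in range(steps):
        t = torch.randint(0, original.size(0), (1,)).item()
        pred = net(original[t:t + 1])                      # one (1, C, H, W) frame
        loss = (pred - processed[t:t + 1]).abs().mean()    # L1 to processed frame
        opt.zero_grad()
        loss.backward()
        opt.step()
    return net   # apply frame-by-frame to the whole video afterwards
```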
arXiv Detail & Related papers (2022-01-27T16:38:52Z)
- Learning optical flow from still images [53.295332513139925]
We introduce a framework to generate accurate ground-truth optical flow annotations quickly and in large amounts from any readily available single real picture.
We virtually move the camera in the reconstructed environment with known motion vectors and rotation angles.
When trained with our data, state-of-the-art optical flow networks achieve superior generalization to unseen real data.
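A sketch of how exact flow labels can be synthesized from one image: back-project each pixel with its (estimated) depth, move a virtual camera by a known rotation and translation, and re-project. Since the motion is chosen, the resulting flow is exact up to the depth estimate; the pinhole model below is an assumption.

```python
import numpy as np

def flow_from_virtual_motion(depth, K, R, t):
    """Synthesize ground-truth flow from a single image given per-pixel
    depth, intrinsics K, and a known virtual camera motion (R, t)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1)
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)   # back-project to 3D
    proj = K @ (R @ pts + t.reshape(3, 1))                # move and re-project
    uv2 = proj[:2] / proj[2:3]
    return (uv2 - pix[:2]).reshape(2, h, w)               # per-pixel flow
```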
arXiv Detail & Related papers (2021-04-08T17:59:58Z)
- Optical Flow Estimation from a Single Motion-blurred Image [66.2061278123057]
Motion blur in an image can be of practical interest for fundamental computer vision problems.
We propose a novel framework to estimate optical flow from a single motion-blurred image in an end-to-end manner.
arXiv Detail & Related papers (2021-03-04T12:45:18Z)