FVV Live: A real-time free-viewpoint video system with consumer
electronics hardware
- URL: http://arxiv.org/abs/2007.00558v1
- Date: Wed, 1 Jul 2020 15:40:28 GMT
- Title: FVV Live: A real-time free-viewpoint video system with consumer
electronics hardware
- Authors: Pablo Carballeira, Carlos Carmona, César Díaz, Daniel Berjón,
Daniel Corregidor, Julián Cabrera, Francisco Morán, Carmen Doblado,
Sergio Arnaldo, María del Mar Martín, Narciso García
- Abstract summary: FVV Live is a novel end-to-end free-viewpoint video system, designed for low cost and real-time operation.
The system has been designed to yield high-quality free-viewpoint video using consumer-grade cameras and hardware.
- Score: 1.1403672224109256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: FVV Live is a novel end-to-end free-viewpoint video system, designed for low
cost and real-time operation, based on off-the-shelf components. The system has
been designed to yield high-quality free-viewpoint video using consumer-grade
cameras and hardware, which enables low deployment costs and easy installation
for immersive event-broadcasting or videoconferencing.
The paper describes the architecture of the system, including acquisition and
encoding of multiview plus depth data in several capture servers and virtual
view synthesis on an edge server. All the blocks of the system have been
designed to overcome the limitations imposed by hardware and network, which
impact directly on the accuracy of depth data and thus on the quality of
virtual view synthesis. The design of FVV Live allows for an arbitrary number
of cameras and capture servers, and the results presented in this paper
correspond to an implementation with nine stereo-based depth cameras.
FVV Live presents low motion-to-photon and end-to-end delays, which enables
seamless free-viewpoint navigation and bilateral immersive communications.
Moreover, the visual quality of FVV Live has been assessed through subjective
assessment with satisfactory results, and additional comparative tests show
that it is preferred over state-of-the-art DIBR alternatives.
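The abstract does not specify how FVV Live synthesizes virtual views, so the following is only a minimal sketch of the generic depth-image-based rendering (DIBR) warping step that such systems build on, not the authors' implementation. It back-projects a pixel with known depth into 3D, moves it into a virtual camera frame, and projects it again. The `Intrinsics` type, the `warp_pixel` function, and all numeric values are hypothetical and chosen for illustration; a pinhole camera model without lens distortion is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera intrinsics in pixels (hypothetical helper type)."""
    fx: float
    fy: float
    cx: float
    cy: float


def warp_pixel(u, v, depth, src, dst, rotation, translation):
    """Reproject pixel (u, v) with known depth into a virtual camera.

    rotation (3x3 nested list) and translation (length 3) map source-camera
    coordinates to virtual-camera coordinates: p_dst = R * p_src + t.
    Returns (u', v', depth') or None if the point is behind the virtual camera.
    """
    # Back-project the pixel to a 3D point in the source camera frame.
    x = (u - src.cx) * depth / src.fx
    y = (v - src.cy) * depth / src.fy
    p = (x, y, depth)

    # Rigid transform into the virtual camera frame.
    q = [
        sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if q[2] <= 0:
        return None  # point lies behind the virtual camera

    # Project onto the virtual image plane.
    return (dst.fx * q[0] / q[2] + dst.cx, dst.fy * q[1] / q[2] + dst.cy, q[2])


if __name__ == "__main__":
    cam = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Virtual camera placed 0.1 m to the right of the source camera,
    # so points shift by -0.1 m along x in the virtual frame.
    print(warp_pixel(320.0, 240.0, 2.0, cam, cam, identity, [-0.1, 0.0, 0.0]))
```

For a purely horizontal camera shift, the warp reduces to a disparity of fx * baseline / depth pixels, which is why errors in the depth data translate directly into misplaced pixels in the synthesized view.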
Related papers
- CLIPVQA: Video Quality Assessment via CLIP [56.94085651315878]
We propose an efficient CLIP-based Transformer method for the VQA problem (CLIPVQA).
The proposed CLIPVQA achieves new state-of-the-art VQA performance and up to 37% better generalizability than existing benchmark VQA methods.
arXiv Detail & Related papers (2024-07-06T02:32:28Z)
- BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement [56.97766265018334]
This paper introduces a low-light video dataset, consisting of 40 scenes with various motion scenarios under two distinct low-lighting conditions.
We provide fully registered ground truth data captured in normal light using a programmable motorized dolly and refine it via an image-based approach for pixel-wise frame alignment across different light levels.
Our experimental results demonstrate the significance of fully registered video pairs for low-light video enhancement (LLVE) and the comprehensive evaluation shows that the models trained with our dataset outperform those trained with the existing datasets.
arXiv Detail & Related papers (2024-07-03T22:41:49Z)
- E2HQV: High-Quality Video Generation from Event Camera via
Theory-Inspired Model-Aided Deep Learning [53.63364311738552]
Bio-inspired event cameras or dynamic vision sensors are capable of capturing per-pixel brightness changes (called event-streams) in high temporal resolution and high dynamic range.
It calls for events-to-video (E2V) solutions which take event-streams as input and generate high quality video frames for intuitive visualization.
We propose E2HQV, a novel E2V paradigm designed to produce high-quality video frames from events.
arXiv Detail & Related papers (2024-01-16T05:10:50Z)
- ViFiT: Reconstructing Vision Trajectories from IMU and Wi-Fi Fine Time
Measurements [6.632056181867312]
We propose ViFiT, a transformer-based model that reconstructs vision bounding box trajectories from phone data (IMU and Fine Time Measurements).
ViFiT achieves an MRFR of 0.65 that outperforms the state-of-the-art approach for cross-modal reconstruction in LSTM-Decoder architecture.
arXiv Detail & Related papers (2023-10-04T20:05:40Z)
- Streaming Video Model [90.24390609039335]
We propose to unify video understanding tasks into one streaming video architecture, referred to as Streaming Vision Transformer (S-ViT).
S-ViT first produces frame-level features with a memory-enabled temporally-aware spatial encoder to serve frame-based video tasks.
The efficiency and efficacy of S-ViT are demonstrated by its state-of-the-art accuracy in sequence-based action recognition.
arXiv Detail & Related papers (2023-03-30T08:51:49Z)
- Learning to Select Camera Views: Efficient Multiview Understanding at
Few Glances [59.34619548026885]
We propose a view selection approach that analyzes the target object or scenario from given views and selects the next best view for processing.
Our approach features a reinforcement learning based camera selection module, MVSelect, that not only selects views but also facilitates joint training with the task network.
arXiv Detail & Related papers (2023-03-10T18:59:10Z)
- PL-EVIO: Robust Monocular Event-based Visual Inertial Odometry with
Point and Line Features [3.6355269783970394]
Event cameras are motion-activated sensors that capture pixel-level illumination changes instead of the intensity image with a fixed frame rate.
We propose a robust, highly accurate, and real-time optimization-based monocular event-based visual-inertial odometry (VIO) method.
arXiv Detail & Related papers (2022-09-25T06:14:12Z)
- A Study of Designing Compact Audio-Visual Wake Word Spotting System
Based on Iterative Fine-Tuning in Neural Network Pruning [57.28467469709369]
We investigate the design of a compact audio-visual wake word spotting (WWS) system that utilizes visual information.
We introduce a neural network pruning strategy via the lottery ticket hypothesis in an iterative fine-tuning manner (LTH-IF).
The proposed audio-visual system achieves significant performance improvements over the single-modality (audio-only or video-only) system under different noisy conditions.
arXiv Detail & Related papers (2022-02-17T08:26:25Z)
- A Multi-user Oriented Live Free-viewpoint Video Streaming System Based
On View Interpolation [15.575219833681635]
We introduce a CNN-based view interpolation algorithm to synthesize dense virtual views in real time.
We also build an end-to-end live free-viewpoint system with a multi-user oriented streaming strategy.
arXiv Detail & Related papers (2021-12-20T15:17:57Z)
- A Generic Object Re-identification System for Short Videos [39.662850217144964]
A Temporal Information Fusion Network (TIFN) is proposed in the object detection module.
A Cross-Layer Pointwise Siamese Network (CPSN) is proposed in the tracking module to enhance the robustness of the appearance model.
Two challenge datasets containing real-world short videos are built for video object trajectory extraction and generic object re-identification.
arXiv Detail & Related papers (2021-02-10T05:45:09Z)
- FVV Live: Real-Time, Low-Cost, Free Viewpoint Video [1.2752808844888017]
FVV Live is a novel real-time, low-latency, end-to-end free viewpoint system.
The system has been specially designed for low-cost and real-time operation.
arXiv Detail & Related papers (2020-06-30T15:21:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.