Reparo: Loss-Resilient Generative Codec for Video Conferencing
- URL: http://arxiv.org/abs/2305.14135v3
- Date: Fri, 04 Oct 2024 19:24:22 GMT
- Title: Reparo: Loss-Resilient Generative Codec for Video Conferencing
- Authors: Tianhong Li, Vibhaalakshmi Sivaraman, Pantea Karimi, Lijie Fan, Mohammad Alizadeh, Dina Katabi
- Abstract summary: We introduce Reparo, a loss-resilient video conferencing framework based on generative deep learning models.
We show that Reparo outperforms state-of-the-art FEC-based video conferencing solutions in terms of both video quality (measured through PSNR, SSIM, and LPIPS) and the occurrence of video freezes.
- Abstract: Packet loss during video conferencing often results in poor quality and video freezing. Retransmitting lost packets is often impractical due to the need for real-time playback, and using Forward Error Correction (FEC) for packet recovery is challenging due to the unpredictable and bursty nature of Internet losses. Excessive redundancy leads to inefficiency and wasted bandwidth, while insufficient redundancy results in undecodable frames, causing video freezes and quality degradation in subsequent frames. We introduce Reparo -- a loss-resilient video conferencing framework based on generative deep learning models to address these issues. Our approach generates missing information when a frame or part of a frame is lost. This generation is conditioned on the data received thus far, considering the model's understanding of how people and objects appear and interact within the visual realm. Experimental results, using publicly available video conferencing datasets, demonstrate that Reparo outperforms state-of-the-art FEC-based video conferencing solutions in terms of both video quality (measured through PSNR, SSIM, and LPIPS) and the occurrence of video freezes.
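The FEC tradeoff described in the abstract (too much parity wastes bandwidth, too little leaves frames undecodable) can be illustrated with a minimal simulation. This is an illustrative erasure-code model with i.i.d. losses, not Reparo's method: a frame of k data packets plus (n-k) parity packets is assumed decodable iff at least k of the n packets arrive.

```python
import random

def frame_decodable(n, k, loss_rate, rng):
    """A frame of k data packets plus (n - k) FEC parity packets is
    decodable iff at least k of the n packets arrive (MDS-style model)."""
    received = sum(1 for _ in range(n) if rng.random() >= loss_rate)
    return received >= k

def freeze_rate(n, k, loss_rate, frames=20000, seed=0):
    """Fraction of simulated frames that cannot be decoded (i.e., freeze)."""
    rng = random.Random(seed)
    frozen = sum(0 if frame_decodable(n, k, loss_rate, rng) else 1
                 for _ in range(frames))
    return frozen / frames

# Tradeoff: more parity lowers the freeze rate but inflates bandwidth (n/k).
for parity in (0, 2, 6):
    n, k = 10 + parity, 10
    print(f"parity={parity} overhead={n / k:.1f}x "
          f"freeze_rate={freeze_rate(n, k, loss_rate=0.1):.3f}")
```

Under 10% i.i.d. loss, zero parity freezes most frames while heavy parity nearly eliminates freezes at a 1.6x bandwidth cost; real Internet losses are bursty, which makes picking the redundancy level even harder, motivating Reparo's generative recovery instead.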
Related papers
- Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors [54.8852848659663]
Buffer Anytime is a framework for estimation of depth and normal maps (which we call geometric buffers) from video.
We demonstrate high-quality video buffer estimation by leveraging single-image priors with temporal consistency constraints.
arXiv Detail & Related papers (2024-11-26T09:28:32Z)
- FrameCorr: Adaptive, Autoencoder-based Neural Compression for Video Reconstruction in Resource and Timing Constrained Network Settings [0.18906710320196732]
Existing video compression methods struggle to reconstruct frames when the compressed data arrives incomplete.
We introduce FrameCorr, a deep-learning based solution that utilizes previously received data to predict the missing segments of a frame.
arXiv Detail & Related papers (2024-09-04T05:19:57Z)
- VCISR: Blind Single Image Super-Resolution with Video Compression Synthetic Data [18.877077302923713]
We present a video compression-based degradation model to synthesize low-resolution image data in the blind SISR task.
Our proposed image synthesizing method is widely applicable to existing image datasets.
By introducing video coding artifacts to SISR degradation models, neural networks can super-resolve images with the ability to restore video compression degradations.
arXiv Detail & Related papers (2023-11-02T05:24:19Z)
- GRACE: Loss-Resilient Real-Time Video through Neural Codecs [31.006987868475683]
In real-time video communication, retransmitting lost packets over high-latency networks is not viable due to strict latency requirements.
We present a loss-resilient real-time video system called GRACE, which preserves the user's quality of experience (QoE) across a wide range of packet losses.
arXiv Detail & Related papers (2023-05-21T03:50:44Z)
- VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution [75.79379734567604]
We show that Video Implicit Neural Representation (VideoINR) can be decoded to videos of arbitrary spatial resolution and frame rate.
We show that VideoINR achieves competitive performances with state-of-the-art STVSR methods on common up-sampling scales.
arXiv Detail & Related papers (2022-06-09T17:45:49Z)
- STIP: A SpatioTemporal Information-Preserving and Perception-Augmented Model for High-Resolution Video Prediction [78.129039340528]
We propose a SpatioTemporal Information-Preserving and Perception-Augmented Model (STIP) to solve the above two problems.
The proposed model aims to preserve the spatiotemporal information for videos during the feature extraction and the state transitions.
Experimental results show that the proposed STIP can predict videos with more satisfactory visual quality compared with a variety of state-of-the-art methods.
arXiv Detail & Related papers (2022-06-09T09:49:04Z)
- Recurrent Video Restoration Transformer with Guided Deformable Attention [116.1684355529431]
We propose RVRT, which processes local neighboring frames in parallel within a globally recurrent framework.
RVRT achieves state-of-the-art performance on benchmark datasets with balanced model size, testing memory and runtime.
arXiv Detail & Related papers (2022-06-05T10:36:09Z)
- Learning Trajectory-Aware Transformer for Video Super-Resolution [50.49396123016185]
Video super-resolution aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts.
Existing approaches usually align and aggregate video frames from limited adjacent frames.
We propose a novel Trajectory-aware Transformer for Video Super-Resolution (TTVSR).
arXiv Detail & Related papers (2022-04-08T03:37:39Z)
- Network state Estimation using Raw Video Analysis: vQoS-GAN based non-intrusive Deep Learning Approach [5.8010446129208155]
vQoS-GAN can estimate the network state parameters from the degraded received video data.
A robust deep learning model is trained on the video data together with data rate and packet loss class labels.
The proposed semi-supervised generative adversarial network can additionally reconstruct the degraded video data to its original form for a better end-user experience.
arXiv Detail & Related papers (2022-03-22T10:42:19Z)
- VRT: A Video Restoration Transformer [126.79589717404863]
Video restoration (e.g., video super-resolution) aims to restore high-quality frames from low-quality frames.
We propose a Video Restoration Transformer (VRT) with parallel frame prediction and long-range temporal dependency modelling abilities.
arXiv Detail & Related papers (2022-01-28T17:54:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.