Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement
- URL: http://arxiv.org/abs/2312.00362v2
- Date: Mon, 15 Apr 2024 11:03:06 GMT
- Title: Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement
- Authors: Ziyu Wang, Yue Xu, Cewu Lu, Yong-Lu Li
- Abstract summary: We provide the first systematic study of video distillation and introduce a taxonomy to categorize temporal compression.
Our investigation reveals that the temporal information is usually not well learned during distillation, and the temporal dimension of synthetic data contributes little.
Our method achieves state-of-the-art on video datasets at different scales, with a notably smaller memory storage budget.
- Score: 56.26688591324508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, dataset distillation has paved the way towards efficient machine learning, especially for image datasets. However, the distillation for videos, characterized by an exclusive temporal dimension, remains an underexplored domain. In this work, we provide the first systematic study of video distillation and introduce a taxonomy to categorize temporal compression. Our investigation reveals that the temporal information is usually not well learned during distillation, and the temporal dimension of synthetic data contributes little. The observations motivate our unified framework of disentangling the dynamic and static information in the videos. It first distills the videos into still images as static memory and then compensates the dynamic and motion information with a learnable dynamic memory block. Our method achieves state-of-the-art on video datasets at different scales, with a notably smaller memory storage budget. Our code is available at https://github.com/yuz1wan/video_distillation.
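To make the disentanglement concrete, below is a minimal, hypothetical PyTorch sketch (not the released implementation; see the GitHub link above for the authors' code). It shows the core idea: distilled still images are stored as static memory, and a small learnable dynamic-memory block adds per-frame motion residuals to synthesize training clips. Class and parameter names such as `StaticDynamicSynthesizer` are our own.

```python
import torch
import torch.nn as nn

class StaticDynamicSynthesizer(nn.Module):
    """Hypothetical sketch: distilled still images (static memory) are
    expanded into clips by a small learnable dynamic-memory block."""

    def __init__(self, num_images, channels=3, height=64, width=64, clip_len=8):
        super().__init__()
        # Static memory: one learnable distilled image per synthetic sample.
        self.static_memory = nn.Parameter(
            torch.randn(num_images, channels, height, width))
        # Dynamic memory: learnable per-frame motion residuals, kept
        # low-capacity so storage stays dominated by the still images.
        self.dynamic_memory = nn.Parameter(
            0.01 * torch.randn(clip_len, channels, height, width))

    def forward(self):
        # Broadcast each still image over time, then add motion residuals:
        # (N, 1, C, H, W) + (1, T, C, H, W) -> (N, T, C, H, W)
        static = self.static_memory.unsqueeze(1)
        dynamic = self.dynamic_memory.unsqueeze(0)
        return static + dynamic  # synthetic clips

# Usage: the synthesized clips would feed a standard distillation
# objective (e.g., gradient matching) against the real video dataset.
clips = StaticDynamicSynthesizer(num_images=10)()
print(clips.shape)  # torch.Size([10, 8, 3, 64, 64])
```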
Related papers
- Memory Storyboard: Leveraging Temporal Segmentation for Streaming Self-Supervised Learning from Egocentric Videos [13.687045169487774]
We investigate streaming self-supervised learning from long-form real-world egocentric video streams.
Inspired by the event segmentation mechanism in human perception and memory, we propose "Memory Storyboard"
To accommodate efficient temporal segmentation, we propose a two-tier memory hierarchy.
arXiv Detail & Related papers (2025-01-21T16:19:38Z)
- Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation [32.11635464720755]
We propose to decouple video-level referring expression understanding into static and motion perception.
We employ contrastive learning to distinguish the motions of visually similar objects.
These contributions yield state-of-the-art performance across five datasets.
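As a rough illustration of the contrastive component, here is a generic InfoNCE sketch over motion embeddings; the pairing strategy, tensor shapes, and temperature are our assumptions, not details from the paper:

```python
import torch
import torch.nn.functional as F

def motion_infonce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE: pull an object's motion feature toward its paired
    view and away from visually similar objects' motions (illustrative).
    anchor, positive: (B, D); negatives: (B, K, D)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos = (anchor * positive).sum(-1, keepdim=True)      # (B, 1)
    neg = torch.einsum('bd,bkd->bk', anchor, negatives)  # (B, K)
    logits = torch.cat([pos, neg], dim=1) / temperature
    # The positive pair sits at index 0 of each row of logits.
    targets = torch.zeros(len(logits), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits, targets)
```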
arXiv Detail & Related papers (2024-04-04T17:58:21Z)
- Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention [29.62044843067169]
Video object segmentation is a fundamental research problem in computer vision.
We propose a new method for self-supervised video object segmentation based on distillation learning of deformable attention.
arXiv Detail & Related papers (2024-01-25T04:39:48Z)
- Just a Glimpse: Rethinking Temporal Information for Video Continual Learning [58.7097258722291]
We propose a novel replay mechanism for effective video continual learning based on individual/single frames.
Under extreme memory constraints, video diversity plays a more significant role than temporal information.
Our method achieves state-of-the-art performance, outperforming the previous state-of-the-art by up to 21.49%.
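A toy sketch of frame-level replay under a fixed memory budget follows; the reservoir-sampling selection policy is our assumption, and the paper's strategy for maximizing video diversity may differ:

```python
import random

class FrameReplayBuffer:
    """Toy sketch: store individual frames rather than whole clips under a
    fixed memory budget. Reservoir sampling (our assumption) keeps a
    uniform sample over the incoming frame stream."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = []  # (frame, label) pairs
        self.seen = 0

    def add(self, frame, label):
        self.seen += 1
        if len(self.frames) < self.capacity:
            self.frames.append((frame, label))
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.frames[j] = (frame, label)

    def sample(self, batch_size):
        return random.sample(self.frames, min(batch_size, len(self.frames)))
```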
arXiv Detail & Related papers (2023-05-28T19:14:25Z)
- Time Is MattEr: Temporal Self-supervision for Video Transformers [72.42240984211283]
We design simple yet effective self-supervised tasks for video models to learn temporal dynamics better.
Our method learns the temporal order of video frames as extra self-supervision and encourages randomly shuffled frames to produce low-confidence outputs.
Under various video action recognition tasks, we demonstrate the effectiveness of our method and its compatibility with state-of-the-art Video Transformers.
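A minimal sketch of the low-confidence objective on shuffled frames; function and variable names are illustrative, not the authors' code:

```python
import torch
import torch.nn.functional as F

def shuffled_frame_uniformity_loss(model, clip):
    """Illustrative sketch: penalize confident predictions on clips whose
    frames have been randomly shuffled. `model` maps (B, T, C, H, W)
    clips to class logits."""
    b, t = clip.shape[:2]
    perm = torch.randperm(t, device=clip.device)
    shuffled = clip[:, perm]  # destroy temporal order
    log_probs = F.log_softmax(model(shuffled), dim=-1)
    # KL to the uniform distribution: minimizing it drives the
    # prediction on temporally broken clips toward maximum entropy.
    num_classes = log_probs.shape[-1]
    uniform = torch.full_like(log_probs, 1.0 / num_classes)
    return F.kl_div(log_probs, uniform, reduction='batchmean')
```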
arXiv Detail & Related papers (2022-07-19T04:44:08Z)
- Video Demoireing with Relation-Based Temporal Consistency [68.20281109859998]
Moiré patterns, appearing as color distortions, severely degrade image and video quality when a screen is filmed with a digital camera.
We study how to remove such undesirable moire patterns in videos, namely video demoireing.
arXiv Detail & Related papers (2022-04-06T17:45:38Z)
- Self-Conditioned Probabilistic Learning of Video Rescaling [70.10092286301997]
We propose a self-conditioned probabilistic framework for video rescaling to learn the paired downscaling and upscaling procedures simultaneously.
We decrease the entropy of the information lost in downscaling by maximizing its probability conditioned on strong spatio-temporal prior information.
We extend the framework to a lossy video compression system, in which a gradient estimator for non-differentiable industrial lossy codecs is proposed.
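The gradient estimator for non-differentiable codecs can be pictured with a generic straight-through pass; this is a stand-in for illustration, not the paper's exact estimator:

```python
import torch

class CodecStraightThrough(torch.autograd.Function):
    """Illustrative straight-through pass around a non-differentiable
    lossy codec: the forward pass runs the real codec, the backward
    pass treats it as the identity."""

    @staticmethod
    def forward(ctx, x, codec):
        with torch.no_grad():
            return codec(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Identity gradient w.r.t. x; no gradient for the codec argument.
        return grad_output, None

# Usage with a hypothetical codec round-trip function:
# y = CodecStraightThrough.apply(x, lambda t: jpeg_roundtrip(t, quality=30))
```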
arXiv Detail & Related papers (2021-07-24T15:57:15Z)
- Efficient data-driven encoding of scene motion using Eccentricity [0.993963191737888]
This paper presents a novel approach of representing dynamic visual scenes with static maps generated from video/image streams.
The maps are 2D matrices calculated pixel-wise, based on the concept of eccentricity data analysis.
The list of potential applications includes video-based activity recognition, intent recognition, object tracking, and video description.
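As an illustration, eccentricity data analytics admits a simple recursive per-pixel formulation over a frame stream; the exact formula and guard terms below are our assumptions, not the paper's code:

```python
import numpy as np

class EccentricityMap:
    """Illustrative per-pixel eccentricity map, assuming the recursive
    mean/variance formulation from typicality-and-eccentricity data
    analytics; variable names are ours."""

    def __init__(self, shape, eps=1e-6):
        self.k = 0
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.zeros(shape, dtype=np.float64)
        self.eps = eps

    def update(self, frame):
        self.k += 1
        frame = frame.astype(np.float64)
        delta = frame - self.mean          # deviation from the old mean
        self.mean += delta / self.k        # recursive mean update
        self.var += (delta * (frame - self.mean) - self.var) / self.k
        if self.k < 2:
            return np.zeros_like(self.mean)
        # Eccentricity: 1/k plus squared deviation from the updated mean,
        # normalized by k times the running variance.
        dev = frame - self.mean
        return 1.0 / self.k + dev ** 2 / (self.k * (self.var + self.eps))
```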
arXiv Detail & Related papers (2021-03-03T23:11:21Z)
- DeepVideoMVS: Multi-View Stereo on Video with Recurrent Spatio-Temporal Fusion [67.64047158294062]
We propose an online multi-view depth prediction approach on posed video streams.
The scene geometry information computed in the previous time steps is propagated to the current time step.
We outperform the existing state-of-the-art multi-view stereo methods on most of the evaluated metrics.
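A rough sketch of propagating geometry features across time steps with a convolutional recurrent cell; this is a ConvGRU-like stand-in, and the actual fusion design in DeepVideoMVS differs in detail:

```python
import torch
import torch.nn as nn

class RecurrentDepthFusion(nn.Module):
    """Illustrative sketch: a convolutional GRU-style cell whose hidden
    state carries scene-geometry features from previous time steps."""

    def __init__(self, channels):
        super().__init__()
        self.gates = nn.Conv2d(2 * channels, 2 * channels, 3, padding=1)
        self.cand = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, feat, hidden):
        # feat, hidden: (B, C, H, W); hidden holds previous-step geometry.
        zr = torch.sigmoid(self.gates(torch.cat([feat, hidden], dim=1)))
        z, r = zr.chunk(2, dim=1)  # update and reset gates
        h_tilde = torch.tanh(self.cand(torch.cat([feat, r * hidden], dim=1)))
        return (1 - z) * hidden + z * h_tilde  # fused state for this step
```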
arXiv Detail & Related papers (2020-12-03T18:54:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.