Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer
- URL: http://arxiv.org/abs/2007.05146v2
- Date: Wed, 22 Jul 2020 07:24:29 GMT
- Title: Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer
- Authors: Xinghao Chen, Yiman Zhang, Yunhe Wang, Han Shu, Chunjing Xu, Chang Xu
- Abstract summary: This paper proposes to learn a lightweight video style transfer network via a knowledge distillation paradigm.
We adopt two teacher networks, one of which takes optical flow as input during inference while the other does not.
The output difference between these two teacher networks highlights the improvements made by optical flow, which is then used to distill the target student network.
- Score: 67.36785832888614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video style transfer techniques inspire many exciting applications on mobile
devices. However, their efficiency and stability are still far from
satisfactory. To boost the transfer stability across frames, optical flow is
widely adopted, despite its high computational complexity, e.g. occupying over
97% of the inference time. This paper proposes to learn a lightweight video style
transfer network via a knowledge distillation paradigm. We adopt two teacher
networks, one of which takes optical flow as input during inference while the other does
not. The output difference between these two teacher networks highlights the
improvements made by optical flow, which is then used to distill the target
student network. Furthermore, a low-rank distillation loss is employed to
stabilize the output of the student network by mimicking the rank of the input videos.
Extensive experiments demonstrate that our student network without an optical
flow module is still able to generate stable video and runs much faster than
the teacher network.
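For a concrete picture of the idea in the abstract, below is a minimal PyTorch-style sketch of how the two-teacher distillation signal and a low-rank regularizer could be combined. The function names, the per-pixel weighting of the teacher gap, the nuclear-norm surrogate for rank, and the loss weight are assumptions for illustration, not the authors' exact formulation.

```python
# Minimal sketch under stated assumptions; not the paper's implementation.
import torch


def flow_distillation_loss(student_out, teacher_flow_out, teacher_noflow_out):
    """Distill where optical flow helps: the gap between the flow-assisted and
    flow-free teachers up-weights the regions where the student should match
    the flow-assisted teacher (this weighting scheme is an assumption)."""
    flow_gain = (teacher_flow_out - teacher_noflow_out).abs()
    weight = 1.0 + flow_gain / (flow_gain.mean() + 1e-6)
    return (weight * (student_out - teacher_flow_out) ** 2).mean()


def low_rank_loss(student_frames, input_frames):
    """Encourage the stylized clip to mimic the low-rank structure of the input
    clip, here via a nuclear-norm surrogate (an assumed differentiable proxy
    for rank) on frames flattened to a (T, C*H*W) matrix."""
    s_mat = student_frames.flatten(start_dim=1)
    x_mat = input_frames.flatten(start_dim=1)
    s_nuc = torch.linalg.matrix_norm(s_mat, ord="nuc")
    x_nuc = torch.linalg.matrix_norm(x_mat, ord="nuc")
    return (s_nuc - x_nuc).abs() / x_mat.shape[1]


# Hypothetical training step (teachers frozen, student trainable):
#   with torch.no_grad():
#       t_flow = teacher_with_flow(frames, flows)   # teacher that uses optical flow
#       t_nof = teacher_without_flow(frames)        # teacher without optical flow
#   s_out = student(frames)
#   loss = flow_distillation_loss(s_out, t_flow, t_nof) + 0.1 * low_rank_loss(s_out, frames)
```

The 0.1 weight on the low-rank term and the normalization of the teacher gap are placeholders; in practice such hyperparameters would need to be tuned per dataset.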
Related papers
- Optical-Flow Guided Prompt Optimization for Coherent Video Generation [51.430833518070145]
We propose a framework called MotionPrompt that guides the video generation process via optical flow.
We optimize learnable token embeddings during reverse sampling steps by using gradients from a trained discriminator applied to random frame pairs.
This approach allows our method to generate visually coherent video sequences that closely reflect natural motion dynamics, without compromising the fidelity of the generated content.
arXiv Detail & Related papers (2024-11-23T12:26:52Z)
- Binarized Low-light Raw Video Enhancement [49.65466843856074]
Deep neural networks have achieved excellent performance on low-light raw video enhancement.
In this paper, we explore the feasibility of applying the extremely compact binary neural network (BNN) to low-light raw video enhancement.
arXiv Detail & Related papers (2024-03-29T02:55:07Z)
- Self-Supervised Motion Magnification by Backpropagating Through Optical Flow [16.80592879244362]
This paper presents a self-supervised method for magnifying subtle motions in video.
We manipulate the video such that its new optical flow is scaled by the desired amount.
We propose a loss function that estimates the optical flow of the generated video and penalizes how far it deviates from the given magnification factor.
arXiv Detail & Related papers (2023-11-28T18:59:51Z)
- Breaking of brightness consistency in optical flow with a lightweight CNN network [7.601414191389451]
In this work, a lightweight network is used to extract robust convolutional features and corners with strong invariance.
Replacing the typical brightness-consistency assumption of the optical flow method with convolutional-feature consistency yields a light-robust hybrid optical flow method.
A more accurate visual-inertial system is constructed by replacing the optical flow method in VINS-Mono.
arXiv Detail & Related papers (2023-10-24T09:10:43Z)
- Offline and Online Optical Flow Enhancement for Deep Video Compression [14.445058335559994]
Motion information is represented as optical flows in most of the existing deep video compression networks.
We conduct experiments on a state-of-the-art deep video compression scheme, DCVC.
arXiv Detail & Related papers (2023-07-11T07:52:06Z)
- GlobalFlowNet: Video Stabilization using Deep Distilled Global Motion Estimates [0.0]
Videos shot by laymen using hand-held cameras contain undesirable shaky motion.
Estimating the global motion between successive frames is central to many video stabilization techniques.
We introduce a more general representation scheme, which adapts any existing optical flow network to ignore the moving objects.
arXiv Detail & Related papers (2022-10-25T05:09:18Z)
- Delta Distillation for Efficient Video Processing [68.81730245303591]
We propose a novel knowledge distillation scheme coined Delta Distillation.
We demonstrate that these temporal variations can be effectively distilled due to the temporal redundancies within video frames.
As a by-product, delta distillation improves the temporal consistency of the teacher model.
arXiv Detail & Related papers (2022-03-17T20:13:30Z)
- Learning optical flow from still images [53.295332513139925]
We introduce a framework to generate accurate ground-truth optical flow annotations quickly and in large amounts from any readily available single real picture.
We virtually move the camera in the reconstructed environment with known motion vectors and rotation angles.
When trained with our data, state-of-the-art optical flow networks achieve superior generalization to unseen real data.
arXiv Detail & Related papers (2021-04-08T17:59:58Z)
- Distilled Semantics for Comprehensive Scene Understanding from Videos [53.49501208503774]
In this paper, we take an additional step toward holistic scene understanding with monocular cameras by learning depth and motion alongside semantics.
We address the three tasks jointly by a novel training protocol based on knowledge distillation and self-supervision.
We show that it yields state-of-the-art results for monocular depth estimation, optical flow and motion segmentation.
arXiv Detail & Related papers (2020-03-31T08:52:13Z)
- Streaming Networks: Increase Noise Robustness and Filter Diversity via Hard-wired and Input-induced Sparsity [0.2538209532048866]
Recent studies show that a CNN's recognition accuracy drops drastically if images are noise-corrupted.
We introduce a novel network architecture called Streaming Networks.
Results indicate that only the presence of both hard-wired and input-induced sparsity enables robust recognition of noisy images.
arXiv Detail & Related papers (2020-03-30T16:58:23Z)