STB-VMM: Swin Transformer Based Video Motion Magnification
- URL: http://arxiv.org/abs/2302.10001v2
- Date: Mon, 27 Mar 2023 20:18:45 GMT
- Title: STB-VMM: Swin Transformer Based Video Motion Magnification
- Authors: Ricard Lado-Roigé, Marco A. Pérez
- Abstract summary: This work presents a new state-of-the-art model based on the Swin Transformer.
It offers better tolerance to noisy inputs, as well as higher-quality outputs that exhibit less noise, less blurriness, and fewer artifacts than the prior art.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The goal of video motion magnification techniques is to magnify small motions
in a video to reveal previously invisible or unseen movement. Their uses extend
from bio-medical applications and deepfake detection to structural modal
analysis and predictive maintenance. However, discerning small motion from
noise is a complex task, especially when attempting to magnify very subtle,
often sub-pixel movement. As a result, motion magnification techniques
generally suffer from noisy and blurry outputs. This work presents a new
state-of-the-art model based on the Swin Transformer, which offers better
tolerance to noisy inputs as well as higher-quality outputs that exhibit less
noise, blurriness, and artifacts than the prior art. Improvements in output image
quality will enable more precise measurements for any application reliant on
magnified video sequences, and may enable further development of video motion
magnification techniques in new technical fields.
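As background for readers new to the topic, below is a minimal sketch of the classical linear Eulerian magnification baseline (Wu et al., 2012) that learned models such as STB-VMM improve upon. This is not the STB-VMM method itself; the function name and default parameters are illustrative only.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def eulerian_magnify(frames, alpha=10.0, f_lo=0.4, f_hi=3.0, fps=30.0):
    """Amplify subtle temporal intensity variation in a grayscale video.

    frames: float array of shape (T, H, W) with values in [0, 1].
    alpha:  magnification factor.
    f_lo, f_hi: temporal passband (Hz) selecting the motion of interest.
    """
    # Band-pass each pixel's intensity time series; for small, sub-pixel
    # motion, temporal intensity change is proportional to displacement.
    b, a = butter(2, [f_lo, f_hi], btype="band", fs=fps)
    bandpassed = filtfilt(b, a, frames, axis=0)
    # Adding the amplified band back magnifies the motion to first order.
    return np.clip(frames + alpha * bandpassed, 0.0, 1.0)
```

The band-pass step is also why such methods are noise-sensitive: any sensor noise inside the passband is amplified by the same factor alpha, which is exactly the failure mode the abstract above targets.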
Related papers
- Revisiting Learning-based Video Motion Magnification for Real-time Processing [23.148430647367224]
Video motion magnification is a technique to capture and amplify subtle motion in a video that is invisible to the naked eye.
We introduce a real-time deep learning-based motion magnification model with 4.2X fewer FLOPs that is 2.7X faster than the prior art.
arXiv Detail & Related papers (2024-03-04T09:57:08Z)
- Event-Based Motion Magnification [28.057537257958963]
We propose a dual-camera system consisting of an event camera and a conventional RGB camera for video motion magnification.
This innovative combination enables a broad and cost-effective amplification of high-frequency motions.
We demonstrate the effectiveness and accuracy of our dual-camera system and network, offering a cost-effective and flexible solution for motion detection and magnification.
arXiv Detail & Related papers (2024-02-19T08:59:58Z)
- Learning-based Axial Video Motion Magnification [15.491931417718837]
We propose a new concept, axial motion magnification, which magnifies motions along the user-specified direction.
Our proposed method improves the legibility of the resulting motions along certain axes by adding a new feature: user controllability (a minimal sketch of the axis-projection idea follows this entry).
arXiv Detail & Related papers (2023-12-15T06:04:42Z)
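As noted in the entry above, here is a rough illustration of the axial idea only: amplify the component of a dense displacement field along a user-chosen axis, leaving the orthogonal component untouched. The paper's actual method is learning-based; this sketch assumes the displacement field is already available (e.g. from optical flow), and all names are hypothetical.

```python
import numpy as np

def magnify_along_axis(displacement, axis, alpha=10.0):
    """displacement: (H, W, 2) per-pixel (dx, dy) motion field.
    axis: 2-vector giving the user-specified direction to magnify."""
    u = np.asarray(axis, dtype=float)
    u /= np.linalg.norm(u)                     # unit direction vector
    along = (displacement @ u)[..., None] * u  # component along the axis
    across = displacement - along              # orthogonal component, kept
    return across + (1.0 + alpha) * along      # amplify only the axial part
```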
- VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models [58.93124686141781]
Video Motion Customization (VMC) is a novel one-shot tuning approach crafted to adapt temporal attention layers within video diffusion models.
Our approach introduces a novel motion distillation objective that uses residual vectors between consecutive frames as a motion reference (a loose sketch of this residual idea follows this entry).
We validate our method against state-of-the-art video generative models across diverse real-world motions and contexts.
arXiv Detail & Related papers (2023-12-01T06:50:11Z)
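As referenced in the entry above, a loose sketch of the residual idea: motion between consecutive frames is summarized as residual vectors of per-frame features, which a distillation-style loss can then match. This is only a plausible reading of the abstract, not VMC's actual objective; all tensor names are hypothetical.

```python
import torch
import torch.nn.functional as F

def motion_residuals(latents):
    """latents: (T, C, H, W) per-frame latent features.
    Returns frame-to-frame residual vectors as a motion summary."""
    return latents[1:] - latents[:-1]

def distillation_loss(student_latents, reference_latents):
    # Encourage the student's frame-to-frame changes to point in the
    # same direction as the reference motion (cosine similarity).
    s = motion_residuals(student_latents).flatten(1)
    r = motion_residuals(reference_latents).flatten(1)
    return (1.0 - F.cosine_similarity(s, r, dim=1)).mean()
```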
- Cinematic Behavior Transfer via NeRF-based Differentiable Filming [63.1622492808519]
Existing SLAM methods face limitations in dynamic scenes, and human pose estimation often focuses on 2D projections.
We first introduce a reverse filming behavior estimation technique.
We then introduce a cinematic transfer pipeline that is able to transfer various shot types to a new 2D video or a 3D virtual environment.
arXiv Detail & Related papers (2023-11-29T15:56:58Z)
- 3D Motion Magnification: Visualizing Subtle Motions with Time Varying Radiance Fields [58.6780687018956]
We present a 3D motion magnification method that can magnify subtle motions from scenes captured by a moving camera.
We represent the scene with time-varying radiance fields and leverage the Eulerian principle for motion magnification (the first-order form of this principle is sketched after this entry).
We evaluate the effectiveness of our method on both synthetic and real-world scenes captured under various camera setups.
arXiv Detail & Related papers (2023-08-07T17:59:59Z)
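For context on the Eulerian principle cited in the entry above, the classical first-order relation (Wu et al., 2012): an image intensity signal f displaced by a small motion delta(t) can be magnified by amplifying its temporal variation.

```latex
% Observed frame as a displaced signal, with first-order Taylor expansion:
\[
I(x, t) = f\big(x + \delta(t)\big) \approx f(x) + \delta(t)\,\frac{\partial f(x)}{\partial x}
\]
% Amplifying the temporal variation by \alpha magnifies the displacement:
\[
\tilde{I}(x, t) = I(x, t) + \alpha \big( I(x, t) - f(x) \big)
\approx f\big(x + (1 + \alpha)\,\delta(t)\big)
\]
```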
- LaMD: Latent Motion Diffusion for Video Generation [69.4111397077229]
The latent motion diffusion (LaMD) framework consists of a motion-decomposed video autoencoder and a diffusion-based motion generator.
Results show that LaMD generates high-quality videos with a wide range of motions, from dynamics to highly controllable movements.
arXiv Detail & Related papers (2023-04-23T10:32:32Z)
- Learning Variational Motion Prior for Video-based Motion Capture [31.79649766268877]
We present a novel variational motion prior (VMP) learning approach for video-based motion capture.
Our framework can effectively reduce temporal jittering and failure modes in frame-wise pose estimation.
Experiments over both public datasets and in-the-wild videos have demonstrated the efficacy and generalization capability of our framework.
arXiv Detail & Related papers (2022-10-27T02:45:48Z)
- Motion-blurred Video Interpolation and Extrapolation [72.3254384191509]
We present a novel framework for deblurring, interpolating and extrapolating sharp frames from a motion-blurred video in an end-to-end manner.
To ensure temporal coherence across predicted frames and address potential temporal ambiguity, we propose a simple, yet effective flow-based rule.
arXiv Detail & Related papers (2021-03-04T12:18:25Z)
- Enhanced Quadratic Video Interpolation [56.54662568085176]
We propose an enhanced quadratic video interpolation (EQVI) model to handle more complicated scenes and motion patterns (the underlying quadratic motion model is shown after this entry).
To further boost the performance, we devise a novel multi-scale fusion network (MS-Fusion) which can be regarded as a learnable augmentation process.
The proposed EQVI model won first place in the AIM 2020 Video Temporal Super-Resolution Challenge.
arXiv Detail & Related papers (2020-09-10T02:31:50Z)
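For context on the entry above: EQVI builds on the quadratic motion model of QVI (Xu et al., 2019), which fits a constant-acceleration trajectory through three frames. Given optical flows from the middle frame to its two neighbors, the flow to an intermediate time t in (0, 1) is predicted as:

```latex
% Constant-acceleration (quadratic) flow prediction:
\[
f_{0 \to t} \;=\; \frac{f_{0 \to 1} + f_{0 \to -1}}{2}\, t^{2}
           \;+\; \frac{f_{0 \to 1} - f_{0 \to -1}}{2}\, t
\]
```

The first term carries the acceleration estimate and the second the velocity estimate; a purely linear model would keep only the second term.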
- Prior-enlightened and Motion-robust Video Deblurring [29.158836861982742]
We present the PRiOr-enlightened and MOTION-robust deblurring model (PROMOTION), suited to challenging blurs.
We use 3D group convolution to efficiently encode heterogeneous prior information (see the sketch after this entry).
We also design priors representing the blur distribution to better handle non-uniform blur in the spatio-temporal domain.
arXiv Detail & Related papers (2020-03-25T04:16:56Z)
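As mentioned in the entry above, PROMOTION encodes heterogeneous priors with 3D group convolution. The sketch below shows grouped 3D convolution in general, where groups keep separate prior streams from mixing within one layer; the channel counts and shapes are illustrative, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

# Three heterogeneous prior streams of 8 channels each, stacked along
# the channel axis of a 5D video tensor (N, C, T, H, W).
priors = torch.randn(1, 24, 16, 64, 64)

# groups=3 gives each prior stream its own filters inside a single,
# efficient convolution layer instead of three separate ones.
group_conv = nn.Conv3d(in_channels=24, out_channels=24,
                       kernel_size=3, padding=1, groups=3)
encoded = group_conv(priors)  # shape: (1, 24, 16, 64, 64)
```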
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.