EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer
- URL: http://arxiv.org/abs/2312.04152v1
- Date: Thu, 7 Dec 2023 09:10:16 GMT
- Title: EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer
- Authors: Fei Wang, Dan Guo, Kun Li, Meng Wang
- Abstract summary: Video Motion Magnification (VMM) aims to break the resolution limit of human visual perception capability.
This paper proposes a novel dynamic filtering strategy to achieve static-dynamic field adaptive denoising.
Extensive experiments demonstrate that EulerMormer achieves more robust video motion magnification from the Eulerian perspective.
- Score: 30.470336098766765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video Motion Magnification (VMM) aims to break the resolution limit of human
visual perception capability and reveal the imperceptible minor motion that
contains valuable information in the macroscopic domain. However, challenges
arise in this task due to photon noise inevitably introduced by photographic
devices and spatial inconsistency in amplification, leading to flickering
artifacts in static fields and motion blur and distortion in dynamic fields in
the video. Existing methods focus on explicit motion modeling without
emphasizing prioritized denoising during the motion magnification process. This
paper proposes a novel dynamic filtering strategy to achieve static-dynamic
field adaptive denoising. Specifically, based on Eulerian theory, we separate
texture and shape to extract motion representation through inter-frame shape
differences, expecting to leverage these subdivided features to solve this task
finely. Then, we introduce a novel dynamic filter that eliminates noise cues
and preserves critical features in the motion magnification and amplification
generation phases. Overall, our unified framework, EulerMormer, is a pioneering
effort and the first to equip learning-based VMM with a Transformer. The core of the
dynamic filter lies in a global dynamic sparse cross-covariance attention
mechanism that explicitly removes noise while preserving vital information,
coupled with a multi-scale dual-path gating mechanism that selectively
regulates the dependence on different frequency features to reduce spatial
attenuation and complement motion boundaries. Extensive experiments
demonstrate that EulerMormer achieves more robust video motion magnification
from the Eulerian perspective, significantly outperforming state-of-the-art
methods. The source code is available at
https://github.com/VUT-HFUT/EulerMormer.
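The core Eulerian idea the abstract describes (amplify inter-frame intensity/shape differences, with denoising applied before amplification) can be illustrated with a minimal sketch. This is not the paper's method: the learned dynamic filter is replaced here by a crude box blur, the `alpha` magnification factor and helper names are illustrative assumptions, and the linear amplification follows the classic Eulerian formulation rather than EulerMormer's Transformer pipeline.

```python
import numpy as np

def box_blur(img, k=3):
    # Crude spatial smoothing, standing in for the paper's learned
    # dynamic filter (denoise the motion signal before amplifying it).
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def eulerian_magnify(frames, alpha=10.0):
    # Linear Eulerian magnification: amplify each frame's per-pixel
    # change relative to the first (reference) frame.
    ref = frames[0].astype(float)
    out = [ref.copy()]
    for f in frames[1:]:
        delta = f.astype(float) - ref   # inter-frame "shape" difference
        delta = box_blur(delta)         # denoise before amplification
        out.append(np.clip(ref + (1.0 + alpha) * delta, 0.0, 1.0))
    return out
```

Without the denoising step, photon noise in `delta` would be amplified by the same factor `alpha`, producing exactly the flickering artifacts the abstract describes; filtering first is what makes the magnification usable.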
Related papers
- Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Latent Generation [36.098738197088124]
This work presents a Diffusion Reuse MOtion network to accelerate latent video generation.
Coarse-grained noises in earlier denoising steps exhibit high motion consistency across consecutive video frames.
Dr. Mo propagates these coarse-grained noises to the next frame by incorporating carefully designed, lightweight inter-frame motions.
arXiv Detail & Related papers (2024-09-19T07:50:34Z)
- Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring [71.60457491155451]
Eliminating image blur produced by various kinds of motion has been a challenging problem.
We propose a novel real-world deblurring filtering model called the Motion-adaptive Separable Collaborative Filter.
Our method provides an effective solution for real-world motion blur removal and achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-04-19T19:44:24Z)
- Spectral Motion Alignment for Video Motion Transfer using Diffusion Models [54.32923808964701]
Spectral Motion Alignment (SMA) is a framework that refines and aligns motion vectors using Fourier and wavelet transforms.
SMA learns motion patterns by incorporating frequency-domain regularization, facilitating the learning of whole-frame global motion dynamics.
Extensive experiments demonstrate SMA's efficacy in improving motion transfer while maintaining computational efficiency and compatibility across various video customization frameworks.
arXiv Detail & Related papers (2024-03-22T14:47:18Z)
- SMURF: Continuous Dynamics for Motion-Deblurring Radiance Fields [14.681688453270523]
We propose sequential motion understanding radiance fields (SMURF), a novel approach that employs neural ordinary differential equation (Neural-ODE) to model continuous camera motion.
Our model, rigorously evaluated against benchmark datasets, demonstrates state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2024-03-12T11:32:57Z)
- Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture [42.51987004849891]
Video Motion Magnification aims to reveal subtle and imperceptible motion information of objects in the macroscopic world.
We present FD4MM, a new paradigm of Frequency Decoupling for Motion Magnification with a Multi-level Isomorphic Architecture.
We show that FD4MM reduces FLOPs by 1.63$\times$ and boosts inference speed by 1.68$\times$ compared to the latest method.
arXiv Detail & Related papers (2024-03-12T06:07:29Z)
- EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision [85.17951804790515]
EmerNeRF is a simple yet powerful approach for learning spatial-temporal representations of dynamic driving scenes.
It simultaneously captures scene geometry, appearance, motion, and semantics via self-bootstrapping.
Our method achieves state-of-the-art performance in sensor simulation.
arXiv Detail & Related papers (2023-11-03T17:59:55Z)
- Alignment-free HDR Deghosting with Semantics Consistent Transformer [76.91669741684173]
High dynamic range imaging aims to retrieve information from multiple low-dynamic range inputs to generate realistic output.
Existing methods often focus on the spatial misalignment across input frames caused by the foreground and/or camera motion.
We propose a novel alignment-free network with a Semantics Consistent Transformer (SCTNet) with both spatial and channel attention modules.
arXiv Detail & Related papers (2023-05-29T15:03:23Z)
- NR-DFERNet: Noise-Robust Network for Dynamic Facial Expression Recognition [1.8604727699812171]
We propose a noise-robust dynamic facial expression recognition network (NR-DFERNet) to reduce the interference of noisy frames on the DFER task.
Specifically, at the spatial stage, we devise a dynamic-static fusion module (DSF) that introduces dynamic features to static features for learning more discriminative spatial features.
To suppress the impact of target irrelevant frames, we introduce a novel dynamic class token (DCT) for the transformer at the temporal stage.
arXiv Detail & Related papers (2022-06-10T10:17:30Z)
- MoCo-Flow: Neural Motion Consensus Flow for Dynamic Humans in Stationary Monocular Cameras [98.40768911788854]
We introduce MoCo-Flow, a representation that models the dynamic scene using a 4D continuous time-variant function.
At the heart of our work lies a novel optimization formulation, which is constrained by a motion consensus regularization on the motion flow.
We extensively evaluate MoCo-Flow on several datasets that contain human motions of varying complexity.
arXiv Detail & Related papers (2021-06-08T16:03:50Z)
- Event-based Motion Segmentation with Spatio-Temporal Graph Cuts [51.17064599766138]
We have developed a method to identify independently moving objects acquired with an event-based camera.
The method performs on par or better than the state of the art without having to predetermine the number of expected moving objects.
arXiv Detail & Related papers (2020-12-16T04:06:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.