Related papers: Motion Matters: Neural Motion Transfer for Better Camera Physiological Measurement

Motion Matters: Neural Motion Transfer for Better Camera Physiological Measurement

URL: http://arxiv.org/abs/2303.12059v4
Date: Mon, 6 Nov 2023 09:32:19 GMT
Title: Motion Matters: Neural Motion Transfer for Better Camera Physiological Measurement
Authors: Akshay Paruchuri, Xin Liu, Yulu Pan, Shwetak Patel, Daniel McDuff, Soumyadip Sengupta
Abstract summary: Body motion is one of the most significant sources of noise when attempting to recover the subtle cardiac pulse from a video. We adapt a neural video synthesis approach to augment videos for the task of remote photoplethys. We demonstrate a 47% improvement over existing inter-dataset results using various state-of-the-art methods.
Score: 25.27559386977351
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Machine learning models for camera-based physiological measurement can have weak generalization due to a lack of representative training data. Body motion is one of the most significant sources of noise when attempting to recover the subtle cardiac pulse from a video. We explore motion transfer as a form of data augmentation to introduce motion variation while preserving physiological changes of interest. We adapt a neural video synthesis approach to augment videos for the task of remote photoplethysmography (rPPG) and study the effects of motion augmentation with respect to 1) the magnitude and 2) the type of motion. After training on motion-augmented versions of publicly available datasets, we demonstrate a 47% improvement over existing inter-dataset results using various state-of-the-art methods on the PURE dataset. We also present inter-dataset results on five benchmark datasets to show improvements of up to 79% using TS-CAN, a neural rPPG estimation method. Our findings illustrate the usefulness of motion transfer as a data augmentation technique for improving the generalization of models for camera-based physiological sensing. We release our code for using motion transfer as a data augmentation technique on three publicly available datasets, UBFC-rPPG, PURE, and SCAMPS, and models pre-trained on motion-augmented data here: https://motion-matters.github.io/

Related papers

MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation [55.238542326124545]
Image-to-video (I2V) generation is conditioned on the static image, which has been enhanced recently by the motion intensity as an additional control signal. These motion-aware models are appealing to generate diverse motion patterns, yet there lacks a reliable motion estimator for training such models on large-scale video set in the wild. This paper addresses the challenge with a new motion estimator, capable of measuring the decoupled motion intensities of objects and cameras in video.
arXiv Detail & Related papers (2024-12-08T08:12:37Z)
VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation [79.00294932026266]
VidMan is a novel framework that employs a two-stage training mechanism to enhance stability and improve data utilization efficiency. Our framework outperforms state-of-the-art baseline model GR-1 on the CALVIN benchmark, achieving a 11.7% relative improvement, and demonstrates over 9% precision gains on the OXE small-scale dataset.
arXiv Detail & Related papers (2024-11-14T03:13:26Z)
E-Motion: Future Motion Simulation via Event Sequence Diffusion [86.80533612211502]
Event-based sensors may potentially offer a unique opportunity to predict future motion with a level of detail and precision previously unachievable. We propose to integrate the strong learning capacity of the video diffusion model with the rich motion information of an event camera as a motion simulation framework. Our findings suggest a promising direction for future research in enhancing the interpretative power and predictive accuracy of computer vision systems.
arXiv Detail & Related papers (2024-10-11T09:19:23Z)
Quo Vadis, Motion Generation? From Large Language Models to Large Motion Models [70.78051873517285]
We present MotionBase, the first million-level motion generation benchmark. By leveraging this vast dataset, our large motion model demonstrates strong performance across a broad range of motions. We introduce a novel 2D lookup-free approach for motion tokenization, which preserves motion information and expands codebook capacity.
arXiv Detail & Related papers (2024-10-04T10:48:54Z)
MotionFix: Text-Driven 3D Human Motion Editing [52.11745508960547]
Key challenges include the scarcity of training data and the need to design a model that accurately edits the source motion. We propose a methodology to semi-automatically collect a dataset of triplets comprising (i) a source motion, (ii) a target motion, and (iii) an edit text. Access to this data allows us to train a conditional diffusion model, TMED, that takes both the source motion and the edit text as input.
arXiv Detail & Related papers (2024-08-01T16:58:50Z)
Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches [12.221087476416056]
We introduce "motion patches", a new representation of motion sequences, and propose using Vision Transformers (ViT) as motion encoders via transfer learning. These motion patches, created by dividing and sorting skeleton joints based on motion sequences, are robust to varying skeleton structures. We find that transfer learning with pre-trained weights of ViT obtained through training with 2D image data can boost the performance of motion analysis.
arXiv Detail & Related papers (2024-05-08T02:42:27Z)
Orientation-conditioned Facial Texture Mapping for Video-based Facial Remote Photoplethysmography Estimation [23.199005573530194]
We leverage the 3D facial surface to construct a novel orientation-conditioned video representation. Our proposed method achieves a significant 18.2% performance improvement in cross-dataset testing on MMPD. We demonstrate significant performance improvements of up to 29.6% in all tested motion scenarios.
arXiv Detail & Related papers (2024-04-14T23:30:35Z)
3D Kinematics Estimation from Video with a Biomechanical Model and Synthetic Training Data [4.130944152992895]
We propose a novel biomechanics-aware network that directly outputs 3D kinematics from two input views. Our experiments demonstrate that the proposed approach, only trained on synthetic data, outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2024-02-20T17:33:40Z)
Dynamic Inertial Poser (DynaIP): Part-Based Motion Dynamics Learning for Enhanced Human Pose Estimation with Sparse Inertial Sensors [17.3834029178939]
This paper introduces a novel human pose estimation approach using sparse inertial sensors. It leverages a diverse array of real inertial motion capture data from different skeleton formats to improve motion diversity and model generalization. The approach demonstrates superior performance over state-of-the-art models across five public datasets, notably reducing pose error by 19% on the DIP-IMU dataset.
arXiv Detail & Related papers (2023-12-02T13:17:10Z)
MotionAug: Augmentation with Physical Correction for Human Motion Prediction [19.240717471864723]
This paper presents a motion data augmentation scheme incorporating motion synthesis encouraging diversity and motion correction imposing physical plausibility. Our method outperforms previous noise-based motion augmentation methods by a large margin on both Recurrent Neural Network-based and Graph Convolutional Network-based human motion prediction models.
arXiv Detail & Related papers (2022-03-17T06:53:15Z)
Render In-between: Motion Guided Video Synthesis for Action Interpolation [53.43607872972194]
We propose a motion-guided frame-upsampling framework that is capable of producing realistic human motion and appearance. A novel motion model is trained to inference the non-linear skeletal motion between frames by leveraging a large-scale motion-capture dataset. Our pipeline only requires low-frame-rate videos and unpaired human motion data but does not require high-frame-rate videos for training.
arXiv Detail & Related papers (2021-11-01T15:32:51Z)
Learning to Segment Rigid Motions from Two Frames [72.14906744113125]
We propose a modular network, motivated by a geometric analysis of what independent object motions can be recovered from an egomotion field. It takes two consecutive frames as input and predicts segmentation masks for the background and multiple rigidly moving objects, which are then parameterized by 3D rigid transformations. Our method achieves state-of-the-art performance for rigid motion segmentation on KITTI and Sintel.
arXiv Detail & Related papers (2021-01-11T04:20:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.