Motion Style Transfer: Modular Low-Rank Adaptation for Deep Motion Forecasting
- URL: http://arxiv.org/abs/2211.03165v1
- Date: Sun, 6 Nov 2022 16:14:17 GMT
- Title: Motion Style Transfer: Modular Low-Rank Adaptation for Deep Motion Forecasting
- Authors: Parth Kothari, Danya Li, Yuejiang Liu, Alexandre Alahi
- Abstract summary: We propose a transfer learning approach for efficiently adapting deep motion forecasting models to new domains.
Unlike the conventional fine-tuning approach that updates the whole encoder, our main idea is to reduce the number of tunable parameters.
We show that our proposed adapter design, coined MoSA, outperforms prior methods on several forecasting benchmarks.
- Score: 79.56014465244644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep motion forecasting models have achieved great success when trained on a
massive amount of data. Yet, they often perform poorly when training data is
limited. To address this challenge, we propose a transfer learning approach for
efficiently adapting pre-trained forecasting models to new domains, such as
unseen agent types and scene contexts. Unlike the conventional fine-tuning
approach that updates the whole encoder, our main idea is to reduce the number
of tunable parameters that can precisely account for the target domain-specific
motion style. To this end, we introduce two components that exploit our prior
knowledge of motion style shifts: (i) a low-rank motion style adapter that
projects and adjusts the style features at a low-dimensional bottleneck; and
(ii) a modular adapter strategy that disentangles the features of scene context
and motion history to facilitate a fine-grained choice of adaptation layers.
Through extensive experimentation, we show that our proposed adapter design,
coined MoSA, outperforms prior methods on several forecasting benchmarks.
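For intuition, here is a minimal sketch of a low-rank bottleneck adapter of the kind described in (i): features are projected to a low-dimensional bottleneck, adjusted, and projected back as a residual correction while the pre-trained encoder stays frozen. The class name, rank value, and toy encoder are illustrative assumptions, not the paper's exact MoSA implementation.

```python
# Minimal sketch of a low-rank (bottleneck) style adapter in PyTorch.
# Names and hyperparameters are illustrative, not the paper's code.
import torch
import torch.nn as nn

class LowRankStyleAdapter(nn.Module):
    def __init__(self, dim: int, rank: int = 4):
        super().__init__()
        self.down = nn.Linear(dim, rank)  # project features to a low-dim bottleneck
        self.up = nn.Linear(rank, dim)    # project adjusted style features back
        nn.init.zeros_(self.up.weight)    # start as identity: adapter output = input
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))  # residual style adjustment

# During transfer, the pre-trained encoder is frozen and only the
# adapter's ~2*dim*rank parameters are updated:
encoder = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
for p in encoder.parameters():
    p.requires_grad = False
adapter = LowRankStyleAdapter(dim=64, rank=4)
feats = adapter(encoder(torch.randn(8, 64)))
```

The modular strategy in (ii) would then attach such adapters selectively to the scene-context and motion-history branches rather than uniformly across the encoder.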
Related papers
- Adjusting Pretrained Backbones for Performativity [34.390793811659556]
We propose a novel technique to adjust pretrained backbones for performativity in a modular way.
We show how it leads to a smaller loss along the retraining trajectory and enables us to effectively select among candidate models to anticipate performance degradations.
arXiv Detail & Related papers (2024-10-06T14:41:13Z)
- Motion Flow Matching for Human Motion Synthesis and Editing [75.13665467944314]
We propose Motion Flow Matching, a novel generative model for human motion generation featuring efficient sampling and effectiveness in motion editing applications.
Our method reduces the sampling complexity from a thousand steps in previous diffusion models to just ten, while achieving comparable performance on text-to-motion and action-to-motion generation benchmarks.
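The step reduction comes from treating generation as integrating a learned velocity field; below is a hedged sketch of a generic ten-step Euler sampler. The `velocity_model` interface and step count are standard flow-matching conventions assumed here, not the paper's exact code.

```python
# Generic ten-step Euler sampler for a flow-matching model (illustrative).
# velocity_model(x, t) is assumed to predict the flow's velocity field.
import torch

@torch.no_grad()
def sample(velocity_model, shape, steps: int = 10):
    x = torch.randn(shape)  # start from Gaussian noise at t = 0
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((shape[0],), i * dt)
        x = x + dt * velocity_model(x, t)  # one Euler step along the learned ODE
    return x  # approximate motion sample at t = 1
```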
arXiv Detail & Related papers (2023-12-14T12:57:35Z)
- Efficient Adaptation of Large Vision Transformer via Adapter Re-Composing [8.88477151877883]
High-capacity pre-trained models have revolutionized problem-solving in computer vision.
We propose a novel Adapter Re-Composing (ARC) strategy that addresses efficient pre-trained model adaptation.
Our approach considers the reusability of adaptation parameters and introduces a parameter-sharing scheme.
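As a rough illustration of a parameter-sharing adapter scheme, one shared bottleneck can be reused across all layers, with only a lightweight per-layer re-scaling kept layer-specific. The design below is an assumption-level sketch, not ARC's exact re-composing formulation.

```python
# Illustrative parameter-sharing adapter: one down/up projection shared by
# every layer; only a small per-layer scaling vector is layer-specific.
import torch
import torch.nn as nn

class SharedAdapter(nn.Module):
    def __init__(self, dim: int, rank: int, num_layers: int):
        super().__init__()
        self.down = nn.Linear(dim, rank)                       # shared across layers
        self.up = nn.Linear(rank, dim)                         # shared across layers
        self.scale = nn.Parameter(torch.ones(num_layers, rank))  # per-layer re-composition

    def forward(self, x: torch.Tensor, layer: int) -> torch.Tensor:
        return x + self.up(self.scale[layer] * torch.relu(self.down(x)))
```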
arXiv Detail & Related papers (2023-10-10T01:04:15Z)
- Boost Video Frame Interpolation via Motion Adaptation [73.42573856943923]
Video frame interpolation (VFI) is a challenging task that aims to generate intermediate frames between two consecutive frames of a video.
Existing learning-based VFI methods have achieved great success, but they still suffer from limited generalization ability.
We propose a novel optimization-based VFI method that can adapt to unseen motions at test time.
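In spirit, test-time adaptation here means briefly optimizing the interpolator on the given frame pair before producing the output. The cycle-consistency objective, step count, and learning rate below are illustrative assumptions, not the paper's actual loss.

```python
# Generic test-time adaptation sketch for frame interpolation (illustrative).
# model(a, b) is assumed to return the frame midway between frames a and b.
import torch

def adapt_and_interpolate(model, f0, f1, steps: int = 10, lr: float = 1e-6):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        mid = model(f0, f1)   # t = 0.5
        q1 = model(f0, mid)   # t = 0.25
        q3 = model(mid, f1)   # t = 0.75
        # Cycle consistency: interpolating the two quarter frames should
        # reproduce the midpoint estimate.
        loss = (model(q1, q3) - mid.detach()).abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return model(f0, f1)
```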
arXiv Detail & Related papers (2023-06-24T10:44:02Z)
- Unsupervised Motion Representation Learning with Capsule Autoencoders [54.81628825371412]
The Motion Capsule Autoencoder (MCAE) models motion in a two-level hierarchy.
MCAE is evaluated on a novel Trajectory20 motion dataset and various real-world skeleton-based human action datasets.
arXiv Detail & Related papers (2021-10-01T16:52:03Z)
- EAN: Event Adaptive Network for Enhanced Action Recognition [66.81780707955852]
We propose a unified action recognition framework to investigate the dynamic nature of video content.
First, when extracting local cues, we generate dynamic-scale spatio-temporal kernels to adaptively fit diverse events.
Second, to accurately aggregate these cues into a global video representation, we mine interactions among only a few selected foreground objects with a Transformer.
arXiv Detail & Related papers (2021-07-22T15:57:18Z)
- Dynamic Scale Training for Object Detection [111.33112051962514]
We propose a Dynamic Scale Training paradigm (abbreviated as DST) to mitigate the scale variation challenge in object detection.
Experimental results demonstrate the efficacy of the proposed DST in handling scale variation.
It does not introduce inference overhead and could serve as a free lunch for general detection configurations.
arXiv Detail & Related papers (2020-04-26T16:48:17Z)
- Adversarial Style Mining for One-Shot Unsupervised Domain Adaptation [43.351728923472464]
One-Shot Unsupervised Domain Adaptation assumes that only one unlabeled target sample is available when learning to adapt.
Traditional adaptation approaches are prone to failure due to the scarcity of unlabeled target data.
We propose a novel Adversarial Style Mining approach, which combines a style transfer module and a task-specific module in an adversarial manner.
arXiv Detail & Related papers (2020-04-13T16:18:46Z)