TranSTYLer: Multimodal Behavioral Style Transfer for Facial and Body
Gestures Generation
- URL: http://arxiv.org/abs/2308.10843v1
- Date: Tue, 8 Aug 2023 15:42:35 GMT
- Title: TranSTYLer: Multimodal Behavioral Style Transfer for Facial and Body
Gestures Generation
- Authors: Mireille Fares, Catherine Pelachaud, Nicolas Obin
- Abstract summary: This paper addresses the challenge of transferring the behavior expressivity style of one virtual agent to another.
We propose a multimodal transformer-based model that synthesizes the multimodal behaviors of a source speaker with the style of a target speaker.
- Score: 2.7317088388886384
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the challenge of transferring the behavior
expressivity style of one virtual agent to another while preserving the shape
of behaviors, as they carry communicative meaning. Behavior expressivity style
is viewed here as the qualitative properties of behaviors. We propose
TranSTYLer, a multimodal transformer-based model that synthesizes the
multimodal behaviors of a source speaker with the style of a target speaker.
We assume that behavior expressivity style is encoded across various
modalities of communication, including text, speech, body gestures, and facial
expressions. The model employs a style and content disentanglement schema to
ensure that the transferred style does not interfere with the meaning conveyed
by the source behaviors. Our approach eliminates the need for style labels and
allows generalization to styles not seen during training. We train our model
on the PATS corpus, which we extended to include dialog acts and 2D facial
landmarks. Objective and subjective evaluations show that our model
outperforms state-of-the-art models in style transfer for both styles seen and
unseen during training. To tackle the issues of style and content leakage that
may arise, we propose a methodology to assess the degree to which behaviors
and gestures associated with the target style are successfully transferred,
while ensuring the preservation of those related to the source content.
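The abstract describes a style/content disentanglement schema in which the source speaker's multimodal content and the target speaker's style are encoded separately and recombined at generation time. No code accompanies this listing; the snippet below is a minimal, hypothetical PyTorch sketch of that general idea only. All module names, feature dimensions, and the fusion scheme are illustrative assumptions, not the authors' TranSTYLer implementation.

```python
# Hypothetical sketch of multimodal style/content disentanglement for gesture
# generation (illustrative only; not the authors' TranSTYLer implementation).
import torch
import torch.nn as nn


class StyleContentGestureModel(nn.Module):
    def __init__(self, content_dim=128, style_dim=64, pose_dim=52,
                 n_heads=4, n_layers=2):
        super().__init__()
        # Content encoder: encodes the source speaker's multimodal sequence
        # (e.g., concatenated text/speech/pose features per frame).
        enc_layer = nn.TransformerEncoderLayer(
            d_model=content_dim, nhead=n_heads, batch_first=True)
        self.content_encoder = nn.TransformerEncoder(enc_layer, n_layers)
        # Style encoder: pools the target speaker's sequence into a single
        # vector, intended to capture expressivity style rather than content.
        self.style_encoder = nn.GRU(content_dim, style_dim, batch_first=True)
        # Decoder: generates pose/gesture frames conditioned on the content
        # features and the style embedding (broadcast over time).
        self.decoder = nn.Sequential(
            nn.Linear(content_dim + style_dim, content_dim),
            nn.ReLU(),
            nn.Linear(content_dim, pose_dim),
        )

    def forward(self, source_seq, target_style_seq):
        # source_seq: (batch, T, content_dim) multimodal source features
        # target_style_seq: (batch, T_style, content_dim) target speaker features
        content = self.content_encoder(source_seq)            # (B, T, C)
        _, style = self.style_encoder(target_style_seq)       # (1, B, S)
        style = style[-1].unsqueeze(1).expand(-1, content.size(1), -1)
        return self.decoder(torch.cat([content, style], dim=-1))  # (B, T, pose)


# Usage sketch: transfer a target speaker's style onto a source sequence.
model = StyleContentGestureModel()
source = torch.randn(2, 100, 128)        # source speaker's multimodal features
target_style = torch.randn(2, 80, 128)   # target speaker's reference features
poses = model(source, target_style)      # generated gesture frames (2, 100, 52)
```

In the paper's framing, an additional disentanglement objective would keep the style embedding free of content information; that loss is omitted here for brevity.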
Related papers
- Dyadic Interaction Modeling for Social Behavior Generation [6.626277726145613]
We present an effective framework for creating 3D facial motions in dyadic interactions.
The heart of our framework is Dyadic Interaction Modeling (DIM), a pre-training approach.
Experiments demonstrate the superiority of our framework in generating listener motions.
arXiv Detail & Related papers (2024-03-14T03:21:33Z) - ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style
Transfer [57.6482608202409]
Textual style transfer is the task of transforming stylistic properties of text while preserving meaning.
We introduce a novel diffusion-based framework for general-purpose style transfer that can be flexibly adapted to arbitrary target styles.
We validate the method on the Enron Email Corpus, with both human and automatic evaluations, and find that it outperforms strong baselines on formality, sentiment, and even authorship style transfer.
arXiv Detail & Related papers (2023-08-29T17:36:02Z) - ZS-MSTM: Zero-Shot Style Transfer for Gesture Animation driven by Text
and Speech using Adversarial Disentanglement of Multimodal Style Encoding [3.609538870261841]
We propose a machine learning approach to synthesize gestures, driven by prosodic features and text, in the style of different speakers.
Our model incorporates zero-shot multimodal style transfer using multimodal data from the PATS database.
arXiv Detail & Related papers (2023-05-22T10:10:35Z) - ALADIN-NST: Self-supervised disentangled representation learning of
artistic style through Neural Style Transfer [60.6863849241972]
We learn a representation of visual artistic style more strongly disentangled from the semantic content depicted in an image.
We show that strongly addressing the disentanglement of style and content leads to large gains in style-specific metrics.
arXiv Detail & Related papers (2023-04-12T10:33:18Z) - StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized
Tokenizer of a Large-Scale Generative Model [64.26721402514957]
We propose StylerDALLE, a style transfer method that uses natural language to describe abstract art styles.
Specifically, we formulate the language-guided style transfer task as a non-autoregressive token sequence translation.
To incorporate style information, we propose a Reinforcement Learning strategy with CLIP-based language supervision.
arXiv Detail & Related papers (2023-03-16T12:44:44Z) - A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive
Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework.
We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature.
Our framework consists of three key components: a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of the style distribution, and a generative network for style transfer.
arXiv Detail & Related papers (2023-03-09T04:35:00Z) - Conversation Style Transfer using Few-Shot Learning [56.43383396058639]
In this paper, we introduce conversation style transfer as a few-shot learning problem.
We propose a novel in-context learning approach to solve the task with style-free dialogues as a pivot.
We show that conversation style transfer can also benefit downstream tasks.
arXiv Detail & Related papers (2023-02-16T15:27:00Z) - Zero-Shot Style Transfer for Gesture Animation driven by Text and Speech
using Adversarial Disentanglement of Multimodal Style Encoding [3.2116198597240846]
We propose an efficient yet effective machine learning approach to synthesize gestures driven by prosodic features and text in the style of different speakers.
Our model performs zero-shot multimodal style transfer driven by multimodal data from the PATS database, which contains videos of various speakers.
arXiv Detail & Related papers (2022-08-03T08:49:55Z) - Text-driven Emotional Style Control and Cross-speaker Style Transfer in
Neural TTS [7.384726530165295]
Style control of synthetic speech is often restricted to discrete emotion categories.
We propose a text-based interface for emotional style control and cross-speaker style transfer in multi-speaker TTS.
arXiv Detail & Related papers (2022-07-13T07:05:44Z) - Exploring Contextual Word-level Style Relevance for Unsupervised Style
Transfer [60.07283363509065]
Unsupervised style transfer aims to change the style of an input sentence while preserving its original content.
We propose a novel attentional sequence-to-sequence model that exploits the relevance of each output word to the target style.
Experimental results show that our proposed model achieves state-of-the-art performance in terms of both transfer accuracy and content preservation.
arXiv Detail & Related papers (2020-05-05T10:24:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.