Hi Sheldon! Creating Deep Personalized Characters from TV Shows
- URL: http://arxiv.org/abs/2304.11093v1
- Date: Sun, 9 Apr 2023 00:39:43 GMT
- Title: Hi Sheldon! Creating Deep Personalized Characters from TV Shows
- Authors: Meidai Xuanyuan, Yuwang Wang, Honglei Guo, Xiao Ma, Yuchen Guo, Tao Yu, Qionghai Dai
- Abstract summary: We propose a novel task, named Deep Personalized Character Creation (DPCC), creating multimodal chat personalized characters from multimodal data such as TV shows.
Given a single- or multi-modality input (text, audio, video), the goal of DPCC is to generate a multi-modality (text, audio, video) response.
To support this novel task, we further collect a character-centric multimodal dialogue dataset, named Deep Personalized Character Dataset (DPCD), from TV shows.
DPCD contains character-specific multimodal dialogue data of 10k utterances and 6 hours of audio/
- Score: 52.8086853239762
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imagine an interesting multimodal interactive scenario in which you can see,
hear, and chat with an AI-generated digital character, who is capable of
behaving like Sheldon from The Big Bang Theory, as a DEEP copy from appearance
to personality. Towards this fantastic multimodal chatting scenario, we propose
a novel task, named Deep Personalized Character Creation (DPCC): creating
multimodal chat personalized characters from multimodal data such as TV shows.
Specifically, given a single- or multi-modality input (text, audio, video), the
goal of DPCC is to generate a multi-modality (text, audio, video) response,
which should match the personality of a specific character such as
Sheldon and be of high quality as well. To support this novel task, we further
collect a character-centric multimodal dialogue dataset, named Deep
Personalized Character Dataset (DPCD), from TV shows. DPCD contains
character-specific multimodal dialogue data of ~10k utterances and ~6 hours of
audio/video per character, which is around 10 times larger than existing
related datasets. On DPCD, we present a baseline method for the DPCC task and
create 5 Deep personalized digital Characters (DeepCharacters) from The Big Bang
Theory. We conduct both subjective and objective experiments to evaluate the
multimodal responses from DeepCharacters in terms of characterization and
quality. The results demonstrate that, on our collected DPCD dataset, the
proposed baseline can create personalized digital characters that generate
multimodal responses. Our collected DPCD dataset, the data collection code, and
our baseline will be published soon.
Related papers
- Crafting Customisable Characters with LLMs: Introducing SimsChat, a Persona-Driven Role-Playing Agent Framework [29.166067413153353]
Large Language Models (LLMs) can comprehend human instructions and generate high-quality text.
We introduce the Customisable Conversation Agent Framework, which leverages LLMs to simulate real-world characters.
We present SimsChat, a freely customisable role-playing agent.
arXiv Detail & Related papers (2024-06-25T22:44:17Z)
- PSYDIAL: Personality-based Synthetic Dialogue Generation using Large Language Models [4.283022729693451]
We present a novel end-to-end personality-based synthetic dialogue data generation pipeline, specifically designed to elicit responses from large language models via prompting.
We introduce PSYDIAL, the first Korean dialogue dataset focused on personality-based dialogues, curated using our proposed pipeline.
Experimental results indicate that while pre-trained models and those fine-tuned with a chit-chat dataset struggle to generate responses reflecting personality, models trained with PSYDIAL show significant improvements.
arXiv Detail & Related papers (2024-04-01T05:19:34Z)
- CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models [66.4382820107453]
We present CharacterGLM, a series of models built upon ChatGLM, with model sizes ranging from 6B to 66B parameters.
Our CharacterGLM is designed for generating Character-based Dialogues (CharacterDial), which aims to equip a conversational AI system with character customization for satisfying people's inherent social desires and emotional needs.
arXiv Detail & Related papers (2023-11-28T14:49:23Z)
- MPCHAT: Towards Multimodal Persona-Grounded Conversation [54.800425322314105]
We extend persona-based dialogue to the multimodal domain and make two main contributions.
First, we present the first multimodal persona-based dialogue dataset named MPCHAT.
Second, we empirically show that incorporating multimodal persona, as measured by three proposed multimodal persona-grounded dialogue tasks, leads to statistically significant performance improvements.
arXiv Detail & Related papers (2023-05-27T06:46:42Z)
- Personality-aware Human-centric Multimodal Reasoning: A New Task, Dataset and Baselines [32.82738983843281]
We introduce a new task called Personality-aware Human-centric Multimodal Reasoning (PHMR) (T1).
The goal of the task is to forecast the future behavior of a particular individual using multimodal information from past instances, while integrating personality factors.
The experimental results demonstrate that incorporating personality traits enhances human-centric multimodal reasoning performance.
arXiv Detail & Related papers (2023-04-05T09:09:10Z)
- M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations [72.81164101048181]
We propose a dataset for Multimodal Multiparty Hindi Humor (M2H2) recognition in conversations, containing 6,191 utterances from 13 episodes of a very popular TV series, "Shrimaan Shrimati Phir Se".
Each utterance is annotated with humor/non-humor labels and encompasses acoustic, visual, and textual modalities.
The empirical results on M2H2 dataset demonstrate that multimodal information complements unimodal information for humor recognition.
arXiv Detail & Related papers (2021-08-03T02:54:09Z)
- Personalized Multimodal Feedback Generation in Education [50.95346877192268]
The automatic evaluation of school assignments is an important application of AI in the education field.
We propose a novel Personalized Multimodal Feedback Generation Network (PMFGN) armed with a modality gate mechanism and a personalized bias mechanism.
Our model significantly outperforms several baselines by generating more accurate and diverse feedback.
arXiv Detail & Related papers (2020-10-31T05:26:49Z)
- Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset for Personality Assessment [50.15466026089435]
We present a novel peer-to-peer Hindi conversation dataset, Vyaktitv.
It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation.
The dataset also contains a rich set of socio-demographic features, such as income and cultural orientation, among several others, for all the participants.
arXiv Detail & Related papers (2020-08-31T17:44:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.