Modeling Language Usage and Listener Engagement in Podcasts
- URL: http://arxiv.org/abs/2106.06605v1
- Date: Fri, 11 Jun 2021 20:40:15 GMT
- Title: Modeling Language Usage and Listener Engagement in Podcasts
- Authors: Sravana Reddy, Marina Lazarova, Yongze Yu, and Rosie Jones
- Abstract summary: We investigate how various factors -- vocabulary diversity, distinctiveness, emotion, and syntax -- correlate with engagement.
We build models with different textual representations, and show that the identified features are highly predictive of engagement.
Our analysis tests popular wisdom about stylistic elements in high-engagement podcasts, corroborating some aspects, and adding new perspectives on others.
- Score: 3.8966039534272916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While there is an abundance of popular writing targeted to podcast creators
on how to speak in ways that engage their listeners, there has been little
data-driven analysis of podcasts that relates linguistic style with listener
engagement. In this paper, we investigate how various factors -- vocabulary
diversity, distinctiveness, emotion, and syntax, among others -- correlate with
engagement, based on analysis of the creators' written descriptions and
transcripts of the audio. We build models with different textual
representations, and show that the identified features are highly predictive of
engagement. Our analysis tests popular wisdom about stylistic elements in
high-engagement podcasts, corroborating some aspects, and adding new
perspectives on others.
Related papers
- Do Audio-Language Models Understand Linguistic Variations? [42.17718387132912]
Open-vocabulary audio language models (ALMs) represent a promising new paradigm for audio-text retrieval using natural language queries.
We propose RobustCLAP, a novel and compute-efficient technique to learn audio-language representations to linguistic variations.
arXiv Detail & Related papers (2024-10-21T20:55:33Z) - Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue [71.15186328127409]
Paralinguistics-enhanced Generative Pretrained Transformer (ParalinGPT)
Model takes the conversational context of text, speech embeddings, and paralinguistic attributes as input prompts within a serialized multitasking framework.
We utilize the Switchboard-1 corpus, including its sentiment labels as the paralinguistic attribute, as our spoken dialogue dataset.
arXiv Detail & Related papers (2023-12-23T18:14:56Z) - Dialogue Quality and Emotion Annotations for Customer Support
Conversations [7.218791626731783]
This paper presents a holistic annotation approach for emotion and conversational quality in the context of bilingual customer support conversations.
It provides a unique and valuable resource for the development of text classification models.
arXiv Detail & Related papers (2023-11-23T10:56:14Z) - Multi-turn Dialogue Comprehension from a Topic-aware Perspective [70.37126956655985]
This paper proposes to model multi-turn dialogues from a topic-aware perspective.
We use a dialogue segmentation algorithm to split a dialogue passage into topic-concentrated fragments in an unsupervised way.
We also present a novel model, Topic-Aware Dual-Attention Matching (TADAM) Network, which takes topic segments as processing elements.
arXiv Detail & Related papers (2023-09-18T11:03:55Z) - Can Language Models Learn to Listen? [96.01685069483025]
We present a framework for generating appropriate facial responses from a listener in dyadic social interactions based on the speaker's words.
Our approach autoregressively predicts a response of a listener: a sequence of listener facial gestures, quantized using a VQ-VAE.
We show that our generated listener motion is fluent and reflective of language semantics through quantitative metrics and a qualitative user study.
arXiv Detail & Related papers (2023-08-21T17:59:02Z) - Visual-Aware Text-to-Speech [101.89332968344102]
We present a new visual-aware text-to-speech (VA-TTS) task to synthesize speech conditioned on both textual inputs and visual feedback of the listener in face-to-face communication.
We devise a baseline model to fuse phoneme linguistic information and listener visual signals for speech synthesis.
arXiv Detail & Related papers (2023-06-21T05:11:39Z) - Speaking the Language of Your Listener: Audience-Aware Adaptation via
Plug-and-Play Theory of Mind [4.052000839878213]
We model a visually grounded referential game between a knowledgeable speaker and a listener with more limited visual and linguistic experience.
We endow our speaker with the ability to adapt its referring expressions via a simulation module that monitors the effectiveness of planned utterances from the listener's perspective.
arXiv Detail & Related papers (2023-05-31T15:17:28Z) - A Benchmark for Understanding and Generating Dialogue between Characters
in Stories [75.29466820496913]
We present the first study to explore whether machines can understand and generate dialogue in stories.
We propose two new tasks including Masked Dialogue Generation and Dialogue Speaker Recognition.
We show the difficulty of the proposed tasks by testing existing models with automatic and manual evaluation on DialStory.
arXiv Detail & Related papers (2022-09-18T10:19:04Z) - Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion [89.01668641930206]
We present a framework for modeling interactional communication in dyadic conversations.
We autoregressively output multiple possibilities of corresponding listener motion.
Our method organically captures the multimodal and non-deterministic nature of nonverbal dyadic interactions.
arXiv Detail & Related papers (2022-04-18T17:58:04Z) - Who says like a style of Vitamin: Towards Syntax-Aware
DialogueSummarization using Multi-task Learning [2.251583286448503]
We focus on the association between utterances from individual speakers and unique syntactic structures.
Speakers have unique textual styles that can contain linguistic information, such as voiceprint.
We employ multi-task learning of both syntax-aware information and dialogue summarization.
arXiv Detail & Related papers (2021-09-29T05:30:39Z) - Detecting Extraneous Content in Podcasts [6.335863593761816]
We present a model that leverage both textual and listening patterns to detect extraneous content in podcast descriptions and audio transcripts.
We show that our models can substantively improve ROUGE scores and reduce the extraneous content generated in the summaries.
arXiv Detail & Related papers (2021-03-03T18:30:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.