Modeling Language Usage and Listener Engagement in Podcasts
- URL: http://arxiv.org/abs/2106.06605v1
- Date: Fri, 11 Jun 2021 20:40:15 GMT
- Title: Modeling Language Usage and Listener Engagement in Podcasts
- Authors: Sravana Reddy, Marina Lazarova, Yongze Yu, and Rosie Jones
- Abstract summary: We investigate how various factors -- vocabulary diversity, distinctiveness, emotion, and syntax -- correlate with engagement.
We build models with different textual representations, and show that the identified features are highly predictive of engagement.
Our analysis tests popular wisdom about stylistic elements in high-engagement podcasts, corroborating some aspects, and adding new perspectives on others.
- Score: 3.8966039534272916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While there is an abundance of popular writing targeted to podcast creators
on how to speak in ways that engage their listeners, there has been little
data-driven analysis of podcasts that relates linguistic style with listener
engagement. In this paper, we investigate how various factors -- vocabulary
diversity, distinctiveness, emotion, and syntax, among others -- correlate with
engagement, based on analysis of the creators' written descriptions and
transcripts of the audio. We build models with different textual
representations, and show that the identified features are highly predictive of
engagement. Our analysis tests popular wisdom about stylistic elements in
high-engagement podcasts, corroborating some aspects, and adding new
perspectives on others.
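As a concrete illustration of the kind of analysis described above, the sketch below computes a few simple stylistic features from a transcript (a type-token ratio as a vocabulary-diversity measure, plus word- and sentence-length proxies) and fits a logistic regression against a binary engagement label. This is a minimal stand-in under assumed inputs, not the paper's actual feature set or models; `stylistic_features`, `fit_engagement_model`, and the feature choices are illustrative.

```python
import re
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def stylistic_features(transcript: str) -> list[float]:
    """A few hand-crafted features in the spirit of the paper:
    vocabulary diversity plus crude lexical/syntactic proxies."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    sentences = [s for s in re.split(r"[.!?]+", transcript) if s.strip()]
    type_token_ratio = len(set(tokens)) / max(len(tokens), 1)  # vocabulary diversity
    mean_word_len = float(np.mean([len(t) for t in tokens])) if tokens else 0.0
    mean_sent_len = len(tokens) / max(len(sentences), 1)       # rough syntax proxy
    return [type_token_ratio, mean_word_len, mean_sent_len]

def fit_engagement_model(transcripts, engaged):
    """transcripts: list of strings; engaged: binary high/low-engagement labels
    (placeholders for whatever engagement signal is available)."""
    X = np.array([stylistic_features(t) for t in transcripts])
    y = np.array(engaged)
    model = LogisticRegression()
    print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
    return model.fit(X, y)
```

Note that the type-token ratio is length-sensitive, so a real pipeline would compute it over fixed-length windows; richer features for distinctiveness, emotion, and syntax would slot into the same interface.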
Related papers
- Annotation Tool and Dataset for Fact-Checking Podcasts [1.6804613362826175]
Podcasts are a popular medium on the web, featuring diverse and multilingual content that often includes unverified claims.
Our tool offers a novel approach to tackle these challenges by enabling real-time annotation of podcast content during playback.
This unique capability allows users to listen to the podcast and annotate key elements, such as check-worthy claims, claim spans, and contextual errors, simultaneously.
arXiv Detail & Related papers (2025-02-03T14:34:17Z)
- Classification of Spontaneous and Scripted Speech for Multilingual Audio [9.925703861731506]
Distinguishing scripted from spontaneous speech is essential for understanding how speaking styles influence speech processing research.
This paper addresses the challenge of building a classifier that generalises well across different formats and languages.
We systematically evaluate models ranging from traditional, handcrafted acoustic and prosodic features to advanced audio transformers.
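For the traditional end of that model range, a minimal sketch (with assumed features, not the paper's) might summarize pitch, energy, and pausing per recording and train an SVM:

```python
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def prosodic_features(wav_path: str) -> np.ndarray:
    """Crude per-recording summary: pitch statistics, energy, and a pause proxy."""
    y, sr = librosa.load(wav_path, sr=16000)
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)   # frame-level pitch track
    rms = librosa.feature.rms(y=y)[0]               # frame-level energy
    pause_rate = float(np.mean(rms < 0.01))         # fraction of near-silent frames
    return np.array([np.nanmean(f0), np.nanstd(f0), float(rms.mean()), pause_rate])

def train_style_classifier(wav_paths, labels):
    """labels: 1 = scripted, 0 = spontaneous (placeholder data)."""
    X = np.stack([prosodic_features(p) for p in wav_paths])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    return clf.fit(X, labels)
```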
arXiv Detail & Related papers (2024-12-16T15:45:10Z)
- WavChat: A Survey of Spoken Dialogue Models [66.82775211793547]
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o, have captured significant attention in the speech domain.
These advanced spoken dialogue models not only comprehend audio, music, and other speech-related features, but also capture stylistic and timbral characteristics in speech.
Despite the progress in spoken dialogue systems, there is a lack of comprehensive surveys that systematically organize and analyze these systems.
arXiv Detail & Related papers (2024-11-15T04:16:45Z)
- Dialogue Quality and Emotion Annotations for Customer Support Conversations [7.218791626731783]
This paper presents a holistic annotation approach for emotion and conversational quality in the context of bilingual customer support conversations.
It provides a unique and valuable resource for the development of text classification models.
arXiv Detail & Related papers (2023-11-23T10:56:14Z)
- Multi-turn Dialogue Comprehension from a Topic-aware Perspective [70.37126956655985]
This paper proposes to model multi-turn dialogues from a topic-aware perspective.
We use a dialogue segmentation algorithm to split a dialogue passage into topic-concentrated fragments in an unsupervised way.
We also present a novel model, Topic-Aware Dual-Attention Matching (TADAM) Network, which takes topic segments as processing elements.
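A minimal stand-in for such an unsupervised segmentation step (TextTiling-style boundary detection over sentence embeddings; not TADAM itself, and the embedding model and threshold are assumptions) could look like:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

def segment_dialogue(utterances: list[str], threshold: float = 0.35) -> list[list[str]]:
    """Split a dialogue into topic-concentrated fragments by cutting wherever
    the similarity between adjacent utterance embeddings drops below a threshold."""
    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode(utterances, normalize_embeddings=True)
    segments, current = [], [utterances[0]]
    for i in range(1, len(utterances)):
        sim = float(np.dot(emb[i - 1], emb[i]))  # cosine similarity (unit-norm embeddings)
        if sim < threshold:
            segments.append(current)
            current = []
        current.append(utterances[i])
    segments.append(current)
    return segments
```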
arXiv Detail & Related papers (2023-09-18T11:03:55Z)
- Can Language Models Learn to Listen? [96.01685069483025]
We present a framework for generating appropriate facial responses from a listener in dyadic social interactions based on the speaker's words.
Our approach autoregressively predicts a response of a listener: a sequence of listener facial gestures, quantized using a VQ-VAE.
We show that our generated listener motion is fluent and reflective of language semantics through quantitative metrics and a qualitative user study.
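A toy version of the autoregressive-generation-over-quantized-codes idea (architecture, sizes, and conditioning here are assumptions for illustration, not the paper's model) might look like:

```python
import torch
import torch.nn as nn

class ListenerGestureLM(nn.Module):
    """Toy autoregressive model over discrete gesture tokens (the kind of codes
    a VQ-VAE encoder would produce), conditioned on speaker-text features."""
    def __init__(self, n_codes=512, dim=256):
        super().__init__()
        self.code_embed = nn.Embedding(n_codes, dim)
        self.rnn = nn.GRU(dim * 2, dim, batch_first=True)
        self.out = nn.Linear(dim, n_codes)

    @torch.no_grad()
    def generate(self, speaker_feats, steps=64):
        """speaker_feats: (1, steps, dim) features derived from the speaker's words."""
        tok = torch.zeros(1, 1, dtype=torch.long)  # arbitrary start code
        h, codes = None, []
        for t in range(steps):
            x = torch.cat([self.code_embed(tok), speaker_feats[:, t:t + 1]], dim=-1)
            y, h = self.rnn(x, h)
            tok = torch.distributions.Categorical(logits=self.out(y[:, -1])).sample().unsqueeze(0)
            codes.append(tok.item())
        return codes  # indices a VQ-VAE decoder would map back to facial motion
```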
arXiv Detail & Related papers (2023-08-21T17:59:02Z)
- Visual-Aware Text-to-Speech [101.89332968344102]
We present a new visual-aware text-to-speech (VA-TTS) task to synthesize speech conditioned on both textual inputs and visual feedback of the listener in face-to-face communication.
We devise a baseline model to fuse phoneme linguistic information and listener visual signals for speech synthesis.
arXiv Detail & Related papers (2023-06-21T05:11:39Z)
- Speaking the Language of Your Listener: Audience-Aware Adaptation via Plug-and-Play Theory of Mind [4.052000839878213]
We model a visually grounded referential game between a knowledgeable speaker and a listener with more limited visual and linguistic experience.
We endow our speaker with the ability to adapt its referring expressions via a simulation module that monitors the effectiveness of planned utterances from the listener's perspective.
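The simulation loop can be sketched as reranking: propose candidate referring expressions, score each against every referent with a simulated listener, and keep the candidate that most decisively picks out the target. `speaker_candidates` and `listener_score` are hypothetical stand-ins for the paper's speaker and listener modules:

```python
def adapt_utterance(target, distractors, speaker_candidates, listener_score):
    """Return the candidate utterance the simulated listener resolves to the
    target most decisively (largest margin over all distractors)."""
    referents = [target] + list(distractors)
    best, best_margin = None, float("-inf")
    for utt in speaker_candidates(target):
        scores = [listener_score(utt, r) for r in referents]
        margin = scores[0] - max(scores[1:])  # target score minus best distractor
        if margin > best_margin:
            best, best_margin = utt, margin
    return best
```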
arXiv Detail & Related papers (2023-05-31T15:17:28Z)
- A Benchmark for Understanding and Generating Dialogue between Characters in Stories [75.29466820496913]
We present the first study to explore whether machines can understand and generate dialogue in stories.
We propose two new tasks including Masked Dialogue Generation and Dialogue Speaker Recognition.
We show the difficulty of the proposed tasks by testing existing models with automatic and manual evaluation on DialStory.
arXiv Detail & Related papers (2022-09-18T10:19:04Z)
- Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion [89.01668641930206]
We present a framework for modeling interactional communication in dyadic conversations.
We autoregressively output multiple possibilities of corresponding listener motion.
Our method organically captures the multimodal and non-deterministic nature of nonverbal dyadic interactions.
arXiv Detail & Related papers (2022-04-18T17:58:04Z)
- Who says like a style of Vitamin: Towards Syntax-Aware Dialogue Summarization using Multi-task Learning [2.251583286448503]
We focus on the association between utterances from individual speakers and unique syntactic structures.
Speakers have unique textual styles that carry linguistic information, analogous to a voiceprint.
We employ multi-task learning of both syntax-aware information and dialogue summarization.
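A minimal sketch of that multi-task wiring (a shared encoder feeding a summarization head and an auxiliary syntax-tagging head, with a weighted loss sum; dimensions, architecture, and the per-token summarization simplification are illustrative assumptions, not the paper's model):

```python
import torch
import torch.nn as nn

class MultiTaskSummarizer(nn.Module):
    """Shared encoder with a summarization head and an auxiliary syntax head;
    the two cross-entropy losses are combined with a fixed weight."""
    def __init__(self, vocab=30000, dim=256, n_syntax_tags=50):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.summary_head = nn.Linear(dim, vocab)         # per-token generation logits
        self.syntax_head = nn.Linear(dim, n_syntax_tags)  # per-token syntax tags

    def forward(self, tokens, summary_targets, syntax_targets, aux_weight=0.3):
        h, _ = self.encoder(self.embed(tokens))           # (batch, seq, dim)
        loss_sum = nn.functional.cross_entropy(
            self.summary_head(h).transpose(1, 2), summary_targets)
        loss_syn = nn.functional.cross_entropy(
            self.syntax_head(h).transpose(1, 2), syntax_targets)
        return loss_sum + aux_weight * loss_syn
```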
arXiv Detail & Related papers (2021-09-29T05:30:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.