Modeling Language Usage and Listener Engagement in Podcasts
- URL: http://arxiv.org/abs/2106.06605v1
- Date: Fri, 11 Jun 2021 20:40:15 GMT
- Title: Modeling Language Usage and Listener Engagement in Podcasts
- Authors: Sravana Reddy, Marina Lazarova, Yongze Yu, and Rosie Jones
- Abstract summary: We investigate how various factors -- vocabulary diversity, distinctiveness, emotion, and syntax -- correlate with engagement.
We build models with different textual representations, and show that the identified features are highly predictive of engagement.
Our analysis tests popular wisdom about stylistic elements in high-engagement podcasts, corroborating some aspects, and adding new perspectives on others.
- Score: 3.8966039534272916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While there is an abundance of popular writing targeted to podcast creators
on how to speak in ways that engage their listeners, there has been little
data-driven analysis of podcasts that relates linguistic style with listener
engagement. In this paper, we investigate how various factors -- vocabulary
diversity, distinctiveness, emotion, and syntax, among others -- correlate with
engagement, based on analysis of the creators' written descriptions and
transcripts of the audio. We build models with different textual
representations, and show that the identified features are highly predictive of
engagement. Our analysis tests popular wisdom about stylistic elements in
high-engagement podcasts, corroborating some aspects, and adding new
perspectives on others.
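As a concrete illustration of the kind of analysis described above, the sketch below computes a few simple stylistic features from a transcript (a type-token ratio as a vocabulary-diversity measure, plus word- and sentence-length proxies) and fits a logistic regression against a binary engagement label. This is a minimal stand-in under assumed inputs, not the paper's actual feature set or models; `stylistic_features`, `fit_engagement_model`, and the feature choices are illustrative.

```python
import re
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def stylistic_features(transcript: str) -> list[float]:
    """A few hand-crafted features in the spirit of the paper:
    vocabulary diversity plus crude lexical/syntactic proxies."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    sentences = [s for s in re.split(r"[.!?]+", transcript) if s.strip()]
    type_token_ratio = len(set(tokens)) / max(len(tokens), 1)  # vocabulary diversity
    mean_word_len = float(np.mean([len(t) for t in tokens])) if tokens else 0.0
    mean_sent_len = len(tokens) / max(len(sentences), 1)       # rough syntax proxy
    return [type_token_ratio, mean_word_len, mean_sent_len]

def fit_engagement_model(transcripts, engaged):
    """transcripts: list of strings; engaged: binary high/low-engagement labels
    (placeholders for whatever engagement signal is available)."""
    X = np.array([stylistic_features(t) for t in transcripts])
    y = np.array(engaged)
    model = LogisticRegression()
    print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
    return model.fit(X, y)
```

Note that the type-token ratio is length-sensitive, so a real pipeline would compute it over fixed-length windows; richer features for distinctiveness, emotion, and syntax would slot into the same interface.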
Related papers
- Annotation Tool and Dataset for Fact-Checking Podcasts [1.6804613362826175]
Podcasts are a popular medium on the web, featuring diverse and multilingual content that often includes unverified claims.
Our tool offers a novel approach to tackle these challenges by enabling real-time annotation of podcast content during playback.
This unique capability allows users to listen to the podcast and annotate key elements, such as check-worthy claims, claim spans, and contextual errors, simultaneously.
arXiv Detail & Related papers (2025-02-03T14:34:17Z)
- Classification of Spontaneous and Scripted Speech for Multilingual Audio [9.925703861731506]
Distinguishing scripted from spontaneous speech is essential for understanding how speaking styles influence speech processing research.
This paper addresses the challenge of building a classifier that generalises well across different formats and languages.
We systematically evaluate models ranging from traditional, handcrafted acoustic and prosodic features to advanced audio transformers.
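For the traditional end of that model range, a minimal sketch (with assumed features, not the paper's) might summarize pitch, energy, and pausing per recording and train an SVM:

```python
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def prosodic_features(wav_path: str) -> np.ndarray:
    """Crude per-recording summary: pitch statistics, energy, and a pause proxy."""
    y, sr = librosa.load(wav_path, sr=16000)
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)   # frame-level pitch track
    rms = librosa.feature.rms(y=y)[0]               # frame-level energy
    pause_rate = float(np.mean(rms < 0.01))         # fraction of near-silent frames
    return np.array([np.nanmean(f0), np.nanstd(f0), float(rms.mean()), pause_rate])

def train_style_classifier(wav_paths, labels):
    """labels: 1 = scripted, 0 = spontaneous (placeholder data)."""
    X = np.stack([prosodic_features(p) for p in wav_paths])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    return clf.fit(X, labels)
```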
arXiv Detail & Related papers (2024-12-16T15:45:10Z)
- WavChat: A Survey of Spoken Dialogue Models [66.82775211793547]
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o, have captured significant attention in the speech domain.
These advanced spoken dialogue models not only comprehend audio, music, and other speech-related features, but also capture stylistic and timbral characteristics in speech.
Despite the progress in spoken dialogue systems, there is a lack of comprehensive surveys that systematically organize and analyze these systems.
arXiv Detail & Related papers (2024-11-15T04:16:45Z)
- Dialogue Quality and Emotion Annotations for Customer Support Conversations [7.218791626731783]
This paper presents a holistic annotation approach for emotion and conversational quality in the context of bilingual customer support conversations.
It provides a unique and valuable resource for the development of text classification models.
arXiv Detail & Related papers (2023-11-23T10:56:14Z)
- Multi-turn Dialogue Comprehension from a Topic-aware Perspective [70.37126956655985]
This paper proposes to model multi-turn dialogues from a topic-aware perspective.
We use a dialogue segmentation algorithm to split a dialogue passage into topic-concentrated fragments in an unsupervised way.
We also present a novel model, Topic-Aware Dual-Attention Matching (TADAM) Network, which takes topic segments as processing elements.
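A minimal stand-in for such an unsupervised segmentation step (TextTiling-style boundary detection over sentence embeddings; not TADAM itself, and the embedding model and threshold are assumptions) could look like:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

def segment_dialogue(utterances: list[str], threshold: float = 0.35) -> list[list[str]]:
    """Split a dialogue into topic-concentrated fragments by cutting wherever
    the similarity between adjacent utterance embeddings drops below a threshold."""
    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode(utterances, normalize_embeddings=True)
    segments, current = [], [utterances[0]]
    for i in range(1, len(utterances)):
        sim = float(np.dot(emb[i - 1], emb[i]))  # cosine similarity (unit-norm embeddings)
        if sim < threshold:
            segments.append(current)
            current = []
        current.append(utterances[i])
    segments.append(current)
    return segments
```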
arXiv Detail & Related papers (2023-09-18T11:03:55Z)
- Can Language Models Learn to Listen? [96.01685069483025]
We present a framework for generating appropriate facial responses from a listener in dyadic social interactions based on the speaker's words.
Our approach autoregressively predicts a response of a listener: a sequence of listener facial gestures, quantized using a VQ-VAE.
We show that our generated listener motion is fluent and reflective of language semantics through quantitative metrics and a qualitative user study.
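A toy version of the autoregressive-generation-over-quantized-codes idea (architecture, sizes, and conditioning here are assumptions for illustration, not the paper's model) might look like:

```python
import torch
import torch.nn as nn

class ListenerGestureLM(nn.Module):
    """Toy autoregressive model over discrete gesture tokens (the kind of codes
    a VQ-VAE encoder would produce), conditioned on speaker-text features."""
    def __init__(self, n_codes=512, dim=256):
        super().__init__()
        self.code_embed = nn.Embedding(n_codes, dim)
        self.rnn = nn.GRU(dim * 2, dim, batch_first=True)
        self.out = nn.Linear(dim, n_codes)

    @torch.no_grad()
    def generate(self, speaker_feats, steps=64):
        """speaker_feats: (1, steps, dim) features derived from the speaker's words."""
        tok = torch.zeros(1, 1, dtype=torch.long)  # arbitrary start code
        h, codes = None, []
        for t in range(steps):
            x = torch.cat([self.code_embed(tok), speaker_feats[:, t:t + 1]], dim=-1)
            y, h = self.rnn(x, h)
            tok = torch.distributions.Categorical(logits=self.out(y[:, -1])).sample().unsqueeze(0)
            codes.append(tok.item())
        return codes  # indices a VQ-VAE decoder would map back to facial motion
```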
arXiv Detail & Related papers (2023-08-21T17:59:02Z)
- Visual-Aware Text-to-Speech [101.89332968344102]
We present a new visual-aware text-to-speech (VA-TTS) task to synthesize speech conditioned on both textual inputs and visual feedback of the listener in face-to-face communication.
We devise a baseline model to fuse phoneme linguistic information and listener visual signals for speech synthesis.
arXiv Detail & Related papers (2023-06-21T05:11:39Z)
- Speaking the Language of Your Listener: Audience-Aware Adaptation via Plug-and-Play Theory of Mind [4.052000839878213]
We model a visually grounded referential game between a knowledgeable speaker and a listener with more limited visual and linguistic experience.
We endow our speaker with the ability to adapt its referring expressions via a simulation module that monitors the effectiveness of planned utterances from the listener's perspective.
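The simulation loop can be sketched as reranking: propose candidate referring expressions, score each against every referent with a simulated listener, and keep the candidate that most decisively picks out the target. `speaker_candidates` and `listener_score` are hypothetical stand-ins for the paper's speaker and listener modules:

```python
def adapt_utterance(target, distractors, speaker_candidates, listener_score):
    """Return the candidate utterance the simulated listener resolves to the
    target most decisively (largest margin over all distractors)."""
    referents = [target] + list(distractors)
    best, best_margin = None, float("-inf")
    for utt in speaker_candidates(target):
        scores = [listener_score(utt, r) for r in referents]
        margin = scores[0] - max(scores[1:])  # target score minus best distractor
        if margin > best_margin:
            best, best_margin = utt, margin
    return best
```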
arXiv Detail & Related papers (2023-05-31T15:17:28Z)
- A Benchmark for Understanding and Generating Dialogue between Characters in Stories [75.29466820496913]
We present the first study to explore whether machines can understand and generate dialogue in stories.
We propose two new tasks including Masked Dialogue Generation and Dialogue Speaker Recognition.
We show the difficulty of the proposed tasks by testing existing models with automatic and manual evaluation on DialStory.
arXiv Detail & Related papers (2022-09-18T10:19:04Z)
- Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion [89.01668641930206]
We present a framework for modeling interactional communication in dyadic conversations.
We autoregressively output multiple possibilities of corresponding listener motion.
Our method organically captures the multimodal and non-deterministic nature of nonverbal dyadic interactions.
arXiv Detail & Related papers (2022-04-18T17:58:04Z)
- Who says like a style of Vitamin: Towards Syntax-Aware Dialogue Summarization using Multi-task Learning [2.251583286448503]
We focus on the association between utterances from individual speakers and unique syntactic structures.
Speakers have unique textual styles that carry linguistic information, analogous to a voiceprint.
We employ multi-task learning of both syntax-aware information and dialogue summarization.
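A minimal sketch of that multi-task wiring (a shared encoder feeding a summarization head and an auxiliary syntax-tagging head, with a weighted loss sum; dimensions, architecture, and the per-token summarization simplification are illustrative assumptions, not the paper's model):

```python
import torch
import torch.nn as nn

class MultiTaskSummarizer(nn.Module):
    """Shared encoder with a summarization head and an auxiliary syntax head;
    the two cross-entropy losses are combined with a fixed weight."""
    def __init__(self, vocab=30000, dim=256, n_syntax_tags=50):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.summary_head = nn.Linear(dim, vocab)         # per-token generation logits
        self.syntax_head = nn.Linear(dim, n_syntax_tags)  # per-token syntax tags

    def forward(self, tokens, summary_targets, syntax_targets, aux_weight=0.3):
        h, _ = self.encoder(self.embed(tokens))           # (batch, seq, dim)
        loss_sum = nn.functional.cross_entropy(
            self.summary_head(h).transpose(1, 2), summary_targets)
        loss_syn = nn.functional.cross_entropy(
            self.syntax_head(h).transpose(1, 2), syntax_targets)
        return loss_sum + aux_weight * loss_syn
```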
arXiv Detail & Related papers (2021-09-29T05:30:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.