Visual Authority and the Rhetoric of Health Misinformation: A Multimodal Analysis of Social Media Videos
- URL: http://arxiv.org/abs/2509.20724v1
- Date: Thu, 25 Sep 2025 03:56:38 GMT
- Title: Visual Authority and the Rhetoric of Health Misinformation: A Multimodal Analysis of Social Media Videos
- Authors: Mohammad Reza Zarei, Barbara Stead-Coyle, Michael Christensen, Sarah Everts, Majid Komeili
- Abstract summary: This study examines how credibility is packaged in nutrition and supplement videos by analyzing the intersection of authority signals, narrative techniques, and monetization. We assemble a cross-platform corpus of 152 public videos from TikTok, Instagram, and YouTube and annotate each on 26 features spanning visual authority, presenter attributes, narrative strategies, and engagement cues.
- Score: 1.4136330551561624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Short-form video platforms are central sites for health advice, where alternative narratives mix useful, misleading, and harmful content. Rather than adjudicating truth, this study examines how credibility is packaged in nutrition and supplement videos by analyzing the intersection of authority signals, narrative techniques, and monetization. We assemble a cross-platform corpus of 152 public videos from TikTok, Instagram, and YouTube and annotate each on 26 features spanning visual authority, presenter attributes, narrative strategies, and engagement cues. A transparent annotation pipeline integrates automatic speech recognition, principled frame selection, and a multimodal model, with human verification on a stratified subsample showing strong agreement. Descriptively, a confident single presenter in studio or home settings dominates, and clinical contexts are rare. Analytically, authority cues such as titles, slides and charts, and certificates frequently occur with persuasive elements including jargon, references, fear or urgency, critiques of mainstream medicine, and conspiracies, and with monetization including sales links and calls to subscribe. References and science-like visuals often travel with emotive and oppositional narratives rather than signaling restraint.
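The verification step described in the abstract, comparing automatic annotations against human labels on a stratified subsample, is typically quantified with a chance-corrected agreement statistic such as Cohen's kappa. A minimal sketch of that computation, assuming binary feature labels; the feature name and example labels are illustrative, not taken from the paper:

```python
# Hypothetical sketch: measuring agreement between the multimodal model's
# annotations and human verification labels for one binary feature
# (e.g. "slides_or_charts"). Labels below are illustrative only.

from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items where the two annotators match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under chance, from each annotator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)

model_labels = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]  # automatic pipeline output
human_labels = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]  # human verification
kappa = cohens_kappa(model_labels, human_labels)  # → 0.8
```

In practice this would be computed per feature across the 26 annotated dimensions, with the stratified subsample ensuring rare feature values are represented in the verification set.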
Related papers
- Implicit Counterfactual Learning for Audio-Visual Segmentation [50.69377287012591]
We propose the implicit counterfactual framework (ICF) to achieve unbiased cross-modal understanding. Due to the lack of semantics, heterogeneous representations may lead to erroneous matches. We introduce the multi-granularity implicit text (MIT) involving video-, segment- and frame-level as the bridge to establish the modality-shared space.
arXiv Detail & Related papers (2025-07-28T11:46:35Z) - Video-Mediated Emotion Disclosure: Expressions of Fear, Sadness, and Joy by People with Schizophrenia on YouTube [2.767257448554864]
We analyzed 200 YouTube videos created by individuals with schizophrenia. Our analysis revealed diverse practices of emotion disclosure through both verbal and visual channels. We found that the deliberate construction of visual elements, including environmental settings, appears to foster more supportive and engaged viewer responses.
arXiv Detail & Related papers (2025-06-12T17:39:54Z) - Framing Analysis of Health-Related Narratives: Conspiracy versus Mainstream Media [3.3181276611945263]
We investigate how the framing of health-related topics, such as COVID-19 and other diseases, differs between conspiracy and mainstream websites.
We find that health-related narratives in conspiracy media are predominantly framed in terms of beliefs, while mainstream media tend to present them in terms of science.
arXiv Detail & Related papers (2024-01-18T14:56:23Z) - Discourse Analysis for Evaluating Coherence in Video Paragraph Captions [99.37090317971312]
We explore a novel discourse-based framework to evaluate the coherence of video paragraphs.
Central to our approach is the discourse representation of videos, which helps in modeling coherence of paragraphs conditioned on coherence of videos.
Our experimental results show that the proposed framework evaluates the coherence of video paragraphs significantly better than all baseline methods.
arXiv Detail & Related papers (2022-01-17T04:23:08Z) - Unboxing Engagement in YouTube Influencer Videos: An Attention-Based Approach [0.3686808512438362]
"What is said" through words (text) is more important than "how it is said" through imagery (video images) or acoustics (audio) in predicting video engagement. We analyze unstructured data from long-form YouTube influencer videos.
arXiv Detail & Related papers (2020-12-22T19:32:52Z) - Cross-Domain Learning for Classifying Propaganda in Online Contents [67.10699378370752]
We present an approach to leverage cross-domain learning, based on labeled documents and sentences from news and tweets, as well as political speeches with a clear difference in their degrees of being propagandistic.
Our experiments demonstrate the usefulness of this approach, and identify difficulties and limitations in various configurations of sources and targets for the transfer step.
arXiv Detail & Related papers (2020-11-13T10:19:13Z) - How-to Present News on Social Media: A Causal Analysis of Editing News Headlines for Boosting User Engagement [14.829079057399838]
We analyze media outlets' current practices using a data-driven approach.
We build a parallel corpus of original news articles and their corresponding tweets that eight media outlets shared.
Then, we explore how those media edited tweets against original headlines and the effects of such changes.
arXiv Detail & Related papers (2020-09-17T06:39:49Z) - "Notic My Speech" -- Blending Speech Patterns With Multimedia [65.91370924641862]
We propose a view-temporal attention mechanism to model both the view dependence and the visemic importance in speech recognition and understanding.
Our proposed method outperformed the existing work by 4.99% in terms of the viseme error rate.
We show that there is a strong correlation between our model's understanding of multi-view speech and the human perception.
arXiv Detail & Related papers (2020-06-12T06:51:55Z) - Multi-Modal Video Forensic Platform for Investigating Post-Terrorist Attack Scenarios [55.82693757287532]
Large scale Video Analytic Platforms (VAP) assist law enforcement agencies (LEA) in identifying suspects and securing evidence.
We present a video analytic platform that integrates visual and audio analytic modules and fuses information from surveillance cameras and video uploads from eyewitnesses.
arXiv Detail & Related papers (2020-04-02T14:29:27Z) - Visually Guided Self Supervised Learning of Speech Representations [62.23736312957182]
We propose a framework for learning audio representations guided by the visual modality in the context of audiovisual speech.
We employ a generative audio-to-video training scheme in which we animate a still image corresponding to a given audio clip and optimize the generated video to be as close as possible to the real video of the speech segment.
We achieve state-of-the-art results for emotion recognition and competitive results for speech recognition.
arXiv Detail & Related papers (2020-01-13T14:53:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.