YouTube Video Analytics for Patient Engagement: Evidence from Colonoscopy Preparation Videos
- URL: http://arxiv.org/abs/2410.02830v1
- Date: Tue, 1 Oct 2024 19:38:46 GMT
- Title: YouTube Video Analytics for Patient Engagement: Evidence from Colonoscopy Preparation Videos
- Authors: Yawen Guo, Xiao Liu, Anjana Susarla, Rema Padman
- Abstract summary: This study demonstrates a data analysis pipeline for retrieving medical information from YouTube videos.
We first use the YouTube Data API to collect metadata of desired videos on select search keywords.
Then we annotate the YouTube video materials for medical information, video understandability, and overall recommendation.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Videos can be an effective way to deliver contextualized, just-in-time medical information for patient education. However, video analysis tasks, from topic identification and retrieval to the extraction and analysis of medical information and understandability from a patient perspective, are extremely challenging. This study demonstrates a data analysis pipeline that retrieves medical information from YouTube videos on preparing for a colonoscopy exam, a much maligned and disliked procedure that patients find challenging to prepare for adequately. We first use the YouTube Data API to collect metadata of desired videos on select search keywords and use the Google Video Intelligence API to analyze text, frame, and object data. Then we annotate the YouTube video materials for medical information, video understandability, and overall recommendation. We develop a bidirectional long short-term memory (BiLSTM) model to identify medical terms in videos and build three classifiers to group videos by the level of encoded medical information, the level of video understandability, and whether the videos are recommended. Our study provides healthcare stakeholders with guidelines and a scalable approach for generating new educational video content to enhance the management of a vast number of health conditions.
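The pipeline's first step, collecting metadata for videos matching selected search keywords, can be sketched with the YouTube Data API. Below is a minimal illustration assuming the google-api-python-client package and a placeholder API key; the search keyword and returned fields are illustrative, not the paper's exact configuration.

```python
from googleapiclient.discovery import build

API_KEY = "YOUR_API_KEY"  # hypothetical placeholder for a Data API v3 key

# Build a client for YouTube Data API v3.
youtube = build("youtube", "v3", developerKey=API_KEY)

def search_videos(query: str, max_results: int = 50) -> list[dict]:
    """Collect basic metadata for videos matching one search keyword."""
    response = youtube.search().list(
        q=query,
        part="snippet",
        type="video",
        maxResults=max_results,  # the API caps a single page at 50 results
    ).execute()
    return [
        {
            "video_id": item["id"]["videoId"],
            "title": item["snippet"]["title"],
            "channel": item["snippet"]["channelTitle"],
            "published_at": item["snippet"]["publishedAt"],
        }
        for item in response.get("items", [])
    ]

metadata = search_videos("colonoscopy preparation")
```

The medical-term identification step is described as a BiLSTM model. A minimal PyTorch sketch of a BiLSTM token tagger follows; the vocabulary size, dimensions, and BIO-style tag set are assumptions for illustration and may differ from the paper's architecture.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Tags each token in a video transcript as part of a medical term or not."""

    def __init__(self, vocab_size: int, embed_dim: int = 100,
                 hidden_dim: int = 128, num_tags: int = 3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # Forward and backward states are concatenated, hence 2 * hidden_dim.
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) -> logits: (batch, seq_len, num_tags)
        embedded = self.embedding(token_ids)
        hidden, _ = self.bilstm(embedded)
        return self.classifier(hidden)

model = BiLSTMTagger(vocab_size=20_000)
logits = model(torch.randint(1, 20_000, (8, 40)))  # dummy batch of 8 sequences
tags = logits.argmax(dim=-1)  # per-token tag ids, e.g. O / B-MED / I-MED
```

Pooling the same BiLSTM states over time would give a video-level representation that could feed the three classifiers described above (medical-information level, understandability, and recommendation), though the paper's exact classifier features are not specified here.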
Related papers
- Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding [63.82450803014141]
Long-form video understanding presents significant challenges due to extensive temporal-spatial complexity.
We propose the Deep Video Discovery agent to leverage an agentic search strategy over segmented video clips.
Our DVD agent achieves SOTA performance, significantly surpassing prior works by a large margin on the challenging LVBench dataset.
arXiv Detail & Related papers (2025-05-23T16:37:36Z) - VideoPath-LLaVA: Pathology Diagnostic Reasoning Through Video Instruction Tuning [2.6954348706500766]
We present VideoPath-LLaVA, the first large multimodal model (LMM) in computational pathology.
It integrates three distinct image scenarios: single patch images, automatically-extracted clips, and manually segmented video pathology images.
By generating detailed histological descriptions and culminating in a definitive sign-out diagnosis, VideoPath-LLaVA bridges visual narratives with diagnostic reasoning.
arXiv Detail & Related papers (2025-05-07T07:41:19Z) - Watch and Learn: Leveraging Expert Knowledge and Language for Surgical Video Understanding [1.024113475677323]
The lack of datasets hinders the development of accurate and comprehensive workflow analysis solutions.
We introduce a novel approach for addressing the sparsity and heterogeneity of data inspired by the human learning procedure of watching experts and understanding their explanations.
We present the first comprehensive solution for dense video captioning (DVC) of surgical videos, addressing this task despite the absence of existing datasets in the surgical domain.
arXiv Detail & Related papers (2025-03-14T13:36:13Z) - Generative Ghost: Investigating Ranking Bias Hidden in AI-Generated Videos [106.5804660736763]
Video information retrieval remains a fundamental approach for accessing video content.
We build on the observation that retrieval models often favor AI-generated content in ad-hoc and image retrieval tasks.
We investigate whether similar biases emerge in the context of challenging video retrieval.
arXiv Detail & Related papers (2025-02-11T07:43:47Z) - VideoRAG: Retrieval-Augmented Generation over Video Corpus [57.68536380621672]
VideoRAG is a framework that dynamically retrieves videos based on their relevance with queries.
VideoRAG is powered by recent Large Video Language Models (LVLMs).
We experimentally validate the effectiveness of VideoRAG, showcasing that it is superior to relevant baselines.
arXiv Detail & Related papers (2025-01-10T11:17:15Z) - Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs [20.168429351519055]
Video understanding is a crucial next step for multimodal large language models (MLLMs).
We propose VideoNIAH (Video Needle In A Haystack), a benchmark construction framework through synthetic video generation.
We conduct a comprehensive evaluation of both proprietary and open-source models, uncovering significant differences in their video understanding capabilities.
arXiv Detail & Related papers (2024-06-13T17:50:05Z) - Detours for Navigating Instructional Videos [58.1645668396789]
We propose VidDetours, a video-language approach that learns to retrieve the targeted temporal segments from a large repository of how-to's.
We show our model's significant improvements over best available methods for video retrieval and question answering, with recall rates exceeding the state of the art by 35%.
arXiv Detail & Related papers (2024-01-03T16:38:56Z) - Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models [81.84810348214113]
Video-based large language models (Video-LLMs) have been recently introduced, targeting both fundamental improvements in perception and comprehension, and a diverse range of user inquiries.
To guide the development of such a model, the establishment of a robust and comprehensive evaluation system becomes crucial.
This paper proposes Video-Bench, a new comprehensive benchmark along with a toolkit specifically designed for evaluating Video-LLMs.
arXiv Detail & Related papers (2023-11-27T18:59:58Z) - Query-aware Long Video Localization and Relation Discrimination for Deep Video Understanding [15.697251303126874]
The Deep Video Understanding (DVU) Challenge aims to push the boundaries of multimodal extraction, fusion, and analytics.
This paper introduces a query-aware method for long video localization and relation discrimination, leveraging an image-language pretrained model.
Our approach achieved first and fourth positions for two groups of movie-level queries.
arXiv Detail & Related papers (2023-10-19T13:26:02Z) - Towards Answering Health-related Questions from Medical Videos: Datasets and Approaches [21.16331827504689]
A growing number of individuals now prefer instructional videos as they offer a series of step-by-step procedures to accomplish particular tasks.
The instructional videos from the medical domain may provide the best possible visual answers to first aid, medical emergency, and medical education questions.
The scarcity of large-scale datasets in the medical domain is a key challenge that hinders the development of applications that can help the public with their health-related questions.
arXiv Detail & Related papers (2023-09-21T16:21:28Z) - InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation [90.71796406228265]
InternVid is a large-scale video-centric multimodal dataset that enables learning powerful and transferable video-text representations.
The InternVid dataset contains over 7 million videos lasting nearly 760K hours, yielding 234M video clips accompanied by detailed descriptions totaling 4.1B words.
arXiv Detail & Related papers (2023-07-13T17:58:32Z) - InternVideo: General Video Foundation Models via Generative and Discriminative Learning [52.69422763715118]
We present general video foundation models, InternVideo, for dynamic and complex video-level understanding tasks.
InternVideo efficiently explores masked video modeling and video-language contrastive learning as the pretraining objectives.
InternVideo achieves state-of-the-art performance on 39 video datasets from extensive tasks including video action recognition/detection, video-language alignment, and open-world video applications.
arXiv Detail & Related papers (2022-12-06T18:09:49Z) - How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios [73.24092762346095]
We introduce two large-scale datasets with over 60,000 videos annotated for emotional response and subjective wellbeing.
The Video Cognitive Empathy dataset contains annotations for distributions of fine-grained emotional responses, allowing models to gain a detailed understanding of affective states.
The Video to Valence dataset contains annotations of relative pleasantness between videos, which enables predicting a continuous spectrum of wellbeing.
arXiv Detail & Related papers (2022-10-18T17:58:25Z) - TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency [133.75876535332003]
We focus on summarizing instructional videos, an under-explored area of video summarization.
Existing video summarization datasets rely on manual frame-level annotations.
We propose an instructional video summarization network that combines a context-aware temporal video encoder and a segment scoring transformer.
arXiv Detail & Related papers (2022-08-14T04:07:40Z) - A Dataset for Medical Instructional Video Classification and Question Answering [16.748852458926162]
This paper introduces a new challenge and datasets to foster research toward designing systems that can understand medical videos.
We believe medical videos may provide the best possible answers to many first aid, medical emergency, and medical education questions.
We have benchmarked each task with the created MedVidCL and MedVidQA datasets and proposed multimodal learning methods.
arXiv Detail & Related papers (2022-01-30T18:06:31Z) - Ultrasound Video Summarization using Deep Reinforcement Learning [12.320114045092291]
We introduce a fully automatic video summarization method tailored to the needs of medical video data.
We show that our method is superior to alternative video summarization methods and that it preserves essential information required by clinical diagnostic standards.
arXiv Detail & Related papers (2020-05-19T15:44:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.