Leveraging ChatGPT for Sponsored Ad Detection and Keyword Extraction in YouTube Videos
- URL: http://arxiv.org/abs/2502.15102v1
- Date: Thu, 20 Feb 2025 23:44:15 GMT
- Title: Leveraging ChatGPT for Sponsored Ad Detection and Keyword Extraction in YouTube Videos
- Authors: Brice Valentin Kok-Shun, Johnny Chan
- Abstract summary: This work-in-progress paper presents a novel approach to detecting sponsored advertisement segments in YouTube videos. Our methodology involves the collection of 421 auto-generated and manual transcripts which are fed into a prompt-engineered GPT-4o for ad detection. The results revealed a significant prevalence of product-related ads across various educational topics.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work-in-progress paper presents a novel approach to detecting sponsored advertisement segments in YouTube videos and comparing the advertisement with the main content. Our methodology involves the collection of 421 auto-generated and manual transcripts, which are then fed into a prompt-engineered GPT-4o for ad detection, KeyBERT for keyword extraction, and another iteration of ChatGPT for category identification. The results revealed a significant prevalence of product-related ads across various educational topics, with categories refined using GPT-4o into a succinct set of 9 content and 4 advertisement categories. This approach provides a scalable and efficient alternative to traditional ad detection methods while offering new insights into the types and relevance of ads embedded within educational content. This study highlights the potential of LLMs in transforming ad detection processes and improving our understanding of advertisement strategies in digital media.
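The ad detection step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt wording, the JSON reply format, and the names `PROMPT_TEMPLATE`, `AdSegment`, `detect_ad_segments`, and `split_content_and_ads` are all hypothetical, and the model call is abstracted as a plain callable so that any chat model (GPT-4o in the paper) could be plugged in.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical prompt: the paper's actual prompt is not reproduced in this listing.
PROMPT_TEMPLATE = (
    "You will be given a YouTube video transcript as numbered lines.\n"
    "Identify every sponsored advertisement segment.\n"
    'Reply with JSON only: {{"segments": [{{"start": <int>, "end": <int>}}]}}\n'
    "Line numbers are inclusive. Use an empty list if there is no ad.\n\n"
    "{transcript}"
)


@dataclass
class AdSegment:
    start: int  # index of the first ad line (0-based, inclusive)
    end: int    # index of the last ad line (inclusive)
    text: str   # transcript text of the segment


def detect_ad_segments(lines: List[str], llm: Callable[[str], str]) -> List[AdSegment]:
    """Ask the model (any prompt -> reply callable) for ad line ranges."""
    numbered = "\n".join(f"{i}: {line}" for i, line in enumerate(lines))
    reply = llm(PROMPT_TEMPLATE.format(transcript=numbered))
    try:
        payload = json.loads(reply)
    except json.JSONDecodeError:
        return []  # treat an unparseable reply as "no ad found"
    if not isinstance(payload, dict):
        return []
    segments = []
    for item in payload.get("segments", []):
        try:
            start, end = int(item["start"]), int(item["end"])
        except (KeyError, TypeError, ValueError):
            continue  # skip malformed entries
        if 0 <= start <= end < len(lines):  # drop out-of-range spans
            segments.append(AdSegment(start, end, " ".join(lines[start:end + 1])))
    return segments


def split_content_and_ads(lines: List[str], segments: List[AdSegment]) -> Tuple[str, str]:
    """Separate main-content text from ad text for downstream comparison."""
    ad_lines = {i for s in segments for i in range(s.start, s.end + 1)}
    content = " ".join(l for i, l in enumerate(lines) if i not in ad_lines)
    ads = " ".join(s.text for s in segments)
    return content, ads
```

The two strings returned by `split_content_and_ads` could then be passed to a keyword extractor (for example, KeyBERT's `KeyBERT().extract_keywords(text)`) to compare the ad with the main content, which is the comparison the abstract describes.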
Related papers
- PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent [71.20471076045916]
Propaganda plays a critical role in shaping public opinion and fueling disinformation.
PropaInsight systematically dissects propaganda into techniques, arousal appeals, and underlying intent.
PropaGaze combines human-annotated data with high-quality synthetic data.
arXiv Detail & Related papers (2024-09-19T06:28:18Z)
- Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles [11.64165958410489]
We develop the largest propaganda dataset to date, comprising 8K paragraphs from newspaper articles, labeled at the text span level following a taxonomy of 23 propagandistic techniques.
Our work offers the first attempt to understand the performance of large language models (LLMs), using GPT-4, for fine-grained propaganda detection from text.
Results showed that GPT-4's performance degrades as the task moves from simply classifying a paragraph as propagandistic or not, to the fine-grained task of detecting propaganda techniques and their manifestation in text.
arXiv Detail & Related papers (2024-02-27T13:02:19Z)
- Large Language Models for Propaganda Span Annotation [10.358271919023903]
This study investigates whether Large Language Models, such as GPT-4, can effectively extract propagandistic spans.
The experiments are performed over a large-scale in-house manually annotated dataset.
arXiv Detail & Related papers (2023-11-16T11:37:54Z)
- Long-Term Ad Memorability: Understanding & Generating Memorable Ads [54.23854539909078]
Despite the importance of long-term memory in marketing and brand building, until now, there has been no large-scale study on the memorability of ads.
We release the first memorability dataset, LAMBDA, consisting of 1749 participants and 2205 ads covering 276 brands.
Running statistical tests over different participant subpopulations and ad types, we find many interesting insights into what makes an ad memorable, e.g., fast-moving ads are more memorable than those with slower scenes.
We present a scalable method to build a high-quality memorable ad generation model by leveraging automatically annotated data.
arXiv Detail & Related papers (2023-09-01T10:27:04Z)
- Persuasion Strategies in Advertisements [68.70313043201882]
We introduce an extensive vocabulary of persuasion strategies and build the first ad image corpus annotated with persuasion strategies.
We then formulate the task of persuasion strategy prediction with multi-modal learning.
We conduct a real-world case study on 1600 advertising campaigns of 30 Fortune-500 companies.
arXiv Detail & Related papers (2022-08-20T07:33:13Z)
- Distinguishing Commercial from Editorial Content in News [0.0]
We aim to differentiate commercial from editorial content using a machine learning model and a lexicon derived from it.
This was accomplished by scraping 1,000 articles and 1,000 advertorials from four different Dutch news sources.
arXiv Detail & Related papers (2021-11-06T16:45:48Z)
- Cross-category Video Highlight Detection via Set-based Learning [55.49267044910344]
We propose a Dual-Learner-based Video Highlight Detection (DL-VHD) framework.
It learns the distinction of target category videos and the characteristics of highlight moments on the source video category.
It outperforms five typical Unsupervised Domain Adaptation (UDA) algorithms on various cross-category highlight detection tasks.
arXiv Detail & Related papers (2021-08-26T13:06:47Z)
- Predicting Online Video Advertising Effects with Multimodal Deep Learning [33.20913249848369]
We propose a method for predicting the click-through rate (CTR) of video advertisements and analyzing the factors that determine the CTR.
In this paper, we demonstrate an optimized framework for accurately predicting the effects by taking advantage of the multimodal nature of online video advertisements.
arXiv Detail & Related papers (2020-12-22T06:24:01Z)
- Cross-Domain Learning for Classifying Propaganda in Online Contents [67.10699378370752]
We present an approach to leverage cross-domain learning, based on labeled documents and sentences from news and tweets, as well as political speeches with a clear difference in their degrees of being propagandistic.
Our experiments demonstrate the usefulness of this approach, and identify difficulties and limitations in various configurations of sources and targets for the transfer step.
arXiv Detail & Related papers (2020-11-13T10:19:13Z)
- Learning to Create Better Ads: Generation and Ranking Approaches for Ad Creative Refinement [26.70647666598025]
We study approaches to refine the given ad text and image by: (i) generating new ad text, (ii) recommending keyphrases for new ad text, and (iii) recommending image tags (objects in image).
Based on A/B tests conducted by multiple advertisers, we form pairwise examples of inferior and superior ad creatives.
We also share broadly applicable insights from our experiments using data from the Yahoo Gemini ad platform.
arXiv Detail & Related papers (2020-08-17T16:46:28Z)
- A4: Evading Learning-based Adblockers [44.149991991963795]
A4 is a tool that crafts adversarial samples of ads to evade AdGraph.
We show that A4 can bypass AdGraph about 60% of the time.
We envision that the algorithmic framework proposed in A4 is also promising for improving adversarial attacks against other learning-based web applications.
arXiv Detail & Related papers (2020-01-29T18:13:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.