Distinguishing Commercial from Editorial Content in News
- URL: http://arxiv.org/abs/2111.03916v1
- Date: Sat, 6 Nov 2021 16:45:48 GMT
- Title: Distinguishing Commercial from Editorial Content in News
- Authors: Timo Kats, Peter van der Putten and Jasper Schelling
- Abstract summary: We aim to differentiate the two using a machine learning model, and a lexicon derived from it.
This was accomplished by scraping 1,000 articles and 1,000 advertorials from four different Dutch news sources.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: How can we distinguish commercial from editorial content in news, or more
specifically, differentiate between advertorials and regular news articles? An
advertorial is a commercial message written and formatted as an article, making
it harder for readers to recognize these as advertising, despite the use of
disclaimers. In our research we aim to differentiate the two using a machine
learning model, and a lexicon derived from it. This was accomplished by
scraping 1,000 articles and 1,000 advertorials from four different Dutch news
sources and classifying these based on textual features. With this setup our
most successful machine learning model had an accuracy of just over 90%. To
generate additional insights into differences between news and advertorial
language, we also analyzed model coefficients and explored the corpus through
co-occurrence networks and t-SNE graphs.
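As an illustration of the pipeline described in the abstract, the sketch below shows one plausible way to classify news articles versus advertorials from bag-of-words features and to read a lexicon off the fitted coefficients. The paper does not publish code, so the TF-IDF features, the logistic-regression classifier, the placeholder corpus, and all parameter choices here are assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical sketch, not the authors' code: features, classifier and all
# parameters below are assumptions made for illustration.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Placeholder corpus; in the paper this would be the ~2,000 scraped Dutch
# news articles (label 0) and advertorials (label 1).
texts = [
    "gemeenteraad stemt over nieuwe begroting",      # news-like
    "minister presenteert rapport over onderwijs",   # news-like
    "profiteer nu van deze unieke aanbieding",       # advertorial-like
    "ontdek het gemak van dit nieuwe product",       # advertorial-like
]
labels = np.array([0, 0, 1, 1])

# Textual features: TF-IDF over unigrams (one plausible reading of
# "classifying these based on textual features").
vectorizer = TfidfVectorizer(lowercase=True)
X = vectorizer.fit_transform(texts)

# Linear classifier; on the real corpus a held-out test split or
# cross-validation would be used to obtain the reported ~90% accuracy.
clf = LogisticRegression(max_iter=1000).fit(X, labels)

# Derive a lexicon from the model coefficients: terms with the largest
# positive weights lean advertorial, the most negative lean editorial.
terms = vectorizer.get_feature_names_out()
order = np.argsort(clf.coef_[0])
print("advertorial cues:", [terms[i] for i in order[-5:][::-1]])
print("editorial cues:  ", [terms[i] for i in order[:5]])
```

The same document-term matrix could also feed the exploratory analyses mentioned above, for example a two-dimensional t-SNE embedding of the documents or a term co-occurrence network, to visualize how advertorial and editorial language separate.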
Related papers
- TieFake: Title-Text Similarity and Emotion-Aware Fake News Detection [15.386007761649251]
We propose a novel Title-Text similarity and emotion-aware Fake news detection (TieFake) method by jointly modeling the multi-modal context information and the author sentiment.
Specifically, we employ BERT and ResNeSt to learn the representations for text and images, and utilize a publisher emotion extractor to capture the author's subjective emotion in the news content.
arXiv Detail & Related papers (2023-04-19T04:47:36Z)
- Persuasion Strategies in Advertisements [68.70313043201882]
We introduce an extensive vocabulary of persuasion strategies and build the first ad image corpus annotated with persuasion strategies.
We then formulate the task of persuasion strategy prediction with multi-modal learning.
We conduct a real-world case study on 1600 advertising campaigns of 30 Fortune-500 companies.
arXiv Detail & Related papers (2022-08-20T07:33:13Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- Cross-Domain Learning for Classifying Propaganda in Online Contents [67.10699378370752]
We present an approach to leverage cross-domain learning, based on labeled documents and sentences from news and tweets, as well as political speeches with a clear difference in their degrees of being propagandistic.
Our experiments demonstrate the usefulness of this approach, and identify difficulties and limitations in various configurations of sources and targets for the transfer step.
arXiv Detail & Related papers (2020-11-13T10:19:13Z)
- LTIatCMU at SemEval-2020 Task 11: Incorporating Multi-Level Features for Multi-Granular Propaganda Span Identification [70.1903083747775]
This paper describes our submission for the task of Propaganda Span Identification in news articles.
We introduce a BERT-BiLSTM based span-level propaganda classification model that identifies which token spans within the sentence are indicative of propaganda.
arXiv Detail & Related papers (2020-08-11T16:14:47Z)
- Leveraging Declarative Knowledge in Text and First-Order Logic for Fine-Grained Propaganda Detection [139.3415751957195]
We study the detection of propagandistic text fragments in news articles.
We introduce an approach to inject declarative knowledge of fine-grained propaganda techniques.
arXiv Detail & Related papers (2020-04-29T13:46:15Z)
- Transform and Tell: Entity-Aware News Image Captioning [77.4898875082832]
We propose an end-to-end model which generates captions for images embedded in news articles.
We address the first challenge by associating words in the caption with faces and objects in the image, via a multi-modal, multi-head attention mechanism.
We tackle the second challenge with a state-of-the-art transformer language model that uses byte-pair-encoding to generate captions as a sequence of word parts.
arXiv Detail & Related papers (2020-04-17T05:44:37Z)
- BaitWatcher: A lightweight web interface for the detection of incongruent news headlines [27.29585619643952]
BaitWatcher is a lightweight web interface that guides readers in estimating the likelihood of incongruence in news articles before clicking on the headlines.
BaitWatcher utilizes a hierarchical recurrent encoder that efficiently learns complex textual representations of a news headline and its associated body text.
arXiv Detail & Related papers (2020-03-23T23:43:02Z)
- SirenLess: reveal the intention behind news [31.757138364005087]
We present SirenLess, a visual analytical system for misleading news detection by linguistic features.
The system features article explorer, a novel interactive tool that integrates news metadata and linguistic features to reveal semantic structures of news articles.
We use SirenLess to analyze 18 news articles from different sources and summarize some helpful patterns for misleading news detection.
arXiv Detail & Related papers (2020-01-08T20:36:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.