Exploring Text Representations for Online Misinformation
- URL: http://arxiv.org/abs/2412.18618v1
- Date: Fri, 13 Dec 2024 20:22:36 GMT
- Title: Exploring Text Representations for Online Misinformation
- Authors: Martins Samuel Dogo
- Abstract summary: Mis- and disinformation, collectively called fake news, continue to menace society.
This thesis contributes to the creation of representations that are useful for detecting misinformation.
It demonstrates the effectiveness of topic features for fake news detection, using classification and clustering.
- Score: 0.0
- Abstract: Mis- and disinformation, commonly collectively called fake news, continue to menace society. The impact of this age-old problem is perhaps most evident today in politics and healthcare. However, fake news is affecting an increasing number of domains. It takes many different forms and continues to shapeshift as technology advances. Arguably, though, it spreads most widely in textual form, e.g., through social media posts and blog articles. Thus, it is imperative to thwart the spread of textual misinformation, which first requires detecting it. This thesis contributes to the creation of representations that are useful for detecting misinformation. Firstly, it develops a novel method for extracting textual features from news articles for misinformation detection. These features harness the disparity between the thematic coherence of authentic and false news stories. In other words, the composition of themes discussed differs significantly between the two groups as a story progresses. Secondly, it demonstrates the effectiveness of topic features for fake news detection, using classification and clustering. Clustering is particularly useful because it alleviates the need for a labelled dataset, which can be labour-intensive and time-consuming to amass. More generally, it contributes towards a better understanding of misinformation and ways of detecting it using Machine Learning and Natural Language Processing.
Related papers
- Adapting Fake News Detection to the Era of Large Language Models [48.5847914481222]
We study the interplay between machine-(paraphrased) real news, machine-generated fake news, human-written fake news, and human-written real news.
Our experiments reveal an interesting pattern that detectors trained exclusively on human-written articles can indeed perform well at detecting machine-generated fake news, but not vice versa.
arXiv Detail & Related papers (2023-11-02T08:39:45Z)
- The effect of stemming and lemmatization on Portuguese fake news text classification [0.0]
With the popularization of the internet, smartphones and social media, information is being spread quickly and easily.
With a bigger flow of information, some people are trying to disseminate deceptive information and fake news.
Some techniques can help to reach a good result when we are dealing with text data.
arXiv Detail & Related papers (2023-10-17T15:26:40Z)
- TieFake: Title-Text Similarity and Emotion-Aware Fake News Detection [15.386007761649251]
We propose a novel Title-Text similarity and emotion-aware Fake news detection (TieFake) method by jointly modeling the multi-modal context information and the author sentiment.
Specifically, we employ BERT and ResNeSt to learn the representations for text and images, and utilize a publisher emotion extractor to capture the author's subjective emotion in the news content.
arXiv Detail & Related papers (2023-04-19T04:47:36Z)
- Multiverse: Multilingual Evidence for Fake News Detection [71.51905606492376]
Multiverse is a new feature based on multilingual evidence that can be used for fake news detection.
The hypothesis of the usage of cross-lingual evidence as a feature for fake news detection is confirmed.
arXiv Detail & Related papers (2022-11-25T18:24:17Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- Supporting verification of news articles with automated search for semantically similar articles [0.0]
We propose an evidence retrieval approach to handle fake news.
The learning task is formulated as an unsupervised machine learning problem.
We find that our approach is agnostic to concept drifts, i.e. the machine learning task is independent of the hypotheses in a text.
arXiv Detail & Related papers (2021-03-29T12:56:59Z)
- Cross-Domain Learning for Classifying Propaganda in Online Contents [67.10699378370752]
We present an approach to leverage cross-domain learning, based on labeled documents and sentences from news and tweets, as well as political speeches with a clear difference in their degrees of being propagandistic.
Our experiments demonstrate the usefulness of this approach, and identify difficulties and limitations in various configurations of sources and targets for the transfer step.
arXiv Detail & Related papers (2020-11-13T10:19:13Z)
- Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News [57.9843300852526]
We introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions.
To identify the possible weaknesses that adversaries can exploit, we create a NeuralNews dataset composed of 4 different types of generated articles.
In addition to the valuable insights gleaned from our user study experiments, we provide a relatively effective approach based on detecting visual-semantic inconsistencies.
arXiv Detail & Related papers (2020-09-16T14:13:15Z)
- BaitWatcher: A lightweight web interface for the detection of incongruent news headlines [27.29585619643952]
BaitWatcher is a lightweight web interface that guides readers in estimating the likelihood of incongruence in news articles before clicking on the headlines.
BaitWatcher utilizes a hierarchical recurrent encoder that efficiently learns complex textual representations of a news headline and its associated body text.
arXiv Detail & Related papers (2020-03-23T23:43:02Z)
- SAFE: Similarity-Aware Multi-Modal Fake News Detection [8.572654816871873]
We propose a new method to detect fake news based on its text, images, or their "mismatches".
Such representations of news textual and visual information along with their relationship are jointly learned and used to predict fake news.
We conduct extensive experiments on large-scale real-world data, which demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2020-02-19T02:51:04Z)
- Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements [55.33496599723126]
Disinformation, including fake news, has become a global phenomenon due to its explosive growth.
Despite the recent progress in detecting disinformation and fake news, it is still non-trivial due to its complexity, diversity, multi-modality, and costs of fact-checking or annotation.
arXiv Detail & Related papers (2020-01-02T21:01:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.