ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media
- URL: http://arxiv.org/abs/2305.14225v2
- Date: Wed, 12 Jun 2024 06:25:15 GMT
- Title: ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media
- Authors: Kung-Hsiang Huang, Hou Pong Chan, Kathleen McKeown, Heng Ji
- Abstract summary: We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information.
To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles.
Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
- Score: 74.93847489218008
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Considerable advancements have been made to tackle the misrepresentation of information derived from reference articles in the domains of fact-checking and faithful summarization. However, an unaddressed aspect remains - the identification of social media posts that manipulate information within associated news articles. This task presents a significant challenge, primarily due to the prevalence of personal opinions in such posts. We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information. To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles. Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance. Additionally, we have developed a simple yet effective basic model that outperforms LLMs significantly on the ManiTweet dataset. Finally, we have conducted an exploratory analysis of human-written tweets, unveiling intriguing connections between manipulation and the domain and factuality of news articles, as well as revealing that manipulated sentences are more likely to encapsulate the main story or consequences of a news outlet.
Related papers
- A Semi-supervised Fake News Detection using Sentiment Encoding and LSTM with Self-Attention [0.0]
We propose a semi-supervised self-learning method in which sentiment analysis is obtained from state-of-the-art pretrained models.
Our learning model is trained in a semi-supervised fashion and incorporates LSTM with self-attention layers.
We benchmark our model on a dataset of 20,000 news items along with their feedback; it shows better precision, recall, and other measures than competitive methods in fake news detection.
arXiv Detail & Related papers (2024-07-27T20:00:10Z)
- Countering Misinformation via Emotional Response Generation [15.383062216223971]
The proliferation of misinformation on social media platforms (SMPs) poses a significant danger to public health, social cohesion and democracy.
Previous research has shown how social correction can be an effective way to curb misinformation.
We present VerMouth, the first large-scale dataset comprising roughly 12 thousand claim-response pairs.
arXiv Detail & Related papers (2023-11-17T15:37:18Z)
- Prompt-and-Align: Prompt-Based Social Alignment for Few-Shot Fake News Detection [50.07850264495737]
"Prompt-and-Align" (P&A) is a novel prompt-based paradigm for few-shot fake news detection.
We show that P&A sets a new state of the art for few-shot fake news detection by significant margins.
arXiv Detail & Related papers (2023-09-28T13:19:43Z)
- Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z)
- Applying Automatic Text Summarization for Fake News Detection [4.2177790395417745]
The distribution of fake news is not a new but a rapidly growing problem.
We present an approach to the problem that combines the power of transformer-based language models.
Our framework, CMTR-BERT, combines multiple text representations and enables the incorporation of contextual information.
arXiv Detail & Related papers (2022-04-04T21:00:55Z)
- Improved Topic modeling in Twitter through Community Pooling [0.0]
Twitter posts are short and often less coherent than other text documents.
We propose a new pooling scheme for topic modeling in Twitter, which groups tweets whose authors belong to the same community.
Results show that our Community Pooling method outperformed other methods on the majority of metrics in two heterogeneous datasets.
arXiv Detail & Related papers (2021-12-20T17:05:32Z)
- A Study of Fake News Reading and Annotating in Social Media Context [1.0499611180329804]
We present an eye-tracking study in which we let 44 lay participants casually read through a social media feed containing posts with news articles, some of which were fake.
In a second run, we asked the participants to decide on the truthfulness of these articles.
We also describe a follow-up qualitative study with a similar scenario but this time with 7 expert fake news annotators.
arXiv Detail & Related papers (2021-09-26T08:11:17Z)
- Machine Learning Explanations to Prevent Overtrust in Fake News Detection [64.46876057393703]
This research investigates the effects of an Explainable AI assistant embedded in news review platforms for combating the propagation of fake news.
We design a news reviewing and sharing interface, create a dataset of news stories, and train four interpretable fake news detection algorithms.
For a deeper understanding of Explainable AI systems, we discuss interactions between user engagement, mental model, trust, and performance measures in the process of explaining.
arXiv Detail & Related papers (2020-07-24T05:42:29Z)
- Leveraging Multi-Source Weak Social Supervision for Early Detection of Fake News [67.53424807783414]
Social media has greatly enabled people to participate in online activities at an unprecedented rate.
This unrestricted access also exacerbates the spread of misinformation and fake news online, which might cause confusion and chaos unless it is detected early and mitigated.
We jointly leverage the limited amount of clean data along with weak signals from social engagements to train deep neural networks in a meta-learning framework to estimate the quality of different weak instances.
Experiments on real-world datasets demonstrate that the proposed framework outperforms state-of-the-art baselines for early detection of fake news without using any user engagements at prediction time.
arXiv Detail & Related papers (2020-04-03T18:26:33Z)
- Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements [55.33496599723126]
Disinformation, including fake news, has become a global phenomenon due to its explosive growth.
Despite recent progress in detecting disinformation and fake news, the task remains non-trivial due to its complexity, diversity, multi-modality, and the costs of fact-checking or annotation.
arXiv Detail & Related papers (2020-01-02T21:01:02Z)