Inference of Media Bias and Content Quality Using Natural-Language Processing
- URL: http://arxiv.org/abs/2212.00237v1
- Date: Thu, 1 Dec 2022 03:04:55 GMT
- Title: Inference of Media Bias and Content Quality Using Natural-Language Processing
- Authors: Zehan Chao, Denali Molitor, Deanna Needell, and Mason A. Porter
- Abstract summary: We present a framework to infer both political bias and content quality of media outlets from text.
We apply a bidirectional long short-term memory (LSTM) neural network to a data set of more than 1 million tweets.
Our results illustrate the importance of incorporating word order into machine-learning methods for text analysis.
- Score: 6.092956184948962
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Media bias can significantly impact the formation and development of opinions
and sentiments in a population. It is thus important to study the emergence and
development of partisan media and political polarization. However, it is
challenging to quantitatively infer the ideological positions of media outlets.
In this paper, we present a quantitative framework to infer both political bias
and content quality of media outlets from text, and we illustrate this
framework with empirical experiments with real-world data. We apply a
bidirectional long short-term memory (LSTM) neural network to a data set of
more than 1 million tweets to generate a two-dimensional ideological-bias and
content-quality measurement for each tweet. We then infer a "media-bias
chart" of (bias, quality) coordinates for the media outlets by integrating the
(bias, quality) measurements of the tweets of the media outlets. We also apply
a variety of baseline machine-learning methods, such as a naive-Bayes method
and a support-vector machine (SVM), to infer the bias and quality values for
each tweet. All of these baselines use a bag-of-words representation. We find
that the LSTM-network approach performs best of the examined methods. Our
results illustrate the importance of incorporating word order into
machine-learning methods for text analysis.
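As a concrete illustration of the kind of model the abstract describes (a minimal sketch under assumed settings, not the authors' released code), the snippet below shows a bidirectional LSTM that maps a tokenized tweet to a two-dimensional (bias, quality) prediction and then averages tweet-level predictions per outlet to place each outlet on a media-bias chart. The vocabulary size, embedding and hidden dimensions, and the mean-pooling aggregation are assumptions, not the paper's reported choices.

```python
# Hedged sketch (not the authors' implementation): a bidirectional LSTM that
# maps a tokenized tweet to a 2-D (bias, quality) prediction, plus a simple
# per-outlet average of tweet-level predictions to build a "media-bias chart".
# Vocabulary size, dimensions, and mean pooling are illustrative assumptions.
import torch
import torch.nn as nn


class BiasQualityLSTM(nn.Module):
    def __init__(self, vocab_size=50_000, embed_dim=128, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden_dim, 2)  # -> (bias, quality)

    def forward(self, token_ids):            # token_ids: (batch, seq_len)
        x = self.embed(token_ids)
        _, (h, _) = self.lstm(x)              # h: (2, batch, hidden_dim)
        h = torch.cat([h[0], h[1]], dim=1)    # concat forward/backward states
        return self.head(h)                   # (batch, 2)


def outlet_chart(model, tweets_by_outlet):
    """Average tweet-level (bias, quality) predictions for each outlet."""
    model.eval()
    chart = {}
    with torch.no_grad():
        for outlet, batches in tweets_by_outlet.items():
            preds = torch.cat([model(b) for b in batches])
            chart[outlet] = preds.mean(dim=0).tolist()  # [bias, quality]
    return chart
```

The bag-of-words baselines mentioned in the abstract could be sketched analogously with scikit-learn, e.g., MultinomialNB or LinearSVC fit on a CountVectorizer representation of the same tweets.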
Related papers
- Mapping the Media Landscape: Predicting Factual Reporting and Political Bias Through Web Interactions [0.7249731529275342]
We propose an extension to a recently presented news media reliability estimation method.
We assess the classification performance of four reinforcement learning strategies on a large news media hyperlink graph.
Our experiments, targeting two challenging bias descriptors, factual reporting and political bias, showed a significant performance improvement at the source media level.
arXiv Detail & Related papers (2024-10-23T08:18:26Z) - Modeling Political Orientation of Social Media Posts: An Extended
Analysis [0.0]
Developing machine learning models to characterize political polarization on online social media presents significant challenges.
These challenges mainly stem from various factors such as the lack of annotated data, presence of noise in social media datasets, and the sheer volume of data.
We introduce two methods that leverage news media bias and post content to label social media posts.
We demonstrate that current machine learning models can achieve improved performance in predicting the political orientation of social media posts.
arXiv Detail & Related papers (2023-11-21T03:34:20Z) - Introducing MBIB -- the first Media Bias Identification Benchmark Task
and Dataset Collection [24.35462897801079]
We introduce the Media Bias Identification Benchmark (MBIB) to group different types of media bias under a common framework.
After reviewing 115 datasets, we select nine tasks and carefully propose 22 associated datasets for evaluating media bias detection techniques.
Our results suggest that while hate speech, racial bias, and gender bias are easier to detect, models struggle to handle certain bias types, e.g., cognitive and political bias.
arXiv Detail & Related papers (2023-04-25T20:49:55Z) - Towards Corpus-Scale Discovery of Selection Biases in News Coverage:
Comparing What Sources Say About Entities as a Start [65.28355014154549]
This paper investigates the challenges of building scalable NLP systems for discovering patterns of media selection biases directly from news content in massive-scale news corpora.
We show the capabilities of the framework through a case study on NELA-2020, a corpus of 1.8M news articles in English from 519 news sources worldwide.
arXiv Detail & Related papers (2023-04-06T23:36:45Z) - Bias or Diversity? Unraveling Fine-Grained Thematic Discrepancy in U.S.
News Headlines [63.52264764099532]
We use a large dataset of 1.8 million news headlines from major U.S. media outlets spanning from 2014 to 2022.
We quantify the fine-grained thematic discrepancy related to four prominent topics - domestic politics, economic issues, social issues, and foreign affairs.
Our findings indicate that on domestic politics and social issues, the discrepancy can be attributed to a certain degree of media bias.
arXiv Detail & Related papers (2023-03-28T03:31:37Z) - Computational Assessment of Hyperpartisanship in News Titles [55.92100606666497]
We first adopt a human-guided machine learning framework to develop a new dataset for hyperpartisan news title detection.
Overall, the Right media tends to use proportionally more hyperpartisan titles.
We identify three major topics including foreign issues, political systems, and societal issues that are suggestive of hyperpartisanship in news titles.
arXiv Detail & Related papers (2023-01-16T05:56:58Z) - GREENER: Graph Neural Networks for News Media Profiling [24.675574340841163]
We study the problem of profiling news media on the Web with respect to their factuality of reporting and bias.
Our main focus is on modeling the similarity between media outlets based on the overlap of their audience.
Prediction accuracy is found to improve by 2.5-27 macro-F1 points for the two tasks.
arXiv Detail & Related papers (2022-11-10T12:46:29Z) - Cross-Domain Learning for Classifying Propaganda in Online Contents [67.10699378370752]
We present an approach to leverage cross-domain learning, based on labeled documents and sentences from news and tweets, as well as political speeches with a clear difference in their degrees of being propagandistic.
Our experiments demonstrate the usefulness of this approach, and identify difficulties and limitations in various configurations of sources and targets for the transfer step.
arXiv Detail & Related papers (2020-11-13T10:19:13Z) - Weakly-Supervised Aspect-Based Sentiment Analysis via Joint
Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z) - Deep Learning Techniques for Future Intelligent Cross-Media Retrieval [58.20547387332133]
Cross-media retrieval plays a significant role in big data applications.
We provide a novel taxonomy according to the challenges faced by multi-modal deep learning approaches.
We present some well-known cross-media datasets used for retrieval.
arXiv Detail & Related papers (2020-07-21T09:49:33Z) - A multi-layer approach to disinformation detection on Twitter [4.663548775064491]
We employ a multi-layer representation of Twitter diffusion networks, and we compute for each layer a set of global network features.
Experimental results with two large-scale datasets, corresponding to diffusion cascades of news shared respectively in the United States and Italy, show that a simple Logistic Regression model is able to classify disinformation vs mainstream networks with high accuracy.
We believe that our network-based approach provides useful insights which pave the way to the future development of a system to detect misleading and harmful information spreading on social media.
arXiv Detail & Related papers (2020-02-28T09:25:53Z)