Disentangling Structure and Style: Political Bias Detection in News by
Inducing Document Hierarchy
- URL: http://arxiv.org/abs/2304.02247v2
- Date: Fri, 27 Oct 2023 11:35:04 GMT
- Authors: Jiwoo Hong, Yejin Cho, Jaemin Jung, Jiyoung Han, James Thorne
- Abstract summary: We introduce a novel multi-head hierarchical attention model that effectively encodes the structure of long documents through a diverse ensemble of attention heads.
We demonstrate that our method overcomes the dependency on each news outlet's writing style and outperforms previous approaches in robustness and accuracy.
- Score: 8.919312558800573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address an important gap in detecting political bias in news articles.
Previous works that perform document classification can be influenced by the
writing style of each news outlet, leading to overfitting and limited
generalizability. Our approach overcomes this limitation by considering both
the sentence-level semantics and the document-level rhetorical structure,
resulting in a more robust and style-agnostic approach to detecting political
bias in news articles. We introduce a novel multi-head hierarchical attention
model that effectively encodes the structure of long documents through a
diverse ensemble of attention heads. While journalism follows a formalized
rhetorical structure, the writing style may vary by news outlet. We demonstrate
that our method overcomes this domain dependency and outperforms previous
approaches for robustness and accuracy. Further analysis and human evaluation
demonstrate the ability of our model to capture common discourse structures in
journalism. Our code is available at:
https://github.com/xfactlab/emnlp2023-Document-Hierarchy
Related papers
- DocNet: Semantic Structure in Inductive Bias Detection Models [0.4779196219827508]
In this paper, we explore an often overlooked aspect of bias detection in documents: the semantic structure of news articles.
We present DocNet, a novel, inductive, and low-resource document embedding and bias detection model.
We also demonstrate that the semantic structures of news articles from opposing partisan sides, as represented in document-level graph embeddings, have significant similarities.
arXiv Detail & Related papers (2024-06-16T14:51:12Z)
- Tracking the Newsworthiness of Public Documents [107.12303391111014]
This work focuses on news coverage of local public policy in the San Francisco Bay Area by the San Francisco Chronicle.
First, we gather news articles, public policy documents and meeting recordings and link them using probabilistic relational modeling.
Second, we define a new task: newsworthiness prediction, to predict if a policy item will get covered.
arXiv Detail & Related papers (2023-11-16T10:05:26Z)
- Finding Pragmatic Differences Between Disciplines [14.587150614245123]
We learn a fixed set of domain-agnostic descriptors for document sections and "retrofit" the corpus to these descriptors.
We analyze the position and ordering of these descriptors across documents to understand the relationship between discipline and structure.
Our findings lay the foundation for future work in assessing research quality, domain style transfer, and further pragmatic analysis.
arXiv Detail & Related papers (2023-09-30T00:46:14Z)
- Conflicts, Villains, Resolutions: Towards models of Narrative Media Framing [19.589945994234075]
We revisit a widely used conceptualization of framing from the communication sciences which explicitly captures elements of narratives.
We adapt an effective annotation paradigm that breaks a complex annotation task into a series of simpler binary questions.
We explore automatic multi-label prediction of our frames with supervised and semi-supervised approaches.
arXiv Detail & Related papers (2023-06-03T08:50:13Z)
- Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents [61.63208012250885]
We formulate recognizing semantic differences as a token-level regression task.
We study three unsupervised approaches that rely on a masked language model.
Our results show that an approach based on word alignment and sentence-level contrastive learning has a robust correlation to gold labels.
arXiv Detail & Related papers (2023-05-22T17:58:04Z)
- Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases insignificant changes in input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- KCD: Knowledge Walks and Textual Cues Enhanced Political Perspective Detection in News Media [28.813287482918344]
We propose KCD, a political perspective detection approach to enable multi-hop knowledge reasoning.
Specifically, we generate random walks on external knowledge graphs and infuse them with news text representations.
We then construct a heterogeneous information network to jointly model news content as well as semantic, syntactic and entity cues in news articles.
arXiv Detail & Related papers (2022-04-08T13:06:09Z)
- Analyzing Political Bias and Unfairness in News Articles at Different Levels of Granularity [35.19976910093135]
The research presented in this paper not only addresses the automatic detection of bias but also goes one step further by exploring how political bias and unfairness are manifested linguistically.
We utilize a new corpus of 6964 news articles with labels derived from adfontesmedia.com and develop a neural model for bias assessment.
arXiv Detail & Related papers (2020-10-20T22:25:00Z)
- Neural Deepfake Detection with Factual Structure of Text [78.30080218908849]
We propose a graph-based model for deepfake detection of text.
Our approach represents the factual structure of a given document as an entity graph.
Our model can distinguish the difference in the factual structure between machine-generated text and human-written text.
arXiv Detail & Related papers (2020-10-15T02:35:31Z)
- Multilevel Text Alignment with Cross-Document Attention [59.76351805607481]
Existing alignment methods operate at a single, predefined level.
We propose a new learning approach that equips previously established hierarchical attention encoders for representing documents with a cross-document attention component.
arXiv Detail & Related papers (2020-10-03T02:52:28Z)
- Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset with the same writing style of the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)