Bangla BERT for Hyperpartisan News Detection: A Semi-Supervised and Explainable AI Approach
- URL: http://arxiv.org/abs/2507.21242v1
- Date: Mon, 28 Jul 2025 18:02:01 GMT
- Title: Bangla BERT for Hyperpartisan News Detection: A Semi-Supervised and Explainable AI Approach
- Authors: Mohammad Mehadi Hasan, Fatema Binte Hassan, Md Al Jubair, Zobayer Ahmed, Sazzatul Yeakin, Md Masum Billah
- Abstract summary: State-of-the-art transformer-based model designed to enhance classification accuracy for hyperpartisan news. With a remarkable accuracy score of 95.65%, Bangla BERT outperforms conventional approaches.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the current digital landscape, misinformation circulates rapidly, shaping public perception and causing societal divisions. It is difficult to identify hyperpartisan news in Bangla since there aren't many sophisticated natural language processing methods available for this low-resource language. Without effective detection methods, biased content can spread unchecked, posing serious risks to informed discourse. To address this gap, our research fine-tunes Bangla BERT. This is a state-of-the-art transformer-based model, designed to enhance classification accuracy for hyperpartisan news. We evaluate its performance against traditional machine learning models and implement semi-supervised learning to enhance predictions further. Not only that, we use LIME to provide transparent explanations of the model's decision-making process, which helps to build trust in its outcomes. With a remarkable accuracy score of 95.65%, Bangla BERT outperforms conventional approaches, according to our trial data. The findings of this study demonstrate the usefulness of transformer models even in environments with limited resources, which opens the door to further improvements in this area.
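The abstract does not say which semi-supervised scheme is used. As an illustration only, the sketch below assumes confidence-thresholded self-training (pseudo-labeling): a model trained on labeled articles labels the unlabeled pool, and only predictions at or above a threshold are added to the training set before retraining. A toy word-count naive Bayes classifier stands in for the fine-tuned Bangla BERT, and the function names, the 0.8 threshold, and the English example texts are all hypothetical.

```python
from collections import Counter
import math


def train(labeled):
    """Fit per-class word counts (toy naive Bayes, a stand-in for the real model)."""
    counts, docs = {}, Counter()
    for text, label in labeled:
        counts.setdefault(label, Counter()).update(text.split())
        docs[label] += 1
    return counts, docs


def predict_proba(model, text):
    """Return {label: probability} from Laplace-smoothed naive Bayes scores."""
    counts, docs = model
    vocab_size = len({w for c in counts.values() for w in c})
    total_docs = sum(docs.values())
    log_scores = {}
    for label, c in counts.items():
        n = sum(c.values())
        score = math.log(docs[label] / total_docs)
        for w in text.split():
            # The extra +1 in the denominator reserves mass for unseen words.
            score += math.log((c[w] + 1) / (n + vocab_size + 1))
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(exp.values())
    return {k: v / z for k, v in exp.items()}


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Grow the labeled set with confident pseudo-labels, retraining each round."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, remaining = [], []
        for text in pool:
            probs = predict_proba(model, text)
            label = max(probs, key=probs.get)
            if probs[label] >= threshold:
                confident.append((text, label))
            else:
                remaining.append(text)
        if not confident:
            break  # nothing new to learn from; stop early
        labeled.extend(confident)
        pool = remaining
    return train(labeled), labeled, pool


# Hypothetical toy data (English placeholders, not from the paper's dataset).
seed = [
    ("traitors destroy nation shameful", "hyperpartisan"),
    ("council approves road budget", "neutral"),
]
pool = [
    "shameful traitors betray nation",
    "council approves budget tuesday",
    "weather",
]
model, grown, leftover = self_train(seed, pool)
print(len(grown), leftover)
```

Texts the model cannot label confidently stay in the unlabeled pool rather than being forced into a class, which limits how far early mistakes propagate through later rounds.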
Related papers
- A Meaningful Perturbation Metric for Evaluating Explainability Methods [55.09730499143998]
We introduce a novel approach, which harnesses image generation models to perform targeted perturbation. Specifically, we focus on inpainting only the high-relevance pixels of an input image to modify the model's predictions while preserving image fidelity. This is in contrast to existing approaches, which often produce out-of-distribution modifications, leading to unreliable results.
arXiv Detail & Related papers (2025-04-09T11:46:41Z) - Breaking the Fake News Barrier: Deep Learning Approaches in Bangla Language [0.0]
This study presents a strategy that uses a deep learning technique, specifically the Gated Recurrent Unit (GRU), to recognize fake news in the Bangla language. The proposed method includes intensive data preprocessing, which covers lemmatization, tokenization, and addressing class imbalance by oversampling. The performance of the model is evaluated with standard metrics: precision, recall, F1 score, and accuracy.
arXiv Detail & Related papers (2025-01-30T21:41:26Z) - A Regularized LSTM Method for Detecting Fake News Articles [0.0]
This paper develops an advanced machine learning solution for detecting fake news articles.
We leverage a comprehensive dataset of news articles, including 23,502 fake news articles and 21,417 accurate news articles.
Our work highlights the potential for deploying such models in real-world applications.
arXiv Detail & Related papers (2024-11-16T05:54:36Z) - Identifying and Mitigating Social Bias Knowledge in Language Models [52.52955281662332]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases. FAST surpasses state-of-the-art baselines with superior debiasing performance. This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z) - Enhancing Depressive Post Detection in Bangla: A Comparative Study of TF-IDF, BERT and FastText Embeddings [0.0]
This study introduces a well-grounded approach to identify depressive social media posts in Bangla.
The dataset used in this work, annotated by domain experts, includes both depressive and non-depressive posts.
To address the issue of class imbalance, we utilised random oversampling for the minority class.
arXiv Detail & Related papers (2024-07-12T11:40:17Z) - Prompt-and-Align: Prompt-Based Social Alignment for Few-Shot Fake News Detection [50.07850264495737]
"Prompt-and-Align" (P&A) is a novel prompt-based paradigm for few-shot fake news detection.
We show that P&A sets a new state of the art for few-shot fake news detection performance by significant margins.
arXiv Detail & Related papers (2023-09-28T13:19:43Z) - Performance Analysis of Transformer Based Models (BERT, ALBERT and RoBERTa) in Fake News Detection [0.0]
The three areas whose residents are most exposed to hoaxes and misinformation are Banten, DKI Jakarta and West Java.
A previous study indicates that the transformer model BERT outperforms non-transformer approaches.
In this research, we explore those transformer models and find that ALBERT outperforms the other models, with 87.6% accuracy, 86.9% precision, 86.9% F1-score, and a run-time of 174.5 s/epoch.
arXiv Detail & Related papers (2023-08-09T13:33:27Z) - Interpretable Fake News Detection with Topic and Deep Variational Models [2.15242029196761]
We focus on fake news detection using interpretable features and methods.
We have developed a deep probabilistic model that integrates a dense representation of textual news.
Our model achieves comparable performance to state-of-the-art competing models.
arXiv Detail & Related papers (2022-09-04T05:31:00Z) - Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z) - InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z) - A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation [53.8171136907856]
We introduce a set of simple yet effective data augmentation strategies dubbed cutoff.
cutoff relies on sampling consistency and thus adds little computational overhead.
cutoff consistently outperforms adversarial training and achieves state-of-the-art results on the IWSLT2014 German-English dataset.
arXiv Detail & Related papers (2020-09-29T07:08:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.