Incongruity Detection between Bangla News Headline and Body Content
through Graph Neural Network
- URL: http://arxiv.org/abs/2211.07709v1
- Date: Wed, 26 Oct 2022 20:57:45 GMT
- Title: Incongruity Detection between Bangla News Headline and Body Content
through Graph Neural Network
- Authors: Md Aminul Haque Palash, Akib Khan, Kawsarul Islam, MD Abdullah Al
Nasim, Ryan Mohammad Bin Shahjahan
- Abstract summary: Incongruity between news headlines and body content is a common method of deception used to attract readers.
We propose a graph-based hierarchical dual encoder model that learns the content similarity and contradiction between Bangla news headlines and content paragraphs effectively.
The proposed Bangla graph-based neural network model achieves above 90% accuracy on various Bangla news datasets.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Incongruity between news headlines and the body content is a common method of
deception used to attract readers. Profitable headlines pique readers' interest
and encourage them to visit a specific website. This is usually done by adding
an element of dishonesty, using enticements that do not precisely reflect the
content being delivered. As a result, automatic detection of incongruent news
between headline and body content using language analysis has gained the
research community's attention. However, most existing solutions to this
problem are developed for English, leaving low-resource languages out of the
picture. Bangla ranks 7th among the 100 most widely spoken languages, which
motivates us to pay special attention to it. Furthermore, Bangla has a more
complex syntactic structure and fewer natural language processing resources,
which makes NLP tasks such as incongruity detection and stance detection
challenging. To tackle this problem for Bangla, we propose a graph-based
hierarchical dual encoder (BGHDE)
model that learns the content similarity and contradiction between Bangla news
headlines and content paragraphs effectively. The experimental results show
that the proposed Bangla graph-based neural network model achieves above 90%
accuracy on various Bangla news datasets.
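To make the headline-versus-paragraph idea concrete, here is a minimal sketch of a graph in which one headline node is linked to each paragraph node and the edge weights are aggregated into a single congruity score. It is not the BGHDE model: bag-of-words cosine similarity stands in for the learned encoders, and the names `bow`, `cosine`, `build_graph`, and `congruity_score` are hypothetical, chosen only for illustration. The example strings are English placeholders.

```python
import math
from collections import Counter


def bow(text):
    """Whitespace-token counts; an assumed stand-in for a learned text encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_graph(headline, paragraphs):
    """Star graph: node 0 is the headline, nodes 1..n are the paragraphs.

    Returns (headline_node, paragraph_node, weight) edges, where the weight
    is the headline-paragraph similarity.
    """
    h = bow(headline)
    return [(0, i + 1, cosine(h, bow(p))) for i, p in enumerate(paragraphs)]


def congruity_score(headline, paragraphs):
    """Mean edge weight: one aggregation step over the headline's neighbours."""
    edges = build_graph(headline, paragraphs)
    return sum(w for _, _, w in edges) / len(edges) if edges else 0.0


if __name__ == "__main__":
    headline = "city council approves new bridge"
    congruent = [
        "the city council voted to approve the new bridge",
        "construction of the bridge starts next year",
    ]
    incongruent = ["ten tips for better sleep", "doctors recommend eight hours"]
    print(congruity_score(headline, congruent))
    print(congruity_score(headline, incongruent))
```

A low score suggests that the body shares little with the headline. A trained model would replace the fixed similarity with learned node representations and a classifier.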
Related papers
- BanglaNLP at BLP-2023 Task 2: Benchmarking different Transformer Models
for Sentiment Analysis of Bangla Social Media Posts [0.46040036610482665]
This paper presents our submission to Task 2 (Sentiment Analysis of Bangla Social Media Posts) of the BLP Workshop.
Our quantitative results show that transfer learning helps the models learn better in this low-resource language scenario.
We obtain a micro-F1 of 67.02% on the test set, ranking 21st on the shared-task leaderboard.
arXiv Detail & Related papers (2023-10-13T16:46:38Z)
- NusaWrites: Constructing High-Quality Corpora for Underrepresented and
Extremely Low-Resource Languages [54.808217147579036]
We conduct a case study on Indonesian local languages.
We compare the effectiveness of online scraping, human translation, and paragraph writing by native speakers in constructing datasets.
Our findings demonstrate that datasets generated through paragraph writing by native speakers exhibit superior quality in terms of lexical diversity and cultural content.
arXiv Detail & Related papers (2023-09-19T14:42:33Z)
- On Evaluation of Bangla Word Analogies [0.8658596218544772]
This paper presents a high-quality dataset for evaluating the quality of Bangla word embeddings.
Despite being the 7th most-spoken language in the world, Bangla is a low-resource language and popular NLP models fail to perform well.
arXiv Detail & Related papers (2023-04-10T14:27:35Z)
- Utilizing Wordnets for Cognate Detection among Indian Languages [50.83320088758705]
We detect cognate word pairs among ten Indian languages with Hindi.
We use deep learning methodologies to predict whether a word pair is cognate or not.
We report improved performance of up to 26%.
arXiv Detail & Related papers (2021-12-30T16:46:28Z)
- Harnessing Cross-lingual Features to Improve Cognate Detection for
Low-resource Languages [50.82410844837726]
We demonstrate the use of cross-lingual word embeddings for detecting cognates among fourteen Indian languages.
We evaluate our methods to detect cognates on a challenging dataset of twelve Indian languages.
We observe an improvement of up to 18 percentage points in F-score for cognate detection.
arXiv Detail & Related papers (2021-12-16T11:17:58Z)
- Fine-Grained Image Generation from Bangla Text Description using
Attentional Generative Adversarial Network [0.0]
We propose Bangla Attentional Generative Adversarial Network (AttnGAN) that allows intensified, multi-stage processing for high-resolution Bangla text-to-image generation.
For the first time, a fine-grained image is generated from Bangla text using attentional GAN.
arXiv Detail & Related papers (2021-09-24T05:31:01Z)
- LadRa-Net: Locally-Aware Dynamic Re-read Attention Net for Sentence
Semantic Matching [66.65398852962177]
We develop a novel Dynamic Re-read Network (DRr-Net) for sentence semantic matching.
We extend DRr-Net to Locally-Aware Dynamic Re-read Attention Net (LadRa-Net).
Experiments on two popular sentence semantic matching tasks demonstrate that DRr-Net can significantly improve the performance of sentence semantic matching.
arXiv Detail & Related papers (2021-08-06T02:07:04Z)
- End-to-End Natural Language Understanding Pipeline for Bangla
Conversational Agents [0.43012765978447565]
We propose a novel approach to build a business assistant which can communicate in Bangla and Bangla Transliteration in English with high confidence consistently.
We use Rasa Open Source Framework, fastText embeddings, Polyglot embeddings, Flask, and other systems as building blocks.
We present a pipeline for intent classification and entity extraction which achieves reasonable performance.
arXiv Detail & Related papers (2021-07-12T16:09:22Z)
- Bangla Text Classification using Transformers [2.3475904942266697]
Text classification has been one of the earliest problems in NLP.
In this work, we fine-tune multilingual Transformer models for Bangla text classification tasks.
We obtain state-of-the-art results on six benchmark datasets, improving upon previous results by 5-29% accuracy across different tasks.
arXiv Detail & Related papers (2020-11-09T14:12:07Z)
- Intrinsic Probing through Dimension Selection [69.52439198455438]
Most modern NLP systems make use of pre-trained contextual representations that attain astonishingly high performance on a variety of tasks.
Such high performance should not be possible unless some form of linguistic structure inheres in these representations, and a wealth of research has sprung up on probing for it.
In this paper, we draw a distinction between intrinsic probing, which examines how linguistic information is structured within a representation, and the extrinsic probing popular in prior work, which only argues for the presence of such information by showing that it can be successfully extracted.
arXiv Detail & Related papers (2020-10-06T15:21:08Z)
- Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information.
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
arXiv Detail & Related papers (2020-04-07T01:06:36Z)