The effect of stemming and lemmatization on Portuguese fake news text
classification
- URL: http://arxiv.org/abs/2310.11344v1
- Date: Tue, 17 Oct 2023 15:26:40 GMT
- Title: The effect of stemming and lemmatization on Portuguese fake news text
classification
- Authors: Lucca de Freitas Santos, Murilo Varges da Silva
- Abstract summary: With the popularization of the internet, smartphones and social media, information is being spread quickly and easily.
With a bigger flow of information, some people are trying to disseminate deceptive information and fake news.
Some techniques can help to reach a good result when we are dealing with text data.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the popularization of the internet, smartphones and social media,
information spreads quickly and easily. This implies a greater flow of
information in the world, but it also brings a problem that is harming society:
the dissemination of fake news. With this greater flow of information, some
people try to disseminate deceptive information and fake news. The automatic
detection of fake news is a challenging task, because obtaining good results
requires dealing with linguistic problems, especially in languages that have
not yet been comprehensively studied. Some techniques can help to reach good
results when dealing with text data. The motivation for detecting this
deceptive information is that people need to know which information is true
and trustworthy and which is not. In this work, we present the effect that
pre-processing methods such as lemmatization and stemming have on fake news
classification; to that end, we designed several classifier models applying
different pre-processing techniques. The results show that the pre-processing
step is important for obtaining better results. Stemming and lemmatization are
interesting methods that need further study in order to develop techniques
focused on the Portuguese language, so that better results can be reached.
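To make the two pre-processing steps named in the abstract concrete, the following is a minimal sketch in Python. It is not the authors' pipeline: the suffix list, the `LEMMAS` lookup table, and the function names `stem`, `lemmatize`, and `bag_of_words` are all hypothetical and deliberately tiny, standing in for a real Portuguese stemmer and a dictionary-backed lemmatizer. It only illustrates how each normalization collapses inflected forms before the text reaches a classifier.

```python
from collections import Counter

# Toy Portuguese suffixes (an assumption for illustration, not a real
# stemming rule set), sorted so the longest suffix is tried first.
SUFFIXES = sorted(
    ("amentos", "amento", "ando", "aram", "adas", "ados",
     "ada", "ado", "ar", "as", "os", "a", "o", "s"),
    key=len,
    reverse=True,
)

# Hypothetical lookup table standing in for a lemmatizer's dictionary.
LEMMAS = {
    "notícias": "notícia",
    "falsas": "falso",
    "falsa": "falso",
    "foram": "ser",
    "é": "ser",
    "compartilhadas": "compartilhar",
    "compartilhada": "compartilhar",
}


def stem(token, min_stem=3):
    """Strip the longest matching suffix, keeping at least min_stem characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem:
            return token[: -len(suffix)]
    return token


def lemmatize(token):
    """Map a token to its dictionary form; unknown tokens pass through unchanged."""
    return LEMMAS.get(token, token)


def bag_of_words(text, normalizer=None):
    """Lowercase, split on whitespace, strip punctuation, optionally normalize, count."""
    tokens = (t.strip(".,;:!?\"'()") for t in text.lower().split())
    tokens = [t for t in tokens if t]
    if normalizer is not None:
        tokens = [normalizer(t) for t in tokens]
    return Counter(tokens)


# Example sentence invented for this sketch.
TEXT = "Notícias falsas foram compartilhadas. Notícia falsa é compartilhada."

raw_vocab = bag_of_words(TEXT)
stemmed_vocab = bag_of_words(TEXT, stem)
lemmatized_vocab = bag_of_words(TEXT, lemmatize)
```

On this example sentence the feature vocabulary shrinks from 8 distinct tokens to 5 after stemming and to 4 after lemmatization. The stems (such as `compartilh`) are not valid words, while the lemmas are, which is the usual trade-off between the two techniques.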
Related papers
- Detection of Human and Machine-Authored Fake News in Urdu [2.013675429941823]
Social media has amplified the spread of fake news.
Traditional fake news detection methods relying on linguistic cues become less effective.
We propose a hierarchical detection strategy to improve the accuracy and robustness.
arXiv Detail & Related papers (2024-10-25T12:42:07Z)
- Ethio-Fake: Cutting-Edge Approaches to Combat Fake News in Under-Resourced Languages Using Explainable AI [44.21078435758592]
Misinformation can spread quickly due to the ease of creating and disseminating content.
Traditional approaches to fake news detection often rely solely on content-based features.
We propose a comprehensive approach that integrates social context-based features with news content features.
arXiv Detail & Related papers (2024-10-03T15:49:35Z)
- Adapting Fake News Detection to the Era of Large Language Models [48.5847914481222]
We study the interplay between machine-(paraphrased) real news, machine-generated fake news, human-written fake news, and human-written real news.
Our experiments reveal an interesting pattern that detectors trained exclusively on human-written articles can indeed perform well at detecting machine-generated fake news, but not vice versa.
arXiv Detail & Related papers (2023-11-02T08:39:45Z)
- fakenewsbr: A Fake News Detection Platform for Brazilian Portuguese [0.6775616141339018]
This paper presents a comprehensive study on detecting fake news in Brazilian Portuguese.
We propose a machine learning-based approach that leverages natural language processing techniques, including TF-IDF and Word2Vec.
We develop a user-friendly web platform, fakenewsbr.com, to facilitate the verification of news articles' veracity.
arXiv Detail & Related papers (2023-09-20T04:10:03Z)
- Multiverse: Multilingual Evidence for Fake News Detection [71.51905606492376]
Multiverse is a new feature based on multilingual evidence that can be used for fake news detection.
The hypothesis of the usage of cross-lingual evidence as a feature for fake news detection is confirmed.
arXiv Detail & Related papers (2022-11-25T18:24:17Z)
- The use of Data Augmentation as a technique for improving neural network accuracy in detecting fake news about COVID-19 [0.0]
This paper aims to present how the application of Natural Language Processing (NLP) and data augmentation techniques can improve the performance of a neural network for better detection of fake news in the Portuguese language.
arXiv Detail & Related papers (2022-05-01T11:52:53Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- A Systematic Review on the Detection of Fake News Articles [0.0]
It has been argued that fake news and the spread of false information pose a threat to societies throughout the world.
To combat this threat, a number of Natural Language Processing (NLP) approaches have been developed.
This paper aims to delineate the approaches for fake news detection that are most performant, identify limitations with existing approaches, and suggest ways these can be mitigated.
arXiv Detail & Related papers (2021-10-18T21:29:11Z)
- Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News [57.9843300852526]
We introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions.
To identify the possible weaknesses that adversaries can exploit, we create a NeuralNews dataset composed of 4 different types of generated articles.
In addition to the valuable insights gleaned from our user study experiments, we provide a relatively effective approach based on detecting visual-semantic inconsistencies.
arXiv Detail & Related papers (2020-09-16T14:13:15Z)
- Machine Learning Explanations to Prevent Overtrust in Fake News Detection [64.46876057393703]
This research investigates the effects of an Explainable AI assistant embedded in news review platforms for combating the propagation of fake news.
We design a news reviewing and sharing interface, create a dataset of news stories, and train four interpretable fake news detection algorithms.
For a deeper understanding of Explainable AI systems, we discuss interactions between user engagement, mental model, trust, and performance measures in the process of explaining.
arXiv Detail & Related papers (2020-07-24T05:42:29Z)