Experiments with truth using Machine Learning: Spectral analysis and explainable classification of synthetic, false, and genuine information
- URL: http://arxiv.org/abs/2407.05464v1
- Date: Sun, 7 Jul 2024 18:31:09 GMT
- Title: Experiments with truth using Machine Learning: Spectral analysis and explainable classification of synthetic, false, and genuine information
- Authors: Vishnu S. Pendyala, Madhulika Dutta
- Abstract summary: This paper analyzes synthetic, false, and genuine information in the form of text from spectral analysis, visualization, and explainability perspectives.
Various embedding techniques on multiple datasets are used to represent information.
Classification is done using multiple machine learning algorithms.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Misinformation remains a major societal problem, and the arrival of Large Language Models (LLMs) has only added to it. This paper analyzes synthetic, false, and genuine information in the form of text from spectral analysis, visualization, and explainability perspectives to answer why the problem remains unsolved despite years of research and a plethora of solutions in the literature. Various embedding techniques on multiple datasets are used to represent information for this purpose. The diverse spectral and non-spectral methods applied to these embeddings include t-distributed Stochastic Neighbor Embedding (t-SNE), Principal Component Analysis (PCA), and Variational Autoencoders (VAEs). Classification is done using multiple machine learning algorithms. Local Interpretable Model-Agnostic Explanations (LIME), SHapley Additive exPlanations (SHAP), and Integrated Gradients are used to explain the classifications. The analysis and the generated explanations show that misinformation is closely intertwined with genuine information and that machine learning algorithms are not as effective at separating the two as the literature claims.
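To make the methodology concrete, below is a minimal sketch of the kind of pipeline the abstract describes: embed texts, project the embeddings with a spectral method (PCA) and a non-spectral one (t-SNE), train a classifier, and attribute its predictions with SHAP. The toy corpus, TF-IDF embedding, and logistic-regression classifier are illustrative assumptions rather than the authors' exact setup; the VAE and Integrated Gradients stages are omitted for brevity.

```python
# Minimal sketch of the pipeline the abstract describes. The toy corpus,
# TF-IDF embedding, and logistic-regression classifier are placeholders;
# the paper evaluates several embeddings, datasets, and algorithms.
import numpy as np
import shap
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import TSNE

# Toy labels: 1 = false/synthetic claim, 0 = genuine statement.
texts = [
    "miracle cure eliminates all disease overnight",
    "water boils at 100 degrees celsius at sea level",
    "secret study proves the moon landing was staged",
    "the human heart pumps blood through the circulatory system",
    "this one trick makes you rich instantly",
    "regular exercise is associated with lower cardiovascular risk",
]
labels = np.array([1, 0, 1, 0, 1, 0])

# Embed the texts; the paper uses multiple embedding techniques.
X = TfidfVectorizer().fit_transform(texts).toarray()

# Spectral view: PCA projects onto directions of maximal variance.
pca_2d = PCA(n_components=2).fit_transform(X)

# Non-spectral view: t-SNE preserves local neighborhood structure.
tsne_2d = TSNE(n_components=2, perplexity=2.0, random_state=0).fit_transform(X)
print("PCA projection:\n", pca_2d, "\nt-SNE projection:\n", tsne_2d)

# Classify, then explain the classifier's behavior with SHAP.
clf = LogisticRegression(max_iter=1000).fit(X, labels)
shap_values = shap.LinearExplainer(clf, X).shap_values(X)
print("SHAP attributions for the first document:", shap_values[0])
```

If the two classes overlap in the 2-D projections, the classifier's decision boundary has little clean signal to exploit, which is the intuition behind the paper's central finding.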
Related papers
- DEMAU: Decompose, Explore, Model and Analyse Uncertainties [0.8287206589886881]
DEMAU is an open-source educational, exploratory, and analytical tool that allows users to visualize and explore several types of uncertainty for classification models in machine learning.
arXiv Detail & Related papers (2024-09-12T14:57:28Z)
- A Critical Assessment of Interpretable and Explainable Machine Learning for Intrusion Detection [0.0]
We study the use of overly complex and opaque ML models, unaccounted-for data imbalances and correlated features, inconsistent influential features across different explanation methods, and the implausible utility of explanations.
Specifically, we advise avoiding complex opaque models such as Deep Neural Networks and instead using interpretable ML models such as Decision Trees.
We find that feature-based model explanations are most often inconsistent across different settings; a minimal consistency check in this spirit is sketched after this list.
arXiv Detail & Related papers (2024-07-04T15:35:42Z)
- Ontology Embedding: A Survey of Methods, Applications and Resources [54.3453925775069]
Ontologies are widely used for representing domain knowledge and metadata.
One straightforward solution is to integrate statistical analysis and machine learning.
Numerous papers have been published on embedding, but a lack of systematic reviews hinders researchers from gaining a comprehensive understanding of this field.
arXiv Detail & Related papers (2024-06-16T14:49:19Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate the opacity of similarity models by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis [128.0532113800092]
We present a mechanistic interpretation of Transformer-based LMs on arithmetic questions.
This provides insights into how information related to arithmetic is processed by LMs.
arXiv Detail & Related papers (2023-05-24T11:43:47Z)
- Azimuth: Systematic Error Analysis for Text Classification [3.1679600401346706]
Azimuth is an open-source tool to perform error analysis for text classification.
We propose an approach comprising dataset analysis and model quality assessment.
arXiv Detail & Related papers (2022-12-16T01:10:41Z)
- No Pattern, No Recognition: a Survey about Reproducibility and Distortion Issues of Text Clustering and Topic Modeling [0.0]
Machine learning algorithms can be used to extract knowledge from unlabeled texts.
Unsupervised learning can produce variable results depending on the machine learning algorithm chosen.
The presence of outliers and anomalies can be a determining factor.
arXiv Detail & Related papers (2022-08-02T19:51:43Z)
- Human-in-the-Loop Disinformation Detection: Stance, Sentiment, or Something Else? [93.91375268580806]
Both politics and pandemics have recently provided ample motivation for the development of machine learning-enabled disinformation (a.k.a. fake news) detection algorithms.
Existing literature has focused primarily on the fully-automated case, but the resulting techniques cannot reliably detect disinformation on the varied topics, sources, and time scales required for military applications.
By leveraging an already-available analyst as a human-in-the-loop, canonical machine learning techniques such as sentiment analysis, aspect-based sentiment analysis, and stance detection become plausible methods for a partially automated disinformation detection system.
arXiv Detail & Related papers (2021-11-09T13:30:34Z)
- Learning outside the Black-Box: The pursuit of interpretable models [78.32475359554395]
This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function.
Our interpretation represents a leap forward from the previous state of the art.
arXiv Detail & Related papers (2020-11-17T12:39:44Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
- Visualisation and knowledge discovery from interpretable models [0.0]
We introduce a few intrinsically interpretable models that are also capable of handling missing values.
We demonstrate the algorithms on a synthetic dataset and a real-world one.
arXiv Detail & Related papers (2020-05-07T17:37:06Z)
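As a concrete illustration of the explanation-inconsistency finding noted in the intrusion-detection entry above, here is a hedged sketch that explains the same logistic-regression prediction with both LIME and SHAP and intersects their top-5 features. The scikit-learn breast-cancer dataset, the linear model, and the loose string matching of LIME's feature descriptions are illustrative choices, not the cited paper's protocol.

```python
# Hedged sketch: compare LIME and SHAP attributions for one prediction.
# Dataset and model are illustrative placeholders, not the cited setup.
import numpy as np
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

data = load_breast_cancer()
X, y, names = data.data, data.target, list(data.feature_names)
clf = LogisticRegression(max_iter=10000).fit(X, y)

# SHAP: exact attributions for linear models (feature-independence assumption).
shap_row = shap.LinearExplainer(clf, X).shap_values(X)[0]
shap_top = {names[i] for i in np.argsort(-np.abs(shap_row))[:5]}

# LIME: local surrogate model fit around the same instance.
lime_exp = LimeTabularExplainer(X, feature_names=names, mode="classification")
lime_pairs = lime_exp.explain_instance(
    X[0], clf.predict_proba, num_features=5
).as_list()
# LIME reports strings like "mean radius <= 11.70"; match names loosely.
lime_top = {n for n in names if any(n in feat for feat, _ in lime_pairs)}

print("Top-5 overlap between LIME and SHAP:", shap_top & lime_top)
```

A small or empty intersection across many instances would mirror the inconsistency that assessment reports; a stable intersection would not.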