Related papers: Blog Data Showdown: Machine Learning vs Neuro-Symbolic Models for Gender Classification

URL: http://arxiv.org/abs/2512.16687v1
Date: Thu, 18 Dec 2025 15:53:10 GMT
Title: Blog Data Showdown: Machine Learning vs Neuro-Symbolic Models for Gender Classification
Authors: Natnael Tilahun Sinshaw, Mengmei He, Tadesse K. Bahiru, Sudhir Kumar Mohapatra
Abstract summary: This study presents a comparative analysis of the widely used machine learning algorithms, namely Support Vector Machines (SVM), Naive Bayes (NB), Logistic Regression (LR), AdaBoost, XGBoost, and an SVM variant (SVM_R) with neuro-symbolic AI (NeSy). The experimental results show that the NeSy approach matched strong MLP results despite a limited dataset.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text classification problems, such as gender classification from blogs, are a mature research area that has been well studied using machine learning algorithms. They have application domains in market analysis, customer recommendation, and recommendation systems. This study presents a comparative analysis of the widely used machine learning algorithms, namely Support Vector Machines (SVM), Naive Bayes (NB), Logistic Regression (LR), AdaBoost, XGBoost, and an SVM variant (SVM_R) with neuro-symbolic AI (NeSy). The paper also explores the effect of text representations such as TF-IDF, the Universal Sentence Encoder (USE), and RoBERTa. Additionally, various feature extraction techniques, including Chi-Square, Mutual Information, and Principal Component Analysis, are explored. Building on these, we compare the machine learning and deep learning approaches with the NeSy approach. The experimental results show that the NeSy approach matched strong MLP results despite a limited dataset. Future work on this research will expand the knowledge base, the scope of embedding types, and the hyperparameter configuration to further study the effectiveness of the NeSy approach.
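As a rough illustration of two of the techniques the abstract names, the sketch below computes TF-IDF weights and a Chi-Square score for a single term against a binary label. It is not the paper's pipeline: the function names `tfidf` and `chi_square`, the whitespace tokenizer, the unsmoothed `log(N / df)` weighting, and the four toy documents with their labels are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def chi_square(docs, labels, term):
    """Chi-square statistic for the 2x2 table of term presence vs. binary label."""
    n = len(docs)
    present = [term in doc.lower().split() for doc in docs]
    pairs = list(zip(present, labels))
    # Observed counts of the contingency table.
    a = sum(1 for p, y in pairs if p and y == 1)          # term present, label 1
    b = sum(1 for p, y in pairs if p and y == 0)          # term present, label 0
    c = sum(1 for p, y in pairs if not p and y == 1)      # term absent, label 1
    d = sum(1 for p, y in pairs if not p and y == 0)      # term absent, label 0
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


# Hypothetical toy corpus; the labels are arbitrary class ids, not real data.
docs = [
    "football game tonight",
    "new recipe blog post",
    "football blog post",
    "recipe for dinner tonight",
]
labels = [1, 0, 1, 0]

vectors = tfidf(docs)
for term in ("football", "blog"):
    print(term, chi_square(docs, labels, term))
```

In this toy corpus a term that appears only in one class gets a high Chi-Square score, while a term spread evenly across both classes scores zero, which is the property feature selection relies on.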

Related papers

Automated Research Article Classification and Recommendation Using NLP and ML [0.5486463492959637]
This paper presents an automated framework for research article classification and recommendation. We use a large-scale arXiv.org dataset spanning more than three decades. To complement classification, we incorporate a recommendation module based on the cosine similarity of vectorized articles.
arXiv Detail & Related papers (2025-10-07T01:24:35Z)
Comparison of Machine Learning Models to Classify Documents on Digital Development [0.0]
This study employs a publicly available document database on worldwide digital development interventions categorised under twelve areas. The study concludes that the amount of data is not the sole factor affecting the performance; features like similarity within classes and dissimilarity among classes are also crucial.
arXiv Detail & Related papers (2025-10-01T09:53:28Z)
An experimental survey and Perspective View on Meta-Learning for Automated Algorithms Selection and Parametrization [0.0]
We provide an overview of the state of the art in this continuously evolving field. AutoML makes machine learning techniques accessible to domain scientists who are interested in applying advanced analytics.
arXiv Detail & Related papers (2025-04-08T16:51:22Z)
Enhancing literature review with LLM and NLP methods. Algorithmic trading case [0.0]
This study utilizes machine learning algorithms to analyze and organize knowledge in the field of algorithmic trading. By filtering a dataset of 136 million research papers, we identified 14,342 relevant articles published between 1956 and Q1 2020.
arXiv Detail & Related papers (2024-10-23T13:37:27Z)
AI-Aided Kalman Filters [65.35350122917914]
The Kalman filter (KF) and its variants are among the most celebrated algorithms in signal processing. Recent developments illustrate the possibility of fusing deep neural networks (DNNs) with classic Kalman-type filtering. This article provides a tutorial-style overview of design approaches for incorporating AI in aiding KF-type algorithms.
arXiv Detail & Related papers (2024-10-16T06:47:53Z)
Embedding in Recommender Systems: A Survey [54.55152033023537]
This survey presents a comprehensive analysis of advances in recommender system embedding techniques. In matrix-based scenarios, collaborative filtering generates embeddings that effectively model user-item preferences. We introduce emerging approaches, including AutoML, hashing techniques, and quantization methods, to enhance performance.
arXiv Detail & Related papers (2023-10-28T06:31:06Z)
Comparative Analysis of Libraries for the Sentimental Analysis [0.0]
This study's main goal is to provide a comparison of libraries using machine learning methods. Five Python and R libraries, NLTK, TextBlob, VADER, Transformers (GPT and BERT pretrained), and Tidytext, will be used in the study to apply sentiment analysis techniques.
arXiv Detail & Related papers (2023-07-26T17:21:53Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
Deep Representational Similarity Learning for analyzing neural signatures in task-based fMRI dataset [81.02949933048332]
This paper develops Deep Representational Similarity Learning (DRSL), a deep extension of Representational Similarity Analysis (RSA). DRSL is appropriate for analyzing similarities between various cognitive tasks in fMRI datasets with a large number of subjects.
arXiv Detail & Related papers (2020-09-28T18:30:14Z)
A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques. We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
arXiv Detail & Related papers (2020-09-25T12:01:53Z)
Estimating Structural Target Functions using Machine Learning and Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models. This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics. We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.