A Comparative Study of Feature Selection in Tsetlin Machines
- URL: http://arxiv.org/abs/2508.06991v1
- Date: Sat, 09 Aug 2025 14:02:15 GMT
- Title: A Comparative Study of Feature Selection in Tsetlin Machines
- Authors: Vojtech Halenka, Ole-Christoffer Granmo, Lei Jiao, Per-Arne Andersen,
- Abstract summary: Feature Selection (FS) is crucial for improving model interpretability, reducing complexity, and sometimes for enhancing accuracy.<n>The recently introduced Tsetlin machine (TM) offers interpretable clause-based learning, but lacks established tools for estimating feature importance.<n>This study establishes the first comprehensive baseline for FS in TM and paves the way for developing specialized TM-specific interpretability techniques.
- Score: 14.150011713654331
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature Selection (FS) is crucial for improving model interpretability, reducing complexity, and sometimes for enhancing accuracy. The recently introduced Tsetlin machine (TM) offers interpretable clause-based learning, but lacks established tools for estimating feature importance. In this paper, we adapt and evaluate a range of FS techniques for TMs, including classical filter and embedded methods as well as post-hoc explanation methods originally developed for neural networks (e.g., SHAP and LIME) and a novel family of embedded scorers derived from TM clause weights and Tsetlin automaton (TA) states. We benchmark all methods across 12 datasets, using evaluation protocols, like Remove and Retrain (ROAR) strategy and Remove and Debias (ROAD), to assess causal impact. Our results show that TM-internal scorers not only perform competitively but also exploit the interpretability of clauses to reveal interacting feature patterns. Simpler TM-specific scorers achieve similar accuracy retention at a fraction of the computational cost. This study establishes the first comprehensive baseline for FS in TM and paves the way for developing specialized TM-specific interpretability techniques.
Related papers
- Benchmarking Few-shot Transferability of Pre-trained Models with Improved Evaluation Protocols [123.73663884421272]
Few-shot transfer has been revolutionized by stronger pre-trained models and improved adaptation algorithms.<n>We establish FEWTRANS, a comprehensive benchmark containing 10 diverse datasets.<n>By releasing FEWTRANS, we aim to provide a rigorous "ruler" to streamline reproducible advances in few-shot transfer learning research.
arXiv Detail & Related papers (2026-02-28T05:41:57Z) - LLM-FS: Zero-Shot Feature Selection for Effective and Interpretable Malware Detection [0.0]
We investigate whether large models (LLMs) can guide feature selection in a zero-shot setting, using only feature names and task descriptions, as a viable alternative to traditional approaches.<n>Our results demonstrate that LLM-guided zero-shot feature selection achieves competitive performance with traditional FS methods while offering additional advantages in interpretability, stability, and reduced dependence on labeled data.<n>These findings position zero-shot LLM-based FS as a promising alternative strategy for effective and interpretable malware detection, paving the way for knowledge-guided feature selection in security-critical applications.
arXiv Detail & Related papers (2026-02-10T10:29:34Z) - Beyond Raw Detection Scores: Markov-Informed Calibration for Boosting Machine-Generated Text Detection [105.14032334647932]
Machine-generated texts (MGTs) pose risks such as disinformation and phishing, highlighting the need for reliable detection.<n> Metric-based methods, which extract statistically distinguishable features of MGTs, are often more practical than complex model-based methods that are prone to overfitting.<n>We propose a Markov-informed score calibration strategy that models two relationships of context detection scores that may aid calibration.
arXiv Detail & Related papers (2026-02-08T16:06:12Z) - A Methodology for Transparent Logic-Based Classification Using a Multi-Task Convolutional Tsetlin Machine [5.233478578871593]
The Tsetlin Machine (TM) is a novel machine learning paradigm that employs finite-state automata for learning and utilizes propositional logic to represent patterns.<n>The Convolutional TM has shown comparable performance on various datasets such as MNIST, K-MNIST, F-MNIST and CIFAR-2.<n>We show that the TM performs competitively to this deep learning model while maintaining its interpretability, even in large-scale complex training environments.
arXiv Detail & Related papers (2025-10-02T11:25:08Z) - SPaRFT: Self-Paced Reinforcement Fine-Tuning for Large Language Models [51.74498855100541]
Large language models (LLMs) have shown strong reasoning capabilities when fine-tuned with reinforcement learning (RL)<n>We propose textbfSPaRFT, a self-paced learning framework that enables efficient learning based on the capability of the model being trained.
arXiv Detail & Related papers (2025-08-07T03:50:48Z) - Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders [50.52694757593443]
Existing SAE training algorithms often lack rigorous mathematical guarantees and suffer from practical limitations.<n>We first propose a novel statistical framework for the feature recovery problem, which includes a new notion of feature identifiability.<n>We introduce a new SAE training algorithm based on bias adaptation'', a technique that adaptively adjusts neural network bias parameters to ensure appropriate activation sparsity.
arXiv Detail & Related papers (2025-06-16T20:58:05Z) - Cognitive-Mental-LLM: Evaluating Reasoning in Large Language Models for Mental Health Prediction via Online Text [0.0]
This study evaluates structured reasoning techniques-Chain-of-Thought (CoT), Self-Consistency (SC-CoT), and Tree-of-Thought (ToT)-to improve classification accuracy across multiple mental health datasets sourced from Reddit.<n>We analyze reasoning-driven prompting strategies, including Zero-shot CoT and Few-shot CoT, using key performance metrics such as Balanced Accuracy, F1 score, and Sensitivity/Specificity.<n>Our findings indicate that reasoning-enhanced techniques improve classification performance over direct prediction, particularly in complex cases.
arXiv Detail & Related papers (2025-03-13T06:42:37Z) - Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving [55.895917967408586]
Existing approaches to mathematical reasoning with large language models rely on Chain-of-Thought (CoT) for generalizability or Tool-Integrated Reasoning (TIR) for precise computation.<n>We propose TATA (Teaching LLMs According to Their Aptitude), an adaptive framework that enables LLMs to personalize their reasoning strategy spontaneously.
arXiv Detail & Related papers (2025-02-17T16:56:23Z) - DiTMoS: Delving into Diverse Tiny-Model Selection on Microcontrollers [34.282971510732736]
We introduce DiTMoS, a novel DNN training and inference framework with a selector-classifiers architecture.
A composition of weak models can exhibit high diversity and the union of them can significantly boost the accuracy upper bound.
We deploy DiTMoS on the Neucleo STM32F767ZI board and evaluate it based on three time-series datasets for human activity recognition, keywords spotting, and emotion recognition.
arXiv Detail & Related papers (2024-03-14T02:11:38Z) - Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need [84.3507610522086]
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones.
Recent pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL.
We argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring.
arXiv Detail & Related papers (2023-03-13T17:59:02Z) - Machine Learning Capability: A standardized metric using case difficulty
with applications to individualized deployment of supervised machine learning [2.2060666847121864]
Model evaluation is a critical component in supervised machine learning classification analyses.
Items Response Theory (IRT) and Computer Adaptive Testing (CAT) with machine learning can benchmark datasets independent of the end-classification results.
arXiv Detail & Related papers (2023-02-09T00:38:42Z) - Mutual-Information Based Few-Shot Classification [34.95314059362982]
We introduce Transductive Infomation Maximization (TIM) for few-shot learning.
Our method maximizes the mutual information between the query features and their label predictions for a given few-shot task.
We propose a new alternating-direction solver, which speeds up transductive inference over gradient-based optimization.
arXiv Detail & Related papers (2021-06-23T09:17:23Z) - Revisiting LSTM Networks for Semi-Supervised Text Classification via
Mixed Objective Function [106.69643619725652]
We develop a training strategy that allows even a simple BiLSTM model, when trained with cross-entropy loss, to achieve competitive results.
We report state-of-the-art results for text classification task on several benchmark datasets.
arXiv Detail & Related papers (2020-09-08T21:55:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.