QCBA: Improving Rule Classifiers Learned from Quantitative Data by
Recovering Information Lost by Discretisation
- URL: http://arxiv.org/abs/1711.10166v3
- Date: Fri, 2 Jun 2023 13:31:59 GMT
- Title: QCBA: Improving Rule Classifiers Learned from Quantitative Data by
Recovering Information Lost by Discretisation
- Authors: Tomas Kliegr, Ebroul Izquierdo
- Abstract summary: This paper describes new rule tuning steps that aim to recover information lost in discretisation, together with new pruning techniques.
The proposed QCBA method was initially developed to postprocess quantitative attributes in models generated by the Classification based on associations (CBA) algorithm.
Benchmarks on 22 datasets from the UCI repository show smaller size and the overall best predictive performance for FOIL2+QCBA compared to all seven baselines.
- Score: 5.667821885065119
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A prediscretisation of numerical attributes, which some rule
learning algorithms require, is a source of inefficiencies. This paper describes
new rule tuning steps that aim to recover information lost in the discretisation
and new pruning techniques that may further reduce the size of rule models and
improve their accuracy. The proposed QCBA method was initially developed to
postprocess quantitative attributes in models generated by the Classification
based on associations (CBA) algorithm, but it can also be applied to the
results of other rule learning approaches. We demonstrate its effectiveness by
postprocessing models generated by five association rule classification
algorithms (CBA, CMAR, CPAR, IDS, SBRL) and two first-order logic rule learners
(FOIL2 and PRM). Benchmarks on 22 datasets from the UCI repository show smaller
size and the overall best predictive performance for FOIL2+QCBA compared to all
seven baselines. Postoptimised CBA models have a better predictive performance
compared to the state-of-the-art rule learner CORELS in this benchmark. The
article contains an ablation study for the individual postprocessing steps and
a scalability analysis on the KDD'99 Anomaly detection dataset.
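To make the tuning idea concrete, below is a minimal sketch of one QCBA-style step, refitting: a rule's discretised interval is tightened to the narrowest interval that still covers the same training instances. The `Rule` structure and `refit` helper are hypothetical illustrations, not the authors' implementation.

```python
# Illustrative sketch of a QCBA-style "refit" tuning step: shrink a rule's
# discretised interval to the actual min/max of the numeric values it covers.
# The Rule structure and data layout are hypothetical, not the QCBA codebase.
from dataclasses import dataclass

@dataclass
class Rule:
    attribute: str   # numeric attribute tested by the rule's condition
    lower: float     # interval lower bound (inclusive)
    upper: float     # interval upper bound (inclusive)
    label: str       # predicted class

def refit(rule: Rule, rows: list[dict]) -> Rule:
    """Tighten the interval to the narrowest range covering the same rows."""
    covered = [r[rule.attribute] for r in rows
               if rule.lower <= r[rule.attribute] <= rule.upper]
    if not covered:
        return rule
    return Rule(rule.attribute, min(covered), max(covered), rule.label)

rows = [{"age": 34}, {"age": 41}, {"age": 62}]
rule = Rule("age", 30.0, 50.0, "yes")   # boundaries inherited from discretisation
print(refit(rule, rows))                # interval tightens to [34, 41]
```

After refitting, literal boundaries come from the data itself rather than from discretisation cut points, which is one way information lost by prediscretisation can be recovered.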
Related papers
- Rule-based Data Selection for Large Language Models [9.886837013587124]
The quality of training data significantly impacts the performance of large language models (LLMs).
A growing number of studies use LLMs to rate and select data based on several human-crafted metrics (rules).
These conventional rule-based approaches often depend too heavily on human heuristics, lack effective metrics for assessing rules, and exhibit limited adaptability to new tasks.
arXiv Detail & Related papers (2024-10-07T03:13:06Z)
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
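As a rough numeric illustration of the DPO update applied to one step-level preference pair (of the kind MCTS look-ahead produces), here is a minimal sketch; the log-probabilities and `beta` are toy values, not outputs of a real LLM or the paper's code.

```python
# Minimal numeric sketch of the DPO objective on a step-level preference
# pair; all log-probabilities are toy values, not real model outputs.
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one (preferred, rejected) pair of reasoning steps."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))

# Minimising this loss pushes the policy's implicit reward for the
# preferred step above that of the rejected one.
print(dpo_loss(logp_w=-1.2, logp_l=-1.5, ref_logp_w=-1.4, ref_logp_l=-1.3))
```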
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
- Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
Inspired by the principles of subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that SQC-Score is more preferred by human annotators than the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z)
- A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation [121.0693322732454]
Contrastive Language-Image Pretraining (CLIP) has gained popularity for its remarkable zero-shot capacity.
Recent research has focused on developing efficient fine-tuning methods to enhance CLIP's performance in downstream tasks.
We revisit a classical algorithm, Gaussian Discriminant Analysis (GDA), and apply it to the downstream classification of CLIP.
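A minimal sketch of the training-free GDA idea follows, with random vectors standing in for real CLIP image features; `gda_fit` and the epsilon shrinkage are illustrative assumptions, not the paper's implementation.

```python
# Sketch of Gaussian Discriminant Analysis with a shared covariance:
# class means plus one pooled covariance yield a closed-form linear
# classifier, with no gradient-based fine-tuning. Random features stand
# in for CLIP embeddings here.
import numpy as np

def gda_fit(feats, labels, n_classes, eps=1e-3):
    d = feats.shape[1]
    mus = np.stack([feats[labels == c].mean(axis=0) for c in range(n_classes)])
    centered = feats - mus[labels]
    sigma = centered.T @ centered / len(feats) + eps * np.eye(d)  # pooled covariance
    prec = np.linalg.inv(sigma)
    W = mus @ prec                                    # per-class linear weights
    b = -0.5 * np.einsum("cd,cd->c", W, mus)          # -0.5 * mu_c^T Sigma^-1 mu_c
    b += np.log(np.bincount(labels, minlength=n_classes) / len(labels))
    return W, b

rng = np.random.default_rng(0)
feats = rng.normal(size=(60, 8))
labels = rng.integers(0, 3, size=60)
W, b = gda_fit(feats, labels, n_classes=3)
print((feats @ W.T + b).argmax(axis=1)[:10])          # predicted classes
```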
arXiv Detail & Related papers (2024-02-06T15:45:27Z)
- Sparse Attention-Based Neural Networks for Code Classification [15.296053323327312]
We introduce an approach named the Sparse Attention-based neural network for Code Classification (SACC).
In the first step, source code undergoes syntax parsing and preprocessing.
The encoded sequences of subtrees are fed into a Transformer model that incorporates sparse attention mechanisms for the purpose of classification.
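To illustrate the sparse attention ingredient, here is a toy sliding-window attention in NumPy; the window pattern and dimensions are assumptions for illustration, not SACC's actual architecture.

```python
# Toy sliding-window (sparse) attention: each position attends only to
# neighbours within `window`, so cost grows linearly in sequence length.
import numpy as np

def local_attention(q, k, v, window=2):
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    idx = np.arange(n)
    scores[np.abs(idx[:, None] - idx[None, :]) > window] = -np.inf  # mask far pairs
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)                   # row softmax
    return weights @ v

rng = np.random.default_rng(1)
x = rng.normal(size=(6, 4))       # stand-in for an encoded subtree sequence
print(local_attention(x, x, x, window=1).shape)  # (6, 4)
```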
arXiv Detail & Related papers (2023-11-11T14:07:12Z)
- Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
When updated with new class data, they suffer from catastrophic forgetting: the model cannot clearly distinguish old class data from the new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z)
- Few-Shot Incremental Learning with Continually Evolved Classifiers [46.278573301326276]
Few-shot class-incremental learning (FSCIL) aims to design machine learning algorithms that can continually learn new concepts from a few data points.
The difficulty is that limited data from new classes not only leads to significant overfitting but also exacerbates the notorious catastrophic forgetting problem.
We propose a Continually Evolved Classifier (CEC) that employs a graph model to propagate context information between classifiers for adaptation.
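A toy rendering of the propagation idea appears below: one round of normalised neighbour averaging over per-session classifier weight vectors. CEC learns this graph-based adaptation; the fixed adjacency here is only a stand-in.

```python
# Toy context propagation between classifier weight vectors over a graph:
# one round of row-normalised neighbour averaging. The adjacency matrix
# is fixed and illustrative; CEC learns the adaptation.
import numpy as np

def propagate(W, A):
    """W: (n, d) classifier weights; A: (n, n) adjacency with self-loops."""
    A = A / A.sum(axis=1, keepdims=True)  # row-normalise
    return A @ W                          # each classifier absorbs neighbour context

W = np.eye(3)                             # three toy classifier weight vectors
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=float)
print(propagate(W, A))
```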
arXiv Detail & Related papers (2021-04-07T10:54:51Z)
- Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion [59.549664231655726]
A case-based reasoning (CBR) system solves a new problem by retrieving 'cases' that are similar to the given problem.
In this paper, we demonstrate that such a system is achievable for reasoning in knowledge bases (KBs).
Our approach predicts attributes for an entity by gathering reasoning paths from similar entities in the KB.
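A toy sketch of that loop: to fill in a missing (entity, relation) fact, reuse a relation path that already explains the same relation for another entity. The mini-KB, the entity names, and the 2-hop path limit are all illustrative assumptions.

```python
# Case-based reasoning over a tiny KB: answer (entity, relation) by
# replaying a relation path that works for a similar entity. All facts
# here are toy examples.
KB = {  # (head, relation) -> tail
    ("bill", "works_at"): "microsoft",
    ("microsoft", "hq_in"): "seattle",
    ("bill", "city"): "seattle",
    ("mark", "works_at"): "facebook",
    ("facebook", "hq_in"): "menlo_park",
}

def follow(entity, path):
    for rel in path:
        entity = KB.get((entity, rel))
        if entity is None:
            return None
    return entity

def predict(entity, relation):
    # 1. Retrieve cases: entities for which `relation` is already known.
    for (case, r), answer in KB.items():
        if r != relation or case == entity:
            continue
        # 2. Find a 2-hop relation path that reproduces the case's answer...
        for (h1, r1), mid in KB.items():
            if h1 != case:
                continue
            for (h2, r2), tail in KB.items():
                if h2 == mid and tail == answer:
                    # 3. ...and replay that path from the query entity.
                    guess = follow(entity, [r1, r2])
                    if guess is not None:
                        return guess

print(predict("mark", "city"))  # -> "menlo_park", via works_at -> hq_in
```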
arXiv Detail & Related papers (2020-10-07T17:48:12Z)
- Fast OSCAR and OWL Regression via Safe Screening Rules [97.28167655721766]
Ordered Weighted $L_1$ (OWL) regularized regression is a new regression analysis method for high-dimensional sparse learning.
Proximal gradient methods are used as standard approaches to solve OWL regression.
We propose the first safe screening rule for OWL regression by exploring the order of the primal solution with the unknown order structure.
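For reference, the OWL penalty itself is simple to state: a nonincreasing weight vector is paired with the coefficients sorted by absolute magnitude, so the largest coefficients pay the largest weights. A small sketch with OSCAR-style weights (values illustrative):

```python
# Ordered Weighted L1 (OWL) penalty: Omega_w(beta) = sum_i w[i] * |beta|_(i),
# with |beta| sorted in descending order and w nonincreasing. OSCAR is the
# special case w_i = lam1 + lam2 * (p - i).
import numpy as np

def owl_penalty(beta, w):
    return float(np.sort(np.abs(beta))[::-1] @ w)

p = 5
lam1, lam2 = 0.1, 0.05
w = lam1 + lam2 * np.arange(p - 1, -1, -1)    # nonincreasing OSCAR weights
beta = np.array([0.5, -2.0, 0.0, 1.0, -0.2])
print(owl_penalty(beta, w))                   # 0.98
```

Safe screening exploits this ordering structure to discard coordinates that are guaranteed to be zero at the optimum before solving.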
arXiv Detail & Related papers (2020-06-29T23:35:53Z)
- Fase-AL: Adaptation of Fast Adaptive Stacking of Ensembles for Supporting Active Learning [0.0]
This work presents the FASE-AL algorithm, which induces classification models with non-labeled instances using Active Learning.
The algorithm achieves promising results in terms of the percentage of correctly classified instances.
arXiv Detail & Related papers (2020-01-30T17:25:47Z)