Generation of Consistent Sets of Multi-Label Classification Rules with a
Multi-Objective Evolutionary Algorithm
- URL: http://arxiv.org/abs/2003.12526v1
- Date: Fri, 27 Mar 2020 16:43:10 GMT
- Title: Generation of Consistent Sets of Multi-Label Classification Rules with a
Multi-Objective Evolutionary Algorithm
- Authors: Thiago Zafalon Miranda, Diorge Brognara Sardinha, Márcio Porto
Basgalupp, Yaochu Jin, Ricardo Cerri
- Abstract summary: We propose a multi-objective evolutionary algorithm that generates multiple rule-based multi-label classification models.
Our algorithm generates models based on sets (unordered collections) of rules, increasing interpretability.
Also, by employing a conflict avoidance algorithm during rule creation, every rule within a given model is guaranteed to be consistent with every other rule in the same model.
- Score: 11.25469393912791
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-label classification consists of assigning an instance to two or
more classes simultaneously. It is a very challenging task that arises in many
real-world applications, such as the classification of biological, image, video,
audio, and text data. Recently, the interest in interpretable classification models
has grown, partially as a consequence of regulations such as the General Data
Protection Regulation. In this context, we propose a multi-objective
evolutionary algorithm that generates multiple rule-based multi-label
classification models, allowing users to choose among models that offer
different compromises between predictive power and interpretability. An
important contribution of this work is that, unlike most algorithms,
which usually generate models based on lists (ordered collections) of rules,
our algorithm generates models based on sets (unordered collections) of rules,
increasing interpretability. Also, by employing a conflict avoidance algorithm
during rule creation, every rule within a given model is guaranteed to be
consistent with every other rule in the same model. Thus, no conflict
resolution strategy is required, which yields simpler models. We conducted
experiments on synthetic and real-world datasets, compared our results with
state-of-the-art algorithms in terms of predictive performance (F-Score) and
interpretability (model size), and show that our best models achieved
comparable F-Scores with smaller model sizes.
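To make the two central ideas of the abstract concrete, the sketch below is a minimal, hypothetical Python illustration, not the authors' implementation; all names (Rule, consistent, can_add, pareto_front) and the exact notion of rule conflict are assumptions. It shows (i) how a set of multi-label rules can be kept conflict-free by only adding rules that are consistent with every rule already in the model, and (ii) how non-dominated trade-offs between F-Score and model size can be selected.

```python
# Minimal, hypothetical sketch of two ideas from the abstract (not the authors' code):
# (i) a SET of multi-label rules kept conflict-free at construction time, and
# (ii) selection of non-dominated trade-offs between F-Score and model size.
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    conditions: tuple  # antecedent, e.g. (("color", "red"), ("shape", "round"))
    labels: tuple      # multi-label consequent, e.g. (("is_fruit", True), ("is_toy", False))

def antecedents_overlap(r1: Rule, r2: Rule) -> bool:
    """True if some instance could satisfy both antecedents."""
    c1, c2 = dict(r1.conditions), dict(r2.conditions)
    return all(c1[f] == c2[f] for f in set(c1) & set(c2))

def consistent(r1: Rule, r2: Rule) -> bool:
    """Two rules are consistent if they can never fire on the same instance,
    or if they agree on every label they both predict."""
    if not antecedents_overlap(r1, r2):
        return True
    l1, l2 = dict(r1.labels), dict(r2.labels)
    return all(l1[k] == l2[k] for k in set(l1) & set(l2))

def can_add(candidate: Rule, model: list) -> bool:
    """Conflict avoidance during rule creation: a candidate enters the model
    only if it is consistent with every rule already in it."""
    return all(consistent(candidate, r) for r in model)

def pareto_front(models):
    """models: iterable of (f_score, size). Keep points not dominated by any
    other point (higher F-Score is better, smaller size is better)."""
    models = list(models)
    return [
        (f, s)
        for f, s in models
        if not any(f2 >= f and s2 <= s and (f2, s2) != (f, s) for f2, s2 in models)
    ]

if __name__ == "__main__":
    r1 = Rule((("color", "red"),), (("is_fruit", True),))
    r2 = Rule((("color", "red"), ("shape", "long")), (("is_fruit", False),))
    print(can_add(r2, [r1]))  # False: both rules can fire on a red, long instance
    print(pareto_front([(0.80, 12), (0.78, 5), (0.70, 20)]))  # [(0.8, 12), (0.78, 5)]
```

Because consistency is enforced when rules are created, an unordered rule set needs no tie-breaking order or conflict resolution at prediction time, which is the interpretability argument made in the abstract.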
Related papers
- Ensemble Methods for Sequence Classification with Hidden Markov Models [8.241486511994202]
We present a lightweight approach to sequence classification using Ensemble Methods for Hidden Markov Models (HMMs).
HMMs offer significant advantages in scenarios with imbalanced or smaller datasets due to their simplicity, interpretability, and efficiency.
Our ensemble-based scoring method enables the comparison of sequences of any length and improves performance on imbalanced datasets.
arXiv Detail & Related papers (2024-09-11T20:59:32Z)
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
We propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
- Random Models for Fuzzy Clustering Similarity Measures [0.0]
The Adjusted Rand Index (ARI) is a widely used method for comparing hard clusterings.
We propose a single framework for computing the ARI with three random models that are intuitive and explainable for both hard and fuzzy clusterings; a minimal hard-clustering ARI example is sketched after this list.
arXiv Detail & Related papers (2023-12-16T00:07:04Z)
- Statistical Comparisons of Classifiers by Generalized Stochastic Dominance [0.0]
There is still no consensus on how to compare classifiers over multiple data sets with respect to several criteria.
In this paper, we add a fresh view to the vivid debate by adopting recent developments in decision theory.
We show that our framework ranks classifiers by a generalized concept of dominance, which powerfully circumvents the cumbersome, and often even self-contradictory, reliance on aggregates.
arXiv Detail & Related papers (2022-09-05T09:28:15Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- SOAR: Simultaneous Or of And Rules for Classification of Positive & Negative Classes [0.0]
We present a novel and complete taxonomy of classifications that clearly captures and quantifies the inherent ambiguity in noisy binary classifications in the real world.
We show that this approach leads to a more granular formulation of the likelihood model and a simulated-annealing based optimization achieves classification performance competitive with comparable techniques.
arXiv Detail & Related papers (2020-08-25T20:00:27Z)
- Rewriting a Deep Generative Model [56.91974064348137]
We introduce a new problem setting: manipulation of specific rules encoded by a deep generative model.
We propose a formulation in which the desired rule is changed by manipulating a layer of a deep network as a linear associative memory.
We present a user interface to enable users to interactively change the rules of a generative model to achieve desired effects.
arXiv Detail & Related papers (2020-07-30T17:58:16Z)
- Diverse Rule Sets [20.170305081348328]
Rule-based systems are experiencing a renaissance owing to their intuitive if-then representation.
We propose a novel approach of inferring diverse rule sets, by optimizing small overlap among decision rules.
We then devise an efficient randomized algorithm, which samples rules that are highly discriminative and have small overlap.
arXiv Detail & Related papers (2020-06-17T14:15:25Z)
- Explainable Matrix -- Visualization for Global and Local Interpretability of Random Forest Classification Ensembles [78.6363825307044]
We propose Explainable Matrix (ExMatrix), a novel visualization method for Random Forest (RF) interpretability.
It employs a simple yet powerful matrix-like visual metaphor, where rows are rules, columns are features, and cells are rule predicates.
ExMatrix applicability is confirmed via different examples, showing how it can be used in practice to promote the interpretability of RF models.
arXiv Detail & Related papers (2020-05-08T21:03:48Z)
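As referenced in the fuzzy-clustering entry above, the Adjusted Rand Index is the quantity that paper generalizes. Its fuzzy random models are not reproduced here; the sketch below (assuming scikit-learn is available) only computes the standard hard-clustering ARI under the usual permutation model, as a baseline for what the paper extends.

```python
# Minimal baseline sketch: hard-clustering Adjusted Rand Index (ARI) under the usual
# permutation random model, via scikit-learn. The fuzzy-clustering random models
# proposed in the paper above are not reproduced here.
from sklearn.metrics import adjusted_rand_score

labels_a = [0, 0, 1, 1, 2, 2]   # one clustering of six items
labels_b = [0, 0, 1, 2, 2, 2]   # another clustering of the same items

# 1.0 means identical partitions; values near 0.0 are what random labelings achieve.
print(adjusted_rand_score(labels_a, labels_b))
```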