Outlier detection using flexible categorisation and interrogative
agendas
- URL: http://arxiv.org/abs/2312.12010v2
- Date: Wed, 20 Dec 2023 10:51:52 GMT
- Title: Outlier detection using flexible categorisation and interrogative
agendas
- Authors: Marcel Boersma, Krishna Manoorkar, Alessandra Palmigiano, Mattia
Panettiere, Apostolos Tzimoulis, Nachoem Wijnberg
- Abstract summary: Different ways to categorize a given set of objects exist, which depend on the choice of the sets of features used to classify them.
We first develop a simple unsupervised FCA-based algorithm for outlier detection which uses categorizations arising from different agendas.
We then present a supervised meta-learning algorithm to learn suitable agendas for categorization as sets of features with different weights or masses.
- Score: 42.321011564731585
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Categorization is one of the basic tasks in machine learning and data
analysis. Building on formal concept analysis (FCA), the starting point of the
present work is that different ways to categorize a given set of objects exist,
which depend on the choice of the sets of features used to classify them, and
different such sets of features may yield better or worse categorizations,
relative to the task at hand. In turn, the (a priori) choice of a
particular set of features over another might be subjective and express a
certain epistemic stance (e.g. interests, relevance, preferences) of an agent
or a group of agents, namely, their interrogative agenda. In the present paper,
we represent interrogative agendas as sets of features, and explore and compare
different ways to categorize objects w.r.t. different sets of features
(agendas). We first develop a simple unsupervised FCA-based algorithm for
outlier detection which uses categorizations arising from different agendas. We
then present a supervised meta-learning algorithm to learn suitable (fuzzy)
agendas for categorization as sets of features with different weights or
masses. We combine this meta-learning algorithm with the unsupervised outlier
detection algorithm to obtain a supervised outlier detection algorithm. We show
that these algorithms perform on par with commonly used outlier detection
algorithms on commonly used outlier detection datasets. These algorithms
provide both local and global explanations of their results.
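The idea summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a formal context stored as a mapping from objects to feature sets, treats each agenda as a set of features, and scores an object by how small its closure is (the set of objects sharing all of its features within the agenda). The names `closure`, `outlier_scores`, and `masses`, the scoring formula, and the toy data are all hypothetical choices made for illustration.

```python
def closure(context, obj, agenda):
    """Objects that share every agenda feature possessed by `obj`."""
    shared = context[obj] & agenda
    return {o for o, feats in context.items() if shared <= feats}


def outlier_scores(context, agendas, masses=None):
    """Mass-weighted outlier score per object (assumed formula, not the paper's).

    An object whose closure is small under an agenda is rare with respect to
    that agenda, so it contributes a score close to 1 for that agenda.
    """
    if masses is None:
        # Assumption: equal mass for every agenda when none is given.
        masses = [1.0 / len(agendas)] * len(agendas)
    n = len(context)
    scores = {}
    for obj in context:
        score = 0.0
        for agenda, mass in zip(agendas, masses):
            size = len(closure(context, obj, agenda))
            score += mass * (1.0 - size / n)
        scores[obj] = score
    return scores


# Toy formal context (hypothetical data): object -> set of features.
context = {
    "a": {"small", "red"},
    "b": {"small", "red"},
    "c": {"small", "red"},
    "d": {"large", "blue"},
}
# Two agendas: one asks about size, the other about colour.
agendas = [{"small", "large"}, {"red", "blue"}]

scores = outlier_scores(context, agendas)
most_outlying = max(scores, key=scores.get)
```

In this toy context, object `d` shares no feature with the others under either agenda, so it receives the highest score. Passing unequal `masses` mimics the weighted (fuzzy) agendas mentioned in the abstract, although how the paper learns those weights is not reproduced here.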
Related papers
- A Rapid Review of Clustering Algorithms [5.46715422237599]
Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data.
They play an important role in today's life, such as in marketing and e-commerce, healthcare, data organization and analysis, and social media.
We analyze existing clustering algorithms and classify mainstream algorithms across five different dimensions.
arXiv Detail & Related papers (2024-01-14T23:19:53Z)
- Class-Specific Variational Auto-Encoder for Content-Based Image Retrieval [95.42181254494287]
We propose a regularized loss for Variational Auto-Encoders (VAEs) forcing the model to focus on a given class of interest.
As a result, the model learns to discriminate the data belonging to the class of interest from any other possibility.
Experimental results show that the proposed method outperforms its competition in both in-domain and out-of-domain retrieval problems.
arXiv Detail & Related papers (2023-04-23T19:51:25Z)
- A Meta-Learning Algorithm for Interrogative Agendas [3.0969191504482247]
We focus on formal concept analysis (FCA), a standard knowledge representation formalism, to express interrogative agendas.
Several FCA-based algorithms have already been in use for standard machine learning tasks such as classification and outlier detection.
In this paper, we propose a meta-learning algorithm to construct a good interrogative agenda explaining the data.
arXiv Detail & Related papers (2023-01-04T22:09:36Z)
- Anomaly Detection using Ensemble Classification and Evidence Theory [62.997667081978825]
We present a novel approach for novelty detection using ensemble classification and evidence theory.
A pool selection strategy is presented to build a solid ensemble classifier.
We use uncertainty for the anomaly detection approach.
arXiv Detail & Related papers (2022-12-23T00:50:41Z)
- Out-of-Category Document Identification Using Target-Category Names as Weak Supervision [64.671654559798]
Out-of-category detection aims to distinguish documents according to their semantic relevance to the inlier (or target) categories.
We present an out-of-category detection framework, which effectively measures how confidently each document belongs to one of the target categories.
arXiv Detail & Related papers (2021-11-24T21:01:25Z)
- A Framework for Multi-View Classification of Features [6.660458629649826]
In data classification problems, typical approaches fail when the feature set is too large.
In this research, an innovative framework for multi-view ensemble classification, inspired by the problem of object recognition in the multiple views theory of humans, is proposed.
arXiv Detail & Related papers (2021-08-02T16:27:43Z)
- A review of systematic selection of clustering algorithms and their evaluation [0.0]
This paper aims to identify a systematic selection logic for clustering algorithms and corresponding validation concepts.
The goal is to enable potential users to choose an algorithm that best fits their needs and the properties of their underlying data clustering problem.
arXiv Detail & Related papers (2021-06-24T07:01:46Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first asymptotically precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
- Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)