HLoOP -- Hyperbolic 2-space Local Outlier Probabilities
- URL: http://arxiv.org/abs/2312.03895v1
- Date: Wed, 6 Dec 2023 20:38:39 GMT
- Title: HLoOP -- Hyperbolic 2-space Local Outlier Probabilities
- Authors: Clémence Allietta, Jean-Philippe Condomines, Jean-Yves Tourneret,
Emmanuel Lochin
- Abstract summary: This paper introduces a simple framework to detect local outliers for datasets grounded in hyperbolic 2-spaces.
The developed HLoOP combines nearest-neighbor search and density-based outlier scoring with a probabilistic, statistically oriented approach.
The HLoOP algorithm is tested on the WordNet dataset yielding promising results.
- Score: 4.030910640265943
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Hyperbolic geometry has recently garnered considerable attention in machine
learning due to its capacity to embed hierarchical graph structures with low
distortions for further downstream processing. This paper introduces a simple
framework, referred to as HLoOP (Hyperbolic Local Outlier Probability), to detect
local outliers for datasets grounded in hyperbolic 2-space. Within a Euclidean
space, well-known techniques for local outlier detection are based on the Local
Outlier Factor (LOF) and its variant, the LoOP (Local Outlier Probability),
which incorporates probabilistic concepts to model the outlier level of a data
vector. The developed HLoOP combines nearest-neighbor search and density-based
outlier scoring with a probabilistic, statistically oriented approach. The
method therefore consists of computing the Riemannian distance
of a data point to its nearest neighbors following a Gaussian probability
density function expressed in a hyperbolic space. This is achieved by defining
a Gaussian cumulative distribution in this space. The HLoOP algorithm is tested
on the WordNet dataset yielding promising results. Code and data will be made
available on request for reproducibility.
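The recipe in the abstract (Riemannian nearest-neighbour distances in hyperbolic 2-space fed into a LoOP-style probabilistic score) can be sketched as follows. This is not the authors' code: the Poincaré-disk distance is a standard model of hyperbolic 2-space, the LoOP-style scoring follows the Euclidean LoOP construction, and the parameters `k` and `lam` are illustrative assumptions.

```python
import numpy as np
from math import erf, sqrt

def poincare_dist(u, v):
    """Riemannian distance between two points in the Poincare disk model."""
    sq = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq / denom)

def hloop_scores(X, k=5, lam=3.0):
    """LoOP-style outlier probabilities computed from hyperbolic distances.

    X: (n, 2) array of points strictly inside the unit disk (norm < 1).
    Returns scores in [0, 1]; higher means more outlying.
    """
    n = len(X)
    D = np.array([[poincare_dist(X[i], X[j]) for j in range(n)]
                  for i in range(n)])
    knn = np.argsort(D, axis=1)[:, 1:k + 1]  # k nearest neighbours, skip self
    # probabilistic set distance: lam * sqrt(E[d^2]) over the neighbourhood
    pdist = lam * np.sqrt(np.mean(D[np.arange(n)[:, None], knn] ** 2, axis=1))
    # probabilistic local outlier factor: ratio to the neighbours' pdist
    plof = pdist / np.mean(pdist[knn], axis=1) - 1.0
    nplof = lam * np.sqrt(np.mean(plof ** 2))  # normalisation constant
    # Gaussian error function maps the normalised PLOF to a probability
    return np.array([max(0.0, erf(p / (nplof * sqrt(2)))) for p in plof])
```

On a tight cluster near the origin plus one point near the disk boundary, the boundary point receives by far the largest score, while cluster points score near zero.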
Related papers
- Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that explores the local curvature at a sample to adaptively define the neighborhood size.
Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
arXiv Detail & Related papers (2024-09-08T13:08:45Z) - Learning conditional distributions on continuous spaces [0.0]
We investigate sample-based learning of conditional distributions on multi-dimensional unit boxes.
We employ two distinct clustering schemes: one based on a fixed-radius ball and the other on nearest neighbors.
We propose to incorporate the nearest neighbors method into neural network training, as our empirical analysis indicates it has better performance in practice.
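The nearest-neighbours scheme mentioned in this entry can be illustrated with a minimal sketch: the conditional distribution of $Y$ given $X \approx x$ is approximated by the empirical distribution of $Y$ over the $k$ nearest neighbours of $x$. The function name and parameters below are illustrative, not from the paper.

```python
import numpy as np

def knn_conditional_samples(X, Y, x, k=10):
    """Approximate samples from Y | X = x via the k nearest neighbours of x.

    X: (n, d) covariates, Y: (n,) responses, x: (d,) query point.
    Returns the k response values whose covariates are closest to x.
    """
    idx = np.argsort(np.linalg.norm(X - x, axis=1))[:k]
    return Y[idx]
```

Averaging the returned samples gives a simple estimate of the conditional mean $E[Y \mid X = x]$.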
arXiv Detail & Related papers (2024-06-13T17:53:47Z) - Beyond the Known: Adversarial Autoencoders in Novelty Detection [2.7486022583843233]
In novelty detection, the goal is to decide if a new data point should be categorized as an inlier or an outlier.
We use a similar framework but with a lightweight deep network, and we adopt a probabilistic score with reconstruction error.
Our results indicate that our approach is effective at learning the target class, and it outperforms recent state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2024-04-06T00:04:19Z) - Hyperspectral Target Detection Based on Low-Rank Background Subspace
Learning and Graph Laplacian Regularization [2.9626402880497267]
Hyperspectral target detection aims to find dim and small objects based on their spectral characteristics.
Existing representation-based methods are hindered by the problem of the unknown background dictionary.
This paper proposes an efficient optimizing approach based on low-rank representation (LRR) and graph Laplacian regularization (GLR).
arXiv Detail & Related papers (2023-06-01T13:51:08Z) - Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We give a polynomial-time algorithm that finds the RUM that best approximates the given distribution on average.
Our theoretical result can also be made practical: we obtain an algorithm that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z) - Combating Mode Collapse in GANs via Manifold Entropy Estimation [70.06639443446545]
Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications.
We propose a novel training pipeline to address the mode collapse issue of GANs.
arXiv Detail & Related papers (2022-08-25T12:33:31Z) - Featurized Density Ratio Estimation [82.40706152910292]
In our work, we propose to leverage an invertible generative model to map the two distributions into a common feature space prior to estimation.
This featurization brings the densities closer together in latent space, sidestepping pathological scenarios where the learned density ratios in input space can be arbitrarily inaccurate.
At the same time, the invertibility of our feature map guarantees that the ratios computed in feature space are equivalent to those in input space.
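The invariance claim can be checked numerically on a toy example: under an invertible map, the change-of-variables Jacobian appears in both pushforward densities and cancels in their ratio. The Gaussians and the affine map below are illustrative choices, not from the paper.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian with mean mu and std sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# input-space densities
p = lambda x: normal_pdf(x, 0.0, 1.0)
q = lambda x: normal_pdf(x, 1.0, 1.0)

# invertible feature map f(x) = 2x + 3 and the pushforward densities
f = lambda x: 2 * x + 3
p_feat = lambda y: normal_pdf(y, 3.0, 2.0)  # pushforward of p under f
q_feat = lambda y: normal_pdf(y, 5.0, 2.0)  # pushforward of q under f

x = 0.7
ratio_input = p(x) / q(x)              # density ratio in input space
ratio_feature = p_feat(f(x)) / q_feat(f(x))  # density ratio in feature space
```

The two ratios agree at every point, even though each individual density changes under the map; this is exactly the equivalence the entry refers to.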
arXiv Detail & Related papers (2021-07-05T18:30:26Z) - Tensor Laplacian Regularized Low-Rank Representation for Non-uniformly
Distributed Data Subspace Clustering [2.578242050187029]
Low-Rank Representation (LRR) suffers from discarding the locality information of data points in subspace clustering.
We propose a hypergraph model to facilitate having a variable number of adjacent nodes and incorporating the locality information of the data.
Experiments on artificial and real datasets demonstrate the higher accuracy and precision of the proposed method.
arXiv Detail & Related papers (2021-03-06T08:22:24Z) - Improving Generative Adversarial Networks with Local Coordinate Coding [150.24880482480455]
Generative adversarial networks (GANs) have shown remarkable success in generating realistic data from some predefined prior distribution.
In practice, semantic information might be represented by some latent distribution learned from data.
We propose an LCCGAN model with local coordinate coding (LCC) to improve the performance of generating data.
arXiv Detail & Related papers (2020-07-28T09:17:50Z) - Learning while Respecting Privacy and Robustness to Distributional
Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z) - Outlier Detection Using a Novel method: Quantum Clustering [24.11904406960212]
We propose a new assumption in outlier detection: normal data instances are commonly located in areas where there is hardly any fluctuation in data density.
We apply a novel density-based approach to unsupervised outlier detection.
arXiv Detail & Related papers (2020-06-08T17:19:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.