BikNN: Anomaly Estimation in Bilateral Domains with k-Nearest Neighbors
- URL: http://arxiv.org/abs/2105.05037v1
- Date: Tue, 11 May 2021 13:45:29 GMT
- Title: BikNN: Anomaly Estimation in Bilateral Domains with k-Nearest Neighbors
- Authors: Zhongping Ji
- Abstract summary: A novel framework for anomaly estimation is proposed in this paper.
We attempt to estimate the degree of anomaly in both spatial and density domains.
Our method takes into account both the spatial domain and the density domain and can be adapted to different datasets by adjusting a few parameters manually.
- Score: 1.2183405753834562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, a novel framework for anomaly estimation is proposed. The
basic idea behind our method is to reduce the data into a two-dimensional space
and then rank each data point in the reduced space. We attempt to estimate the
degree of anomaly in both the spatial and density domains. Specifically, we
transform the data points into a density space and measure the distances in
the density domain between each point and its k-nearest neighbors in the
spatial domain. Then, an anomaly coordinate system is built by collecting two
unilateral anomalies from the k-nearest neighbors of each point. Furthermore, we
introduce two schemes to model their correlation and combine them to get the
final anomaly score. Experiments performed on synthetic and real-world
datasets demonstrate that the proposed method performs well and achieves the
highest average performance. We also show that the proposed method can provide
visualization and classification of the anomalies in a simple manner. Due to
the complexity of the anomaly, none of the existing methods can perform best on
all benchmark datasets. Our method takes into account both the spatial domain
and the density domain and can be adapted to different datasets by adjusting a
few parameters manually.
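The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes the spatial anomaly is the mean distance to the k nearest neighbors, the density is the inverse of that mean distance, the density anomaly is the mean absolute density gap to the spatial neighbors, and the two are combined with a Euclidean norm. The function name `bilateral_knn_scores` and all of these choices are hypothetical.

```python
import numpy as np


def bilateral_knn_scores(X, k=4):
    """Return (spatial, density, combined) anomaly scores for the rows of X.

    All scoring formulas here are illustrative assumptions, not taken
    from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # Brute-force pairwise Euclidean distances (fine for small n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor

    # Indices and distances of the k nearest neighbors in the spatial domain.
    nn_idx = np.argsort(dist, axis=1)[:, :k]
    nn_dist = np.take_along_axis(dist, nn_idx, axis=1)

    # Spatial-domain anomaly (assumed): mean distance to the k neighbors.
    spatial = nn_dist.mean(axis=1)

    # Density estimate (assumed): inverse of the mean neighbor distance.
    density = 1.0 / (spatial + 1e-12)

    # Density-domain anomaly (assumed): mean absolute density gap between
    # a point and its spatial neighbors.
    density_gap = np.abs(density[:, None] - density[nn_idx]).mean(axis=1)

    def rescale(v):
        # Min-max scaling to [0, 1]; a constant vector maps to zeros.
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v)

    s, d = rescale(spatial), rescale(density_gap)

    # Each point is now a coordinate (s, d) in a 2D "anomaly plane";
    # the combined score (assumed) is its distance from the origin.
    combined = np.sqrt(s ** 2 + d ** 2)
    return s, d, combined


# Usage: a regular 5x5 grid plus one far-away point appended last.
grid = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)
points = np.vstack([grid, [[10.0, 10.0]]])
s, d, combined = bilateral_knn_scores(points, k=4)
print(int(np.argmax(combined)))  # index of the point with the largest score
```

Plotting `s` against `d` gives a simple two-dimensional view of the points, in the spirit of the visualization the abstract mentions; `k` and the combination rule are the kind of parameters one would tune per dataset.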
Related papers
- Density based Spatial Clustering of Lines via Probabilistic Generation of Neighbourhood [0.0]
In this paper, we design a clustering algorithm that generates a customised neighbourhood for a line of a fixed volume.
This algorithm is not sensitive to the outliers and can effectively identify the noise in the data using a cardinality parameter.
One of the pivotal applications of this algorithm is clustering data points in $\mathbb{R}^n$ with missing entries.
arXiv Detail & Related papers (2024-10-03T08:17:11Z) - Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology
Classification and Anomaly Detection [57.85347204640585]
We develop a Universal Domain Adaptation method DeepAstroUDA.
It can be applied to datasets with different types of class overlap.
For the first time, we demonstrate the successful use of domain adaptation on two very different observational datasets.
arXiv Detail & Related papers (2022-11-01T18:07:21Z) - Overlap-guided Gaussian Mixture Models for Point Cloud Registration [61.250516170418784]
Probabilistic 3D point cloud registration methods have shown competitive performance in overcoming noise, outliers, and density variations.
This paper proposes a novel overlap-guided probabilistic registration approach that computes the optimal transformation from matched Gaussian Mixture Model (GMM) parameters.
arXiv Detail & Related papers (2022-10-17T08:02:33Z) - Unsupervised Manifold Alignment with Joint Multidimensional Scaling [4.683612295430957]
We introduce Joint Multidimensional Scaling, which maps datasets from two different domains to a common low-dimensional Euclidean space.
Our approach integrates Multidimensional Scaling (MDS) and Wasserstein Procrustes analysis into a joint optimization problem.
We demonstrate the effectiveness of our approach in several applications, including joint visualization of two datasets, unsupervised heterogeneous domain adaptation, graph matching, and protein structure alignment.
arXiv Detail & Related papers (2022-07-06T21:02:42Z) - Information Entropy Initialized Concrete Autoencoder for Optimal Sensor
Placement and Reconstruction of Geophysical Fields [58.720142291102135]
We propose a new approach to the optimal placement of sensors for reconstructing geophysical fields from sparse measurements.
We demonstrate our method on two examples: (a) temperature and (b) salinity fields around the Barents Sea and the Svalbard group of islands.
We find that the obtained optimal sensor locations have a clear physical interpretation and correspond to the boundaries between sea currents.
arXiv Detail & Related papers (2022-06-28T12:43:38Z) - Index $t$-SNE: Tracking Dynamics of High-Dimensional Datasets with
Coherent Embeddings [1.7188280334580195]
This paper presents a methodology to reuse an embedding to create a new one, where cluster positions are preserved.
The proposed algorithm has the same complexity as the original $t$-SNE to embed new items, and a lower one when considering the embedding of a dataset sliced into sub-pieces.
arXiv Detail & Related papers (2021-09-22T06:45:37Z) - Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic
Approach to Manifold Dimension Estimation [92.81218653234669]
We present a new approach to manifold hypothesis checking and underlying manifold dimension estimation.
Our geometrical method is a modification, for sparse data, of a well-known box-counting algorithm for Minkowski dimension calculation.
Experiments on real datasets show that the suggested approach, based on a combination of the two methods, is powerful and effective.
arXiv Detail & Related papers (2021-07-08T15:35:54Z) - Finding Geometric Models by Clustering in the Consensus Space [61.65661010039768]
We propose a new algorithm for finding an unknown number of geometric models, e.g., homographies.
We present a number of applications where the use of multiple geometric models improves accuracy.
These include pose estimation from multiple generalized homographies and trajectory estimation of fast-moving objects.
arXiv Detail & Related papers (2021-03-25T14:35:07Z) - Tensor Laplacian Regularized Low-Rank Representation for Non-uniformly
Distributed Data Subspace Clustering [2.578242050187029]
Low-Rank Representation (LRR) suffers from discarding the locality information of data points in subspace clustering.
We propose a hypergraph model to facilitate having a variable number of adjacent nodes and incorporating the locality information of the data.
Experiments on artificial and real datasets demonstrate the higher accuracy and precision of the proposed method.
arXiv Detail & Related papers (2021-03-06T08:22:24Z) - Manifold Learning via Manifold Deflation [105.7418091051558]
Dimensionality reduction methods provide a valuable means to visualize and interpret high-dimensional data.
Many popular methods can fail dramatically, even on simple two-dimensional manifolds.
This paper presents an embedding method for a novel, incremental tangent space estimator that incorporates global structure as coordinates.
Empirically, we show our algorithm recovers novel and interesting embeddings on real-world and synthetic datasets.
arXiv Detail & Related papers (2020-07-07T10:04:28Z) - Stochastic Sparse Subspace Clustering [20.30051592270384]
State-of-the-art subspace clustering methods are based on the self-expressive model, which represents each data point as a linear combination of other data points.
To address the issue of over-segmentation, we introduce dropout, which is based on randomly dropping out data points.
This leads to a scalable and flexible sparse subspace clustering approach, termed Stochastic Sparse Subspace Clustering.
arXiv Detail & Related papers (2020-05-04T13:09:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.