Identifying outliers in astronomical images with unsupervised machine
learning
- URL: http://arxiv.org/abs/2205.09760v1
- Date: Thu, 19 May 2022 09:58:48 GMT
- Authors: Yang Han and Zhiqiang Zou and Nan Li and Yanli Chen
- Abstract summary: Unpredictable astronomical outliers constantly lead to the discovery of genuinely unforeseen knowledge in astronomy.
It is a severe challenge to mine rare and unexpected targets from enormous data with human inspection.
We adopt unsupervised machine learning approaches to identify outliers in the data of galaxy images.
- Score: 4.469071901315176
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Astronomical outliers, such as unusual, rare or unknown types of astronomical
objects or phenomena, constantly lead to the discovery of genuinely unforeseen
knowledge in astronomy. In principle, more such unpredictable outliers will be
uncovered as the coverage and quality of upcoming survey data increase.
However, it is a severe challenge to mine rare and unexpected targets from
enormous data by human inspection, because the workload is prohibitive.
Supervised learning is also unsuitable for this purpose since designing proper
training sets for unanticipated signals is unworkable. Motivated by these
challenges, we adopt unsupervised machine learning approaches to identify
outliers in the data of galaxy images to explore the paths for detecting
astronomical outliers. For comparison, we construct three methods, built
respectively upon k-nearest neighbors (KNN), Convolutional Auto-Encoder (CAE) +
KNN, and CAE + KNN + Attention Mechanism (attCAE KNN). Testing sets
are created based on the Galaxy Zoo image data published online to evaluate the
performance of the above methods. Results show that attCAE KNN achieves the
best recall (78%), which is 53% higher than the classical KNN method and 22%
higher than CAE+KNN. The efficiency of attCAE KNN (10 minutes) is also superior
to KNN (4 hours) and equal to CAE+KNN (10 minutes) for accomplishing the same
task. Thus, we believe it is feasible to detect astronomical outliers in the
data of galaxy images in an unsupervised manner. Next, we will apply attCAE KNN
to available survey datasets to assess its applicability and reliability.
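
The abstract does not spell out how the KNN stage scores outliers, so the following is only a minimal sketch of one common choice: score each sample by its mean distance to its k nearest neighbours, flag the highest scores, and measure recall against known outliers. The function names (knn_outlier_scores, recall_at_n), the synthetic 8-dimensional features, and k = 5 are all assumptions made for illustration; in the paper's setting the feature rows would presumably be flattened images or CAE latent codes.

```python
import numpy as np


def knn_outlier_scores(features: np.ndarray, k: int = 5) -> np.ndarray:
    """Score each row by its mean Euclidean distance to its k nearest neighbours."""
    if not 1 <= k < len(features):
        raise ValueError("k must satisfy 1 <= k < number of samples")
    # Brute-force pairwise distances via broadcasting (small samples only).
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    nearest = np.sort(dist, axis=1)[:, :k]
    return nearest.mean(axis=1)


def recall_at_n(scores: np.ndarray, is_outlier: np.ndarray, n: int) -> float:
    """Fraction of true outliers found among the n highest-scoring samples."""
    flagged = np.argsort(scores)[::-1][:n]
    return float(is_outlier[flagged].sum() / is_outlier.sum())


# Synthetic stand-in for image features (hypothetical data, not Galaxy Zoo):
# 200 ordinary samples near the origin plus 5 isolated samples far away.
rng = np.random.default_rng(0)
inliers = rng.normal(0.0, 1.0, size=(200, 8))
outliers = 30.0 * np.eye(8)[:5]  # each outlier lies along a different axis
features = np.vstack([inliers, outliers])
labels = np.zeros(len(features), dtype=bool)
labels[len(inliers):] = True

scores = knn_outlier_scores(features, k=5)
print("recall among top 5 scores:", recall_at_n(scores, labels, n=5))
```

The brute-force distance matrix grows quadratically with the number of samples, which is consistent with the abstract's observation that KNN on raw images is slow; compressing images to a low-dimensional code before the neighbour search is one way to reduce that cost.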
Related papers
- Information Modified K-Nearest Neighbor [4.916646834691489]
K-Nearest Neighbors (KNN) classifies a sample by majority vote among its nearest neighbors.
Many KNN methodologies introduce complex algorithms that do not significantly outperform the traditional KNN.
We present a proposed information-modified KNN (IMKNN) to improve the performance of the KNN algorithm.
We conduct experiments on 12 widely-used datasets, achieving 11.05%, 12.42%, and 12.07% in accuracy, precision, and recall performance, respectively.
arXiv Detail & Related papers (2023-12-04T16:10:34Z)
- Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology
Classification and Anomaly Detection [57.85347204640585]
We develop a Universal Domain Adaptation method, DeepAstroUDA.
It can be applied to datasets with different types of class overlap.
For the first time, we demonstrate the successful use of domain adaptation on two very different observational datasets.
arXiv Detail & Related papers (2022-11-01T18:07:21Z)
- ODNet: A Convolutional Neural Network for Asteroid Occultation Detection [0.36700088931938835]
We propose to build an algorithm that will use a Convolutional Neural Network (CNN) and observations from the Unistellar network to reliably detect asteroid occultations.
The algorithm is sufficiently fast and robust that we can envision incorporating it onboard the eVscopes to deliver real-time results.
arXiv Detail & Related papers (2022-10-28T23:53:09Z)
- Systematic biases when using deep neural networks for annotating large
catalogs of astronomical images [0.0]
We show that for basic classification of elliptical and spiral galaxies, the sky location of the galaxies used for training affects the behavior of the algorithm.
That bias exhibits itself in the form of cosmological-scale anisotropy in the distribution of basic galaxy morphology.
arXiv Detail & Related papers (2022-01-10T01:51:14Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distance between the test image and top-k neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- Shift-Robust GNNs: Overcoming the Limitations of Localized Graph
Training data [52.771780951404565]
Shift-Robust GNN (SR-GNN) is designed to account for distributional differences between biased training data and the graph's true inference distribution.
We show that SR-GNN outperforms other GNN baselines by accuracy, eliminating at least (40%) of the negative effects introduced by biased training data.
arXiv Detail & Related papers (2021-08-02T18:00:38Z)
- KNN-enhanced Deep Learning Against Noisy Labels [4.765948508271371]
Supervised learning on Deep Neural Networks (DNNs) is data hungry.
In this work, we propose to apply deep KNN for label cleanup.
We iteratively train the neural network and update labels to simultaneously proceed towards higher label recovery rate and better classification performance.
arXiv Detail & Related papers (2020-12-08T05:21:29Z)
- DeepShadows: Separating Low Surface Brightness Galaxies from Artifacts
using Deep Learning [70.80563014913676]
We investigate the use of convolutional neural networks (CNNs) for the problem of separating low-surface-brightness galaxies from artifacts in survey images.
We show that CNNs offer a very promising path in the quest to study the low-surface-brightness universe.
arXiv Detail & Related papers (2020-11-24T22:51:08Z)
- Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged
Fraudsters [78.53851936180348]
We introduce two types of camouflages based on recent empirical studies, i.e., the feature camouflage and the relation camouflage.
Existing GNNs have not addressed these two camouflages, which results in their poor performance in fraud detection problems.
We propose a new model named CAmouflage-REsistant GNN (CARE-GNN) to enhance the GNN aggregation process with three unique modules against camouflages.
arXiv Detail & Related papers (2020-08-19T22:33:12Z)
- Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z)
- Advanced kNN: A Mature Machine Learning Series [2.7729939137633877]
k-nearest neighbour (kNN) is one of the most prominent, simple and basic algorithms used in machine learning and data mining.
The purpose of this paper is to suggest an Advanced kNN (A-kNN) algorithm that will be able to classify an instance as unknown.
arXiv Detail & Related papers (2020-03-01T06:11:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.