Classification of HI Galaxy Profiles Using Unsupervised Learning and Convolutional Neural Networks: A Comparative Analysis and Methodological Cases of Studies
- URL: http://arxiv.org/abs/2501.11657v1
- Date: Mon, 20 Jan 2025 18:39:11 GMT
- Title: Classification of HI Galaxy Profiles Using Unsupervised Learning and Convolutional Neural Networks: A Comparative Analysis and Methodological Cases of Studies
- Authors: Gabriel Jaimes-Illanes, Manuel Parra-Royon, Laura Darriba-Pol, Javier Moldón, Amidou Sorgho, Susana Sánchez-Expósito, Julián Garrido-Sánchez, Lourdes Verdes-Montenegro,
- Abstract summary: Hydrogen, the most abundant element in the universe, is crucial for understanding galaxy formation and evolution.
With new radio instruments, the volume and complexity of data is increasing.
This work presents a framework that integrates Machine Learning techniques, combining unsupervised methods and CNNs.
- Score: 0.0
- License:
- Abstract: Hydrogen, the most abundant element in the universe, is crucial for understanding galaxy formation and evolution. The 21 cm neutral atomic hydrogen - HI spectral line maps the gas kinematics within galaxies, providing key insights into interactions, galactic structure, and star formation processes. With new radio instruments, the volume and complexity of data is increasing. To analyze and classify integrated HI spectral profiles in a efficient way, this work presents a framework that integrates Machine Learning techniques, combining unsupervised methods and CNNs. To this end, we apply our framework to a selected subsample of 318 spectral HI profiles of the CIG and 30.780 profiles from the Arecibo Legacy Fast ALFA Survey catalogue. Data pre-processing involved the Busyfit package and iterative fitting with polynomial, Gaussian, and double-Lorentzian models. Clustering methods, including K-means, spectral clustering, DBSCAN, and agglomerative clustering, were used for feature extraction and to bootstrap classification we applied K-NN, SVM, and Random Forest classifiers, optimizing accuracy with CNN. Additionally, we introduced a 2D model of the profiles to enhance classification by adding dimensionality to the data. Three 2D models were generated based on transformations and normalised versions to quantify the level of asymmetry. These methods were tested in a previous analytical classification study conducted by the Analysis of the Interstellar Medium in Isolated Galaxies group. This approach enhances classification accuracy and aims to establish a methodology that could be applied to data analysis in future surveys conducted with the Square Kilometre Array (SKA), currently under construction. All materials, code, and models have been made publicly available in an open-access repository, adhering to FAIR principles.
Related papers
- A Comprehensive Survey on Spectral Clustering with Graph Structure Learning [10.579153358536372]
Spectral clustering is a powerful technique for clustering high-dimensional data.
We explore various graph clustering techniques, including pairwise, anchor, and hypergraph-based methods.
We discuss multi-view clustering frameworks, examining their applications within one-step and two-step clustering processes.
arXiv Detail & Related papers (2025-01-23T12:06:32Z) - Self-Supervised Graph Embedding Clustering [70.36328717683297]
K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z) - ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining [104.34751911174196]
We build a large-scale dataset of 3DGS using ShapeNet and ModelNet datasets.
Our dataset ShapeSplat consists of 65K objects from 87 unique categories.
We introduce textbftextitGaussian-MAE, which highlights the unique benefits of representation learning from Gaussian parameters.
arXiv Detail & Related papers (2024-08-20T14:49:14Z) - Superpixel Graph Contrastive Clustering with Semantic-Invariant
Augmentations for Hyperspectral Images [64.72242126879503]
Hyperspectral images (HSI) clustering is an important but challenging task.
We first use 3-D and 2-D hybrid convolutional neural networks to extract the high-order spatial and spectral features of HSI.
We then design a superpixel graph contrastive clustering model to learn discriminative superpixel representations.
arXiv Detail & Related papers (2024-03-04T07:40:55Z) - Multiview Subspace Clustering of Hyperspectral Images based on Graph
Convolutional Networks [12.275530282665578]
This study proposes a multiview subspace clustering of hy-perspectral images (HSI) based on graph convolutional networks.
The model was evaluated on three popular HSI datasets, including Indian Pines, Pavia University, and Houston.
It achieved overall accuracies of 92.38%, 93.43%, and 83.82%, respectively, and significantly outperformed the state-of-the-art clustering methods.
arXiv Detail & Related papers (2024-03-03T10:19:18Z) - Contrastive Multi-view Subspace Clustering of Hyperspectral Images based
on Graph Convolutional Networks [14.978666092012856]
Subspace clustering is an effective approach for clustering hyperspectral images.
In this study, contrastive multi-view subspace clustering of HSI was proposed based on graph convolutional networks.
The proposed model effectively improves the clustering accuracy of HSI.
arXiv Detail & Related papers (2023-12-11T02:22:10Z) - Fast conformational clustering of extensive molecular dynamics
simulation data [19.444636864515726]
We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long trajectories.
We combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (HDBSCAN)
With the help of four test systems we illustrate the capability and performance of this clustering workflow.
arXiv Detail & Related papers (2023-01-11T14:36:43Z) - Unified Multi-View Orthonormal Non-Negative Graph Based Clustering
Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
arXiv Detail & Related papers (2022-11-03T08:18:27Z) - Unsupervised Machine Learning for Exploratory Data Analysis of Exoplanet
Transmission Spectra [68.8204255655161]
We focus on unsupervised techniques for analyzing spectral data from transiting exoplanets.
We show that there is a high degree of correlation in the spectral data, which calls for appropriate low-dimensional representations.
We uncover interesting structures in the principal component basis, namely, well-defined branches corresponding to different chemical regimes.
arXiv Detail & Related papers (2022-01-07T22:26:33Z) - Detection of extragalactic Ultra-Compact Dwarfs and Globular Clusters
using Explainable AI techniques [1.3764085113103222]
Compact stellar systems such as Ultra-compact dwarfs (UCDs) and Globular Clusters (GCs) around galaxies are known to be the tracers of the merger events that have been forming these galaxies.
Here, we train a machine learning model to separate these objects from the foreground stars and background galaxies using the multi-wavelength imaging data of the Fornax galaxy cluster in 6 filters.
We are able to identify UCDs/GCs with a precision and a recall of >93 percent and provide relevances that reflect the importance of each feature dimension %(colors and angular sizes)
arXiv Detail & Related papers (2022-01-05T13:37:55Z) - A Systematic Approach to Featurization for Cancer Drug Sensitivity
Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.