Related papers: Discovering Galaxy Features via Dataset Distillation

Discovering Galaxy Features via Dataset Distillation

URL: http://arxiv.org/abs/2311.17967v1
Date: Wed, 29 Nov 2023 12:39:31 GMT
Title: Discovering Galaxy Features via Dataset Distillation
Authors: Haowen Guan, Xuan Zhao, Zishi Wang, Zhiyang Li, and Julia Kempe
Abstract summary: In many applications, Neural Nets (NNs) have classification performance on par or even exceeding human capacity. Here, we apply this idea to the notoriously difficult task of galaxy classification. We present a novel way to summarize and visualize prototypical galaxy morphology through the lens of neural networks.
Score: 7.121183597915665
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: In many applications, Neural Nets (NNs) have classification performance on par or even exceeding human capacity. Moreover, it is likely that NNs leverage underlying features that might differ from those humans perceive to classify. Can we "reverse-engineer" pertinent features to enhance our scientific understanding? Here, we apply this idea to the notoriously difficult task of galaxy classification: NNs have reached high performance for this task, but what does a neural net (NN) "see" when it classifies galaxies? Are there morphological features that the human eye might overlook that could help with the task and provide new insights? Can we visualize tracers of early evolution, or additionally incorporated spectral data? We present a novel way to summarize and visualize galaxy morphology through the lens of neural networks, leveraging Dataset Distillation, a recent deep-learning methodology with the primary objective to distill knowledge from a large dataset and condense it into a compact synthetic dataset, such that a model trained on this synthetic dataset achieves performance comparable to a model trained on the full dataset. We curate a class-balanced, medium-size high-confidence version of the Galaxy Zoo 2 dataset, and proceed with dataset distillation from our accurate NN-classifier to create synthesized prototypical images of galaxy morphological features, demonstrating its effectiveness. Of independent interest, we introduce a self-adaptive version of the state-of-the-art Matching Trajectory algorithm to automate the distillation process, and show enhanced performance on computer vision benchmarks.

Related papers

Can AI Dream of Unseen Galaxies? Conditional Diffusion Model for Galaxy Morphology Augmentation [4.3933321767775135]
We propose a conditional diffusion model to synthesize realistic galaxy images for augmenting machine learning data.<n>We show that our model generates diverse, high-fidelity galaxy images closely adhere to the specified morphological feature conditions.<n>This model enables generative extrapolation to project well-annotated data into unseen domains and advancing rare object detection.
arXiv Detail & Related papers (2025-06-19T11:44:09Z)
Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $μ$P Parametrization [66.03821840425539]
In this paper, we investigate the training dynamics of $L$-layer neural networks using the tensor gradient program (SGD) framework. We show that SGD enables these networks to learn linearly independent features that substantially deviate from their initial values. This rich feature space captures relevant data information and ensures that any convergent point of the training process is a global minimum.
arXiv Detail & Related papers (2025-03-12T17:33:13Z)
Insights on Galaxy Evolution from Interpretable Sparse Feature Networks [0.0]
We present a novel neural network architecture called a Sparse Feature Network (SFNet) SFNets produce interpretable features that can be linearly combined in order to estimate galaxy properties like optical emission line ratios or gas-phase metallicity. We find that SFNets do not sacrifice accuracy in order to gain interpretability, and that they perform comparably well to cutting-edge models on astronomical machine learning tasks.
arXiv Detail & Related papers (2024-12-30T19:00:00Z)
A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model [14.609681101463334]
We present a framework for general analysis of galaxy images based on a large vision model (LVM) plus downstream tasks (DST) Considering the low signal-to-noise ratio of galaxy images, we have incorporated a Human-in-the-loop (HITL) module into our large vision model. For object detection, trained by 1000 data points, our DST upon the LVM achieves an accuracy of 96.7%, while ResNet50 plus Mask R-CNN gives an accuracy of 93.1%.
arXiv Detail & Related papers (2024-05-17T16:29:27Z)
Unveiling the Unseen: Identifiable Clusters in Trained Depthwise Convolutional Kernels [56.69755544814834]
Recent advances in depthwise-separable convolutional neural networks (DS-CNNs) have led to novel architectures. This paper reveals another striking property of DS-CNN architectures: discernible and explainable patterns emerge in their trained depthwise convolutional kernels in all layers.
arXiv Detail & Related papers (2024-01-25T19:05:53Z)
A Generative Self-Supervised Framework using Functional Connectivity in fMRI Data [15.211387244155725]
Deep neural networks trained on Functional Connectivity (FC) networks extracted from functional Magnetic Resonance Imaging (fMRI) data have gained popularity. Recent research on the application of Graph Neural Network (GNN) to FC suggests that exploiting the time-varying properties of the FC could significantly improve the accuracy and interpretability of the model prediction. High cost of acquiring high-quality fMRI data and corresponding labels poses a hurdle to their application in real-world settings. We propose a generative SSL approach that is tailored to effectively harnesstemporal information within dynamic FC.
arXiv Detail & Related papers (2023-12-04T16:14:43Z)
Domain Adaptive Graph Neural Networks for Constraining Cosmological Parameters Across Multiple Data Sets [40.19690479537335]
We show that DA-GNN achieves higher accuracy and robustness on cross-dataset tasks. This shows that DA-GNNs are a promising method for extracting domain-independent cosmological information.
arXiv Detail & Related papers (2023-11-02T20:40:21Z)
Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection [57.85347204640585]
We develop a Universal Domain Adaptation method DeepAstroUDA. It can be applied to datasets with different types of class overlap. For the first time, we demonstrate the successful use of domain adaptation on two very different observational datasets.
arXiv Detail & Related papers (2022-11-01T18:07:21Z)
Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points. The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains. We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
Simplifying approach to Node Classification in Graph Neural Networks [7.057970273958933]
We decouple the node feature aggregation step and depth of graph neural network, and empirically analyze how different aggregated features play a role in prediction performance. We show that not all features generated via aggregation steps are useful, and often using these less informative features can be detrimental to the performance of the GNN model. We present a simple and shallow model, Feature Selection Graph Neural Network (FSGNN), and show empirically that the proposed model achieves comparable or even higher accuracy than state-of-the-art GNN models.
arXiv Detail & Related papers (2021-11-12T14:53:22Z)
PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context. We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
Evolutionary Architecture Search for Graph Neural Networks [23.691915813153496]
We propose a novel AutoML framework through the evolution of individual models in a large Graph Neural Networks (GNN) architecture space. To the best of our knowledge, this is the first work to introduce and evaluate evolutionary architecture search for GNN models.
arXiv Detail & Related papers (2020-09-21T22:11:53Z)
A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference. Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.