Semantic search for 100M+ galaxy images using AI-generated captions
- URL: http://arxiv.org/abs/2512.11982v1
- Date: Fri, 12 Dec 2025 19:06:14 GMT
- Title: Semantic search for 100M+ galaxy images using AI-generated captions
- Authors: Nolan Koblischke, Liam Parker, Francois Lanusse, Irina Espejo Morales, Jo Bovy, Shirley Ho
- Abstract summary: We develop a pipeline to create a semantic search engine from unlabeled image data. Our model, AION-Search, achieves state-of-the-art zero-shot performance on finding rare phenomena. For the first time, AION-Search enables flexible semantic search scalable to 140 million galaxy images.
- Score: 0.9982481289515875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Finding scientifically interesting phenomena through slow, manual labeling campaigns severely limits our ability to explore the billions of galaxy images produced by telescopes. In this work, we develop a pipeline to create a semantic search engine from completely unlabeled image data. Our method leverages Vision-Language Models (VLMs) to generate descriptions for galaxy images, then contrastively aligns a pre-trained multimodal astronomy foundation model with these embedded descriptions to produce searchable embeddings at scale. We find that current VLMs provide descriptions that are sufficiently informative to train a semantic search model that outperforms direct image similarity search. Our model, AION-Search, achieves state-of-the-art zero-shot performance on finding rare phenomena despite training on randomly selected images with no deliberate curation for rare cases. Furthermore, we introduce a VLM-based re-ranking method that nearly doubles the recall for our most challenging targets in the top-100 results. For the first time, AION-Search enables flexible semantic search scalable to 140 million galaxy images, enabling discovery from previously infeasible searches. More broadly, our work provides an approach for making large, unlabeled scientific image archives semantically searchable, expanding data exploration capabilities in fields from Earth observation to microscopy. The code, data, and app are publicly available at https://github.com/NolanKoblischke/AION-Search
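The abstract describes two steps that can be illustrated with a small sketch: contrastively aligning image embeddings with caption embeddings, then searching the aligned image embeddings with a text query. The code below is a minimal illustration under stated assumptions, not the AION-Search implementation. The symmetric InfoNCE (CLIP-style) loss, the temperature of 0.07, the cosine-similarity ranking, and the function names `info_nce_loss` and `top_k_search` are all assumptions chosen for illustration; the abstract only says the model is "contrastively aligned" with embedded descriptions.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def info_nce_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Row i of image_emb is assumed to be paired with row i of text_emb.
    The temperature value is an assumption, not taken from the paper.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature  # (batch, batch) similarity matrix

    def cross_entropy_on_diagonal(mat):
        # Numerically stable log-softmax over each row.
        shifted = mat - mat.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        # The matching pair sits on the diagonal.
        return -np.mean(np.diag(log_probs))

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))


def top_k_search(query_emb, image_index, k=5):
    """Return indices and cosine scores of the k best matches for one query."""
    scores = l2_normalize(image_index) @ l2_normalize(query_emb)
    order = np.argsort(-scores)[:k]
    return order, scores[order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = rng.normal(size=(50, 16))   # stand-in for image embeddings
    captions = images + 0.1 * rng.normal(size=(50, 16))  # stand-in captions
    print("loss on paired batch:", info_nce_loss(images[:8], captions[:8]))
    ids, sims = top_k_search(captions[3], images, k=5)
    print("top matches for caption 3:", ids)
```

At the scale mentioned in the abstract (140 million images), a brute-force matrix product like the one in `top_k_search` would likely be replaced by an approximate nearest-neighbor index; the sketch only shows the ranking logic.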
Related papers
- MedROV: Towards Real-Time Open-Vocabulary Detection Across Diverse Medical Imaging Modalities [89.81463562506637]
We introduce MedROV, the first Real-time Open Vocabulary detection model for medical imaging. By leveraging contrastive learning and cross-modal representations, MedROV effectively detects both known and novel structures.
arXiv Detail & Related papers (2025-11-25T18:59:53Z)
- Galaxy image simplification using Generative AI [0.0]
We introduce a new approach to galaxy image analysis that is based on generative AI. The method simplifies the galaxy images and automatically converts them into a "skeletonized" form. We demonstrate the method by applying it to galaxy images acquired by the DESI Legacy Survey.
arXiv Detail & Related papers (2025-07-15T19:48:09Z)
- Can AI Dream of Unseen Galaxies? Conditional Diffusion Model for Galaxy Morphology Augmentation [4.3933321767775135]
We propose a conditional diffusion model to synthesize realistic galaxy images for augmenting machine learning data. We show that our model generates diverse, high-fidelity galaxy images that closely adhere to the specified morphological feature conditions. This model enables generative extrapolation to project well-annotated data into unseen domains, advancing rare object detection.
arXiv Detail & Related papers (2025-06-19T11:44:09Z)
- OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence [51.0456395687016]
Multimodal large language models (MLLMs) have opened new frontiers in artificial intelligence. We propose an MLLM (OmniGeo) tailored to geospatial applications. By combining the strengths of natural language understanding and spatial reasoning, our model enhances the ability of instruction following and the accuracy of GeoAI systems.
arXiv Detail & Related papers (2025-03-20T16:45:48Z)
- Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities [88.398085358514]
Contrastive Deepfake Embeddings (CoDE) is a novel embedding space specifically designed for deepfake detection.
CoDE is trained via contrastive learning by additionally enforcing global-local similarities.
arXiv Detail & Related papers (2024-07-29T18:00:10Z)
- A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model [14.609681101463334]
We present a framework for general analysis of galaxy images based on a large vision model (LVM) plus downstream tasks (DST).
Considering the low signal-to-noise ratio of galaxy images, we have incorporated a Human-in-the-loop (HITL) module into our large vision model.
For object detection, trained on 1000 data points, our DST on top of the LVM achieves an accuracy of 96.7%, while ResNet50 plus Mask R-CNN gives an accuracy of 93.1%.
arXiv Detail & Related papers (2024-05-17T16:29:27Z)
- Astroformer: More Data Might not be all you need for Classification [0.0]
We introduce Astroformer, a method to learn from less data.
Our approach sets a new state-of-the-art on predicting galaxy morphologies from images on the Galaxy10 DECals dataset.
arXiv Detail & Related papers (2023-04-03T09:38:05Z)
- Self-supervised similarity search for large scientific datasets [0.0]
We present the use of self-supervised learning to explore and exploit large unlabeled datasets.
We first train a self-supervised model to distil low-dimensional representations that are robust to symmetries, uncertainties, and noise in each image.
We then use the representations to construct and publicly release an interactive semantic similarity search tool.
arXiv Detail & Related papers (2021-10-25T18:00:00Z)
- Large-Scale Unsupervised Object Discovery [80.60458324771571]
Existing approaches to unsupervised object discovery (UOD) do not scale up to large datasets without approximations that compromise their performance.
We propose a novel formulation of UOD as a ranking problem, amenable to the arsenal of distributed methods available for eigenvalue problems and link analysis.
arXiv Detail & Related papers (2021-06-12T00:29:49Z)
- Morphological classification of astronomical images with limited labelling [0.0]
We propose an effective semi-supervised approach for the galaxy morphology classification task, based on active learning of an adversarial autoencoder (AAE) model.
For a binary classification problem (the top-level question of the Galaxy Zoo 2 decision tree) we achieved an accuracy of 93.1% on the test part with only 0.86 million markup actions.
Our best model with additional markup achieves an accuracy of 95.5%.
arXiv Detail & Related papers (2021-04-27T19:26:27Z)
- Rapid Exploration for Open-World Navigation with Latent Goal Models [78.45339342966196]
We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments.
At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images.
We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration.
arXiv Detail & Related papers (2021-04-12T23:14:41Z)
- Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges [124.48654341780431]
We present a large-scale dataset of Object deTection in Aerial images (DOTA) and comprehensive baselines for ODAI.
The proposed DOTA dataset contains 1,793,658 object instances of 18 categories of oriented-bounding-box annotations collected from 11,268 aerial images.
We build baselines covering 10 state-of-the-art algorithms with over 70 configurations, where the speed and accuracy performances of each model have been evaluated.
arXiv Detail & Related papers (2021-02-24T11:20:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.