AstroLLaMA: Towards Specialized Foundation Models in Astronomy
- URL: http://arxiv.org/abs/2309.06126v1
- Date: Tue, 12 Sep 2023 11:02:27 GMT
- Title: AstroLLaMA: Towards Specialized Foundation Models in Astronomy
- Authors: Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciuc\u{a}, Charlie O'Neill,
Ze-Chang Sun, Maja Jab{\l}o\'nska, Sandor Kruk, Ernest Perkowski, Jack
Miller, Jason Li, Josh Peek, Kartheik Iyer, Tomasz R\'o\.za\'nski, Pranav
Khetarpal, Sharaf Zaman, David Brodrick, Sergio J. Rodr\'iguez M\'endez,
Thang Bui, Alyssa Goodman, Alberto Accomazzi, Jill Naiman, Jesse Cranney,
Kevin Schawinski, UniverseTBD
- Abstract summary: We introduce AstroLLaMA, a 7-billion- parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv.
Our model generates more insightful and scientifically relevant text completions and embedding extraction than state-of-the-arts foundation models.
Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
- Score: 1.1694367694169385
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models excel in many human-language tasks but often falter in
highly specialized domains like scholarly astronomy. To bridge this gap, we
introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using
over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal
language modeling, AstroLLaMA achieves a 30% lower perplexity than Llama-2,
showing marked domain adaptation. Our model generates more insightful and
scientifically relevant text completions and embedding extraction than
state-of-the-arts foundation models despite having significantly fewer
parameters. AstroLLaMA serves as a robust, domain-specific model with broad
fine-tuning potential. Its public release aims to spur astronomy-focused
research, including automatic paper summarization and conversational agent
development.
Related papers
- AstroMLab 2: AstroLLaMA-2-70B Model and Benchmarking Specialised LLMs for Astronomy [4.729846733874557]
This study aims to quantitatively assess specialized LLMs in astronomy.
We find that the previously released AstroLLaMA series, based on LLaMA-2-7B, underperforms compared to the base model.
Despite the observed catastrophic forgetting in smaller models, our results indicate that continual pretraining on the 70B model can yield significant improvements.
arXiv Detail & Related papers (2024-09-29T16:02:22Z) - Prompting Encoder Models for Zero-Shot Classification: A Cross-Domain Study in Italian [75.94354349994576]
This paper explores the feasibility of employing smaller, domain-specific encoder LMs alongside prompting techniques to enhance performance in specialized contexts.
Our study concentrates on the Italian bureaucratic and legal language, experimenting with both general-purpose and further pre-trained encoder-only models.
The results indicate that while further pre-trained models may show diminished robustness in general knowledge, they exhibit superior adaptability for domain-specific tasks, even in a zero-shot setting.
arXiv Detail & Related papers (2024-07-30T08:50:16Z) - AstroMLab 1: Who Wins Astronomy Jeopardy!? [4.162245706139047]
This dataset comprises 4,425 multiple-choice questions curated from the Annual Review of Astronomy and Astrophysics.
Claude-3.5-Sonnet outperforms competitors by up to 4.6 percentage points, achieving 85.0% accuracy.
Open-weights models have rapidly improved, with LLaMA-3-70b (80.6%) and Qwen-2-72b (77.7%) now competing with some of the best proprietary models.
arXiv Detail & Related papers (2024-07-15T19:28:14Z) - At First Sight: Zero-Shot Classification of Astronomical Images with Large Multimodal Models [0.0]
Vision-Language multimodal Models (VLMs) offer the possibility for zero-shot classification in astronomy.
We investigate two models, GPT-4o and LLaVA-NeXT, for zero-shot classification of low-surface brightness galaxies and artifacts.
We show that with natural language prompts these models achieved significant accuracy (above 80 percent typically) without additional training/fine tuning.
arXiv Detail & Related papers (2024-06-24T18:17:54Z) - SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models [70.01883340129204]
spatial reasoning is a crucial component of both biological and artificial intelligence.
We present a comprehensive study of the capability of current state-of-the-art large language models (LLMs) on spatial reasoning.
arXiv Detail & Related papers (2024-06-07T01:06:34Z) - Astro-NER -- Astronomy Named Entity Recognition: Is GPT a Good Domain Expert Annotator? [0.0]
We experiment with an approach using predictions from a fine-tuned LLM model to aid non-domain experts in annotating scientific entities within astronomy literature.
Our results reveal moderate agreement between a domain expert and the LLM-assisted non-experts, as well as fair agreement between the domain expert and the LLM model's predictions.
The resultant dataset, containing 5,000 annotated astronomy article titles, is made publicly available.
arXiv Detail & Related papers (2024-05-04T08:04:39Z) - Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models [93.92762966380793]
Large language models (LLMs) strive to achieve high performance across all three domains simultaneously.
In this paper, we propose to fuse models that are already highly-specialized directly.
The proposed fusing framework, UltraFuser, consists of three distinct specialists that are already sufficiently trained on language, coding, and mathematics.
arXiv Detail & Related papers (2024-03-13T06:18:48Z) - AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse
Datasets [7.53209156977206]
We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training.
We achieve notable improvements in specialized topic comprehension using a curated set of astronomy corpora.
We present an extension of AstroLLaMA: the fine-tuning of the 7B LLaMA model on a domain-specific conversational dataset, culminating in the release of the chat-enabled AstroLLaMA for community use.
arXiv Detail & Related papers (2024-01-03T04:47:02Z) - SeaLLMs -- Large Language Models for Southeast Asia [76.50157503379086]
We introduce SeaLLMs, an innovative series of language models that specifically focuses on Southeast Asian (SEA) languages.
SeaLLMs are built upon the Llama-2 model and further advanced through continued pre-training with an extended vocabulary, specialized instruction and alignment tuning.
Our comprehensive evaluation demonstrates that SeaLLM-13b models exhibit superior performance across a wide spectrum of linguistic tasks and assistant-style instruction-following capabilities.
arXiv Detail & Related papers (2023-12-01T17:17:56Z) - Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning [52.29522018586365]
We study structured pruning as an effective means to develop smaller LLMs from pre-trained, larger models.
Our approach employs two key techniques: (1) targeted structured pruning, which prunes a larger model to a specified target shape by removing layers, heads, and intermediate and hidden dimensions in an end-to-end manner, and (2) dynamic batch loading, which dynamically updates the composition of sampled data in each training batch based on varying losses across different domains.
arXiv Detail & Related papers (2023-10-10T15:13:30Z) - Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey [100.24095818099522]
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP)
They provide a highly useful, task-agnostic foundation for a wide range of applications.
However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles.
arXiv Detail & Related papers (2023-05-30T03:00:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.