AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse
Datasets
- URL: http://arxiv.org/abs/2401.01916v2
- Date: Fri, 5 Jan 2024 07:46:32 GMT
- Title: AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse
Datasets
- Authors: Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor
Kruk, Tong Zhang, Charlie O'Neill, Maja Jablonska, Zechang Sun, Michael J.
Smith, Huiling Liu, Kevin Schawinski, Kartheik Iyer, Ioana Ciucă for
UniverseTBD
- Abstract summary: We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training.
We achieve notable improvements in specialized topic comprehension using a curated set of astronomy corpora.
We present an extension of AstroLLaMA: the fine-tuning of the 7B LLaMA model on a domain-specific conversational dataset, culminating in the release of the chat-enabled AstroLLaMA for community use.
- Score: 7.53209156977206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We explore the potential of enhancing LLM performance in astronomy-focused
question-answering through targeted, continual pre-training. By employing a
compact 7B-parameter LLaMA-2 model and focusing exclusively on a curated set of
astronomy corpora -- comprising abstracts, introductions, and conclusions -- we
achieve notable improvements in specialized topic comprehension. While general
LLMs like GPT-4 excel in broader question-answering scenarios due to superior
reasoning capabilities, our findings suggest that continual pre-training with
limited resources can still enhance model performance on specialized topics.
Additionally, we present an extension of AstroLLaMA: the fine-tuning of the 7B
LLaMA model on a domain-specific conversational dataset, culminating in the
release of the chat-enabled AstroLLaMA for community use. Comprehensive
quantitative benchmarking is currently in progress and will be detailed in an
upcoming full paper. The model, AstroLLaMA-Chat, is now available at
https://huggingface.co/universeTBD, providing the first open-source
conversational AI tool tailored for the astronomy community.
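For readers who want to try the released model, a minimal inference sketch with the Hugging Face transformers library is shown below; the exact repository id under the universeTBD organization is an assumption and should be checked against the link above.

```python
# Minimal inference sketch for AstroLLaMA-Chat with Hugging Face transformers.
# NOTE: "universeTBD/astrollama-chat" is an assumed repository id derived from
# the organization page above; check https://huggingface.co/universeTBD for
# the exact released name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "universeTBD/astrollama-chat"  # hypothetical repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 7B model fits on a single ~16 GB GPU in fp16
    device_map="auto",
)

prompt = "What observational signatures distinguish quiescent from star-forming galaxies?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```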
Related papers
- AstroMLab 2: AstroLLaMA-2-70B Model and Benchmarking Specialised LLMs for Astronomy [4.729846733874557]
This study aims to quantitatively assess specialized LLMs in astronomy.
We find that the previously released AstroLLaMA series, based on LLaMA-2-7B, underperforms compared to the base model.
Despite the observed catastrophic forgetting in smaller models, our results indicate that continual pretraining on the 70B model can yield significant improvements.
arXiv Detail & Related papers (2024-09-29T16:02:22Z)
- TopoChat: Enhancing Topological Materials Retrieval With Large Language Model and Multi-Source Knowledge [4.654635844923322]
Large language models (LLMs) have demonstrated impressive performance in text generation tasks.
We develop a specialized dialogue system for topological materials called TopoChat.
TopoChat exhibits superior performance in structural and property querying, material recommendation, and complex relational reasoning.
arXiv Detail & Related papers (2024-09-10T06:01:16Z)
- Fine-tuning LLMs for Autonomous Spacecraft Control: A Case Study Using Kerbal Space Program [42.87968485876435]
This study explores the use of fine-tuned Large Language Models (LLMs) for autonomous spacecraft control.
We demonstrate how these models can effectively control spacecraft using language-based inputs and outputs.
arXiv Detail & Related papers (2024-08-16T11:43:31Z)
- At First Sight: Zero-Shot Classification of Astronomical Images with Large Multimodal Models [0.0]
Vision-language multimodal models (VLMs) offer the possibility of zero-shot classification in astronomy.
We investigate two models, GPT-4o and LLaVA-NeXT, for zero-shot classification of low-surface brightness galaxies and artifacts.
We show that, with natural language prompts, these models achieve high accuracy (typically above 80 percent) without additional training or fine-tuning.
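As an illustration of the zero-shot setup (not code from that paper), classification reduces to a single prompted call to a hosted vision-language model; the prompt wording, labels, and file name below are invented for the sketch.

```python
# Illustrative sketch: zero-shot classification of a galaxy cutout with a
# hosted vision-language model via the OpenAI Python SDK. The prompt, labels,
# and file name are invented for this example, not taken from the paper.
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

with open("cutout.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Classify this image as 'low-surface-brightness galaxy' "
                     "or 'artifact'. Answer with the label only."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```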
arXiv Detail & Related papers (2024-06-24T18:17:54Z)
- SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models [70.01883340129204]
Spatial reasoning is a crucial component of both biological and artificial intelligence.
We present a comprehensive study of the capability of current state-of-the-art large language models (LLMs) on spatial reasoning.
arXiv Detail & Related papers (2024-06-07T01:06:34Z)
- MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions [58.57255822646756]
This paper introduces MathChat, a benchmark designed to evaluate large language models (LLMs) across a broader spectrum of mathematical tasks.
We evaluate the performance of various SOTA LLMs on the MathChat benchmark, and we observe that while these models excel in single-turn question answering, they significantly underperform in more complex scenarios.
We develop MathChat sync, a synthetic, dialogue-based math dataset for LLM fine-tuning, focusing on improving models' interaction and instruction-following capabilities in conversations.
arXiv Detail & Related papers (2024-05-29T18:45:55Z)
- Weak-to-Strong Extrapolation Expedites Alignment [135.12769233630362]
We propose a method called ExPO to boost models' alignment with human preference.
We demonstrate that ExPO consistently improves off-the-shelf DPO/RLHF models.
We shed light on the essence of ExPO: amplifying the reward signal learned during alignment training.
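Assuming the method amounts to linear extrapolation in weight space from a pre-alignment (SFT) checkpoint through its aligned counterpart, a minimal sketch looks like the following; model ids and the extrapolation strength are placeholders.

```python
# Conceptual sketch of weak-to-strong (ExPO-style) extrapolation in weight
# space: step past the aligned checkpoint along the SFT -> aligned direction.
# Model ids and alpha are placeholders; the paper's exact recipe may differ.
import torch
from transformers import AutoModelForCausalLM

sft = AutoModelForCausalLM.from_pretrained("org/model-sft")       # placeholder id
aligned = AutoModelForCausalLM.from_pretrained("org/model-dpo")   # placeholder id

alpha = 0.3  # extrapolation strength beyond the aligned model
sft_state = sft.state_dict()
expo_state = {}
for name, w_aligned in aligned.state_dict().items():
    w_sft = sft_state[name]
    if not torch.is_floating_point(w_aligned):
        expo_state[name] = w_aligned          # leave non-float buffers untouched
        continue
    # theta_expo = theta_aligned + alpha * (theta_aligned - theta_sft)
    expo_state[name] = w_aligned + alpha * (w_aligned - w_sft)

aligned.load_state_dict(expo_state)
aligned.save_pretrained("model-expo")         # placeholder output path
```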
arXiv Detail & Related papers (2024-04-25T17:39:50Z)
- PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs [49.32067576992511]
Large language models often fall short of the performance achieved by domain-specific state-of-the-art models.
One potential approach to enhance domain-specific capabilities of LLMs involves fine-tuning them using corresponding datasets.
We propose Preference Adaptation for Enhancing Domain-specific Abilities of LLMs (PANDA).
Our experimental results reveal that PANDA significantly enhances the domain-specific ability of LLMs on text classification and interactive decision tasks.
arXiv Detail & Related papers (2024-02-20T09:02:55Z)
- LLaMA Pro: Progressive LLaMA with Block Expansion [66.39213657252279]
We propose a new post-pretraining method for Large Language Models (LLMs) with an expansion of Transformer blocks.
We tune the expanded blocks using only the new corpus, efficiently and effectively improving the model's knowledge without catastrophic forgetting.
In this paper, we experiment on a corpus of code and math, yielding LLaMA Pro-8.3B, a versatile foundation model initialized from LLaMA2-7B.
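The idea, roughly, is that newly inserted blocks start as no-ops (their output projections are zero, so residual connections pass activations through unchanged) and only these new blocks are trained on the new corpus. A rough sketch against the Hugging Face LLaMA implementation, whose layer attribute names are assumed here, is given below; it illustrates the concept rather than the authors' code.

```python
# Rough sketch of block expansion on a Hugging Face LLaMA model: duplicate
# every k-th decoder layer, zero the copy's output projections so it is a
# no-op at initialization, freeze the originals, and train only the copies.
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
k = 4  # insert one new block after every 4 original blocks

expanded = torch.nn.ModuleList()
for i, layer in enumerate(model.model.layers):
    layer.requires_grad_(False)              # freeze original blocks
    expanded.append(layer)
    if (i + 1) % k == 0:
        new_layer = copy.deepcopy(layer)
        # zero the projections that write into the residual stream, so the
        # copied block initially contributes nothing (identity via residuals)
        new_layer.self_attn.o_proj.weight.data.zero_()
        new_layer.mlp.down_proj.weight.data.zero_()
        new_layer.requires_grad_(True)       # only new blocks are trained
        expanded.append(new_layer)

model.model.layers = expanded
model.config.num_hidden_layers = len(expanded)
# NOTE: cache-related layer indices may need renumbering for generation in
# newer transformers versions; this sketch targets training only.
```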
arXiv Detail & Related papers (2024-01-04T18:59:12Z)
- AstroLLaMA: Towards Specialized Foundation Models in Astronomy [1.1694367694169385]
We introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv.
Our model produces more insightful and scientifically relevant text completions and embeddings than state-of-the-art foundation models.
Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
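By way of example, abstract-level embeddings can be extracted from such a fine-tuned causal LM by mean-pooling its final hidden states; the repository id below is assumed from the organization page and may differ from the released name.

```python
# Sketch: extracting an abstract-level embedding from AstroLLaMA by
# mean-pooling the last hidden states of the causal LM. The repository id
# is assumed; verify it on https://huggingface.co/universeTBD.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "universeTBD/astrollama"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

abstract = "We measure the stellar mass function of quiescent galaxies at z ~ 2 ..."
inputs = tokenizer(abstract, return_tensors="pt").to(model.device)
with torch.no_grad():
    hidden = model(**inputs, output_hidden_states=True).hidden_states[-1]
embedding = hidden.mean(dim=1).squeeze(0)   # mean-pool over tokens -> (d_model,)
print(embedding.shape)
```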
arXiv Detail & Related papers (2023-09-12T11:02:27Z)
- Enhancing Chat Language Models by Scaling High-quality Instructional Conversations [91.98516412612739]
We first provide a systematically designed, diverse, informative, large-scale dataset of instructional conversations, UltraChat.
Our objective is to capture the breadth of interactions that a human might have with an AI assistant.
We fine-tune a LLaMA model to create a powerful conversational model, UltraLLaMA.
arXiv Detail & Related papers (2023-05-23T16:49:14Z)