Improving astroBERT using Semantic Textual Similarity
- URL: http://arxiv.org/abs/2212.00744v1
- Date: Tue, 29 Nov 2022 16:15:32 GMT
- Authors: Felix Grezes, Thomas Allen, Sergi Blanco-Cuaresma, Alberto Accomazzi,
Michael J. Kurtz, Golnaz Shapurian, Edwin Henneken, Carolyn S. Grant, Donna
M. Thompson, Timothy W. Hostetler, Matthew R. Templeton, Kelly E. Lockhart,
Shinyi Chen, Jennifer Koch, Taylor Jacovich, and Pavlos Protopapas
- Abstract summary: We introduce astroBERT, a machine learning language model tailored to the text used in astronomy papers in NASA's Astrophysics Data System (ADS).
We show how astroBERT improves over existing public language models on astrophysics specific tasks.
We detail how ADS plans to harness the unique structure of scientific papers, the citation graph and citation context to further improve astroBERT.
- Score: 0.785116730789274
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The NASA Astrophysics Data System (ADS) is an essential tool for researchers
that allows them to explore the astronomy and astrophysics scientific
literature, but it has yet to exploit recent advances in natural language
processing. At ADASS 2021, we introduced astroBERT, a machine learning language
model tailored to the text used in astronomy papers in ADS. In this work we:
- announce the first public release of the astroBERT language model;
- show how astroBERT improves over existing public language models on
astrophysics specific tasks;
- and detail how ADS plans to harness the unique structure of scientific
papers, the citation graph and citation context, to further improve astroBERT.
Related papers
- Delving into the Utilisation of ChatGPT in Scientific Publications in Astronomy [0.0]
We show that ChatGPT, when generating academic text, uses certain words more often than humans do, and we search a total of 1 million articles for them.
We identify a list of words favoured by ChatGPT and find a statistically significant increase for these words against a control group in 2024.
These results suggest a widespread adoption of these models in the writing of astronomy papers.
arXiv Detail & Related papers (2024-06-25T07:15:10Z) - SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning [60.14510984576027]
SciGLM is a suite of scientific language models able to conduct college-level scientific reasoning.
We apply a self-reflective instruction annotation framework to generate step-by-step reasoning for unlabelled scientific questions.
We fine-tuned the ChatGLM family of language models with SciInstruct, enhancing their scientific and mathematical reasoning capabilities.
arXiv Detail & Related papers (2024-01-15T20:22:21Z) - AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets [7.53209156977206]
We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training.
We achieve notable improvements in specialized topic comprehension using a curated set of astronomy corpora.
We present an extension of AstroLLaMA: the fine-tuning of the 7B LLaMA model on a domain-specific conversational dataset, culminating in the release of the chat-enabled AstroLLaMA for community use.
arXiv Detail & Related papers (2024-01-03T04:47:02Z) - GeoGalactica: A Scientific Large Language Model in Geoscience [95.15911521220052]
Large language models (LLMs) have achieved huge success for their general knowledge and ability to solve a wide spectrum of tasks in natural language processing (NLP).
We specialize an LLM for geoscience by further pre-training the model on a vast amount of geoscience text, and by supervised fine-tuning (SFT) of the resulting model with our custom-collected instruction-tuning dataset.
We train GeoGalactica on a geoscience-related text corpus containing 65 billion tokens, the largest geoscience-specific text corpus to date.
Then we fine-tune the model with 1 million pairs of instruction-tuning data.
arXiv Detail & Related papers (2023-12-31T09:22:54Z) - Large Language Models for Scientific Synthesis, Inference and Explanation [56.41963802804953]
We show how large language models can perform scientific synthesis, inference, and explanation.
We show that the large language model can augment this "knowledge" by synthesizing from the scientific literature.
This approach has the further advantage that the large language model can explain the machine learning system's predictions.
arXiv Detail & Related papers (2023-10-12T02:17:59Z) - AstroLLaMA: Towards Specialized Foundation Models in Astronomy [1.1694367694169385]
We introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv.
Our model produces more insightful and scientifically relevant text completions and embedding extraction than state-of-the-art foundation models.
Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
arXiv Detail & Related papers (2023-09-12T11:02:27Z) - The Semantic Scholar Open Data Platform [79.4493235243312]
Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature.
We combine public and proprietary data sources using state-of-the-art techniques for scholarly PDF content extraction and automatic knowledge graph construction.
The graph includes advanced semantic features such as structurally parsed text, natural language summaries, and vector embeddings.
arXiv Detail & Related papers (2023-01-24T17:13:08Z) - Speech-to-Speech Translation For A Real-world Unwritten Language [62.414304258701804]
We study speech-to-speech translation (S2ST) that translates speech from one language into another language.
We present an end-to-end solution from training data collection, modeling choices to benchmark dataset release.
arXiv Detail & Related papers (2022-11-11T20:21:38Z) - Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy [0.0]
We trace the evolution of connectionism in astronomy through its three waves.
We argue for the adoption of GPT-like foundation models fine-tuned for astronomical applications.
arXiv Detail & Related papers (2022-11-07T19:00:00Z) - Building astroBERT, a language model for Astronomy & Astrophysics [1.4587241287997816]
We are applying modern machine learning and natural language processing techniques to the NASA Astrophysics Data System (ADS) dataset.
We are training astroBERT, a deeply contextual language model based on research at Google.
Using astroBERT, we aim to enrich the ADS dataset and improve its discoverability, and in particular we are developing our own named entity recognition tool.
arXiv Detail & Related papers (2021-12-01T16:01:46Z) - First Full-Event Reconstruction from Imaging Atmospheric Cherenkov Telescope Real Data with Deep Learning [55.41644538483948]
The Cherenkov Telescope Array is the future of ground-based gamma-ray astronomy.
Its first prototype telescope built on-site, the Large Size Telescope 1, is currently under commissioning and taking its first scientific data.
We present for the first time the development of a full-event reconstruction based on deep convolutional neural networks and its application to real data.
arXiv Detail & Related papers (2021-05-31T12:51:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.