Large Language Models: New Opportunities for Access to Science
- URL: http://arxiv.org/abs/2501.07250v1
- Date: Mon, 13 Jan 2025 11:58:27 GMT
- Title: Large Language Models: New Opportunities for Access to Science
- Authors: Jutta Schnabel
- Abstract summary: The uptake of Retrieval Augmented Generation-enhanced chat applications in the construction of the open science environment of the KM3NeT neutrino detectors serves as a focus point to explore and exemplify prospects for the wider application of Large Language Models for our science.
- Score: 0.0
- Abstract: The adaptation of Large Language Models like ChatGPT for information retrieval from scientific data, software and publications offers new opportunities to simplify access to and understanding of science for people at all levels of expertise. They can become tools both to enhance the usability of the open science environment we are building and to provide systematic insight into a long-built corpus of scientific publications. The uptake of Retrieval Augmented Generation-enhanced chat applications in the construction of the open science environment of the KM3NeT neutrino detectors serves as a focus point to explore and exemplify prospects for the wider application of Large Language Models for our science.
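The Retrieval Augmented Generation pattern the abstract refers to can be pictured in a few lines: documents are embedded, the passages most similar to a query are retrieved, and those passages are prepended to the LLM prompt so the answer is grounded in them. The following is a minimal, self-contained sketch with a toy bag-of-words embedding and a stubbed generation step; none of the function names or documents are taken from the KM3NeT system.

```python
# Minimal retrieval-augmented generation (RAG) skeleton.
# The embedding and generation steps are stubbed so the control flow
# stays self-contained; a real system would call a neural encoder and
# an LLM here. All names are illustrative.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; real systems use a neural encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def answer(query: str, corpus: list[str]) -> str:
    # Ground the prompt in the retrieved passages.
    context = "\n".join(retrieve(query, corpus))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return prompt  # a real application would send this prompt to an LLM

docs = [
    "KM3NeT is a neutrino detector under construction in the Mediterranean Sea.",
    "Open science environments publish data, software and documentation openly.",
    "Retrieval augmented generation grounds LLM answers in retrieved documents.",
]
print(answer("What is KM3NeT?", docs))
```

The design point, as in the paper's use case, is that the LLM answers from the retrieved scientific material rather than from its parametric memory alone, which makes the answers auditable against the source documents.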
Related papers
- ByteScience: Bridging Unstructured Scientific Literature and Structured Data with Auto Fine-tuned Large Language Model in Token Granularity [13.978222668670192]
ByteScience is a non-profit cloud-based auto fine-tuned Large Language Model (LLM) platform.
It is designed to extract structured scientific data and synthesize new scientific knowledge from vast scientific corpora.
The platform achieves remarkable accuracy with only a small amount of well-annotated articles.
arXiv Detail & Related papers (2024-11-18T19:36:26Z)
- Knowledge AI: Fine-tuning NLP Models for Facilitating Scientific Knowledge Extraction and Understanding [0.0]
This project investigates the efficacy of Large Language Models (LLMs) in understanding and extracting scientific knowledge across specific domains.
We employ pre-trained models and fine-tune them on datasets in the scientific domain.
arXiv Detail & Related papers (2024-08-04T01:32:09Z)
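As an illustration of the fine-tuning recipe the Knowledge AI entry above describes, here is a hedged sketch using the Hugging Face Trainer API; the base model, the stand-in dataset and the hyperparameters are placeholders, not the paper's actual setup.

```python
# Sketch: fine-tuning a pre-trained model on a domain dataset with the
# Hugging Face Trainer API. Model, dataset and hyperparameters are
# placeholders; swap in a scientific-domain corpus for real use.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"      # placeholder base model
dataset = load_dataset("ag_news")           # stand-in corpus with 4 labels
tokenizer = AutoTokenizer.from_pretrained(model_name)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(model_name,
                                                           num_labels=4)

args = TrainingArguments(output_dir="out", num_train_epochs=1,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args,
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["test"]).train()
```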
- A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery [68.48094108571432]
Large language models (LLMs) have revolutionized the way text and other modalities of data are handled.
We aim to provide a more holistic view of the research landscape by unveiling cross-field and cross-modal connections between scientific LLMs.
arXiv Detail & Related papers (2024-06-16T08:03:24Z)
- LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery [141.39722070734737]
We propose to enhance the knowledge-driven, abstract reasoning abilities of Large Language Models with the computational strength of simulations.
We introduce Scientific Generative Agent (SGA), a bilevel optimization framework.
We conduct experiments to demonstrate our framework's efficacy in law discovery and molecular design.
arXiv Detail & Related papers (2024-05-16T03:04:10Z)
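The bilevel idea behind the SGA entry above can be pictured as an outer loop in which an LLM proposes candidate hypotheses and an inner loop in which a simulation scores them. The sketch below substitutes a random perturbation for the LLM proposer and a toy objective for the simulator; it illustrates the control flow only, not SGA's actual components.

```python
# Schematic bilevel loop in the spirit of LLM-proposes / simulation-scores.
# propose() stands in for an LLM call and simulate() for a physics
# simulator; both are toy stand-ins, not components of SGA itself.
import random

def propose(best: float, history: list[tuple[float, float]]) -> float:
    # Outer level: an LLM would reason over the history of
    # (candidate, score) pairs; here we just perturb the current best.
    return best + random.gauss(0.0, 0.5)

def simulate(candidate: float) -> float:
    # Inner level: a numerical simulation evaluates the candidate;
    # here the "law" to discover is the maximum of -(x - 2)^2.
    return -((candidate - 2.0) ** 2)

best, best_score, history = 0.0, float("-inf"), []
for _ in range(200):
    cand = propose(best, history)
    score = simulate(cand)
    history.append((cand, score))
    if score > best_score:
        best, best_score = cand, score

print(f"best candidate ~ {best:.3f}")  # should approach 2.0
```

The division of labour is the point: the proposer only needs abstract, knowledge-driven guesses, while all quantitative evaluation is delegated to the simulation.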
- Scientific Large Language Models: A Survey on Biological & Chemical Domains [47.97810890521825]
Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension.
The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines.
As a burgeoning area in the community of AI for Science, scientific LLMs warrant comprehensive exploration.
arXiv Detail & Related papers (2024-01-26T05:33:34Z)
- Understanding Practices around Computational News Discovery Tools in the Domain of Science Journalism [3.660182910533372]
We explore computational methods to aid these journalists' news discovery in terms of time-efficiency and agency.
We prototyped three computational information subsidies into an interactive tool that we used as a probe to better understand how such a tool may offer utility.
Our findings contribute a richer view of the sociotechnical system around computational news discovery tools, and suggest ways to improve such tools to better support the practices of science journalists.
arXiv Detail & Related papers (2023-11-12T14:47:50Z)
- Large Language Models for Scientific Synthesis, Inference and Explanation [56.41963802804953]
We show how large language models can perform scientific synthesis, inference, and explanation.
We show that the large language model can augment this "knowledge" by synthesizing from the scientific literature.
This approach has the further advantage that the large language model can explain the machine learning system's predictions.
arXiv Detail & Related papers (2023-10-12T02:17:59Z)
- Modeling Information Change in Science Communication with Semantically Matched Paraphrases [50.67030449927206]
SPICED is the first paraphrase dataset of scientific findings annotated for degree of information change.
SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers.
Models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims.
arXiv Detail & Related papers (2022-10-24T07:44:38Z)
- KnowledgeShovel: An AI-in-the-Loop Document Annotation System for Scientific Knowledge Base Construction [46.56643271476249]
KnowledgeShovel is an AI-in-the-Loop document annotation system for researchers to construct scientific knowledge bases.
The design of KnowledgeShovel introduces a multi-step, multi-modal AI collaboration pipeline to improve data accuracy while reducing the human burden.
A follow-up user evaluation with 7 geoscience researchers shows that KnowledgeShovel can enable efficient construction of scientific knowledge bases with satisfactory accuracy.
arXiv Detail & Related papers (2022-10-06T11:38:18Z)
- Semantic and Relational Spaces in Science of Science: Deep Learning Models for Article Vectorisation [4.178929174617172]
We focus on document-level embeddings based on the semantic and relational aspects of articles, using Natural Language Processing (NLP) and Graph Neural Networks (GNNs).
Our results show that using NLP we can encode a semantic space of articles, while with GNNs we are able to build a relational space where the social practices of a research community are also encoded.
arXiv Detail & Related papers (2020-11-05T14:57:41Z)
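The two spaces from the entry above can be illustrated schematically: a text encoder produces semantic embeddings, and one step of neighbour averaging over a citation graph produces relational embeddings. Everything below (the encoder, the graph, the dimensions) is a toy stand-in for the paper's NLP and GNN models.

```python
# Toy illustration of semantic vs. relational article embeddings.
# A seeded pseudo-random vector stands in for an NLP encoder; one round
# of neighbour averaging stands in for a GNN layer over a citation graph.
import numpy as np

def semantic_embedding(abstract: str, dim: int = 8) -> np.ndarray:
    # Stand-in for a sentence encoder: a deterministic pseudo-random
    # vector seeded by the text's character codes.
    return np.random.default_rng(sum(map(ord, abstract))).normal(size=dim)

abstracts = {
    "A": "neutrino detection in sea water",
    "B": "open science data management",
    "C": "graph neural networks for citations",
}
citations = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}  # toy citation graph

semantic = {k: semantic_embedding(v) for k, v in abstracts.items()}

# One message-passing step: each article's relational embedding is the
# mean of its own and its neighbours' semantic embeddings, so position
# in the citation graph is folded into the representation.
relational = {
    k: np.mean([semantic[k]] + [semantic[n] for n in citations[k]], axis=0)
    for k in semantic
}
print(relational["B"].round(2))
```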
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.