Expressing High-Level Scientific Claims with Formal Semantics
- URL: http://arxiv.org/abs/2109.12907v1
- Date: Mon, 27 Sep 2021 09:52:49 GMT
- Title: Expressing High-Level Scientific Claims with Formal Semantics
- Authors: Cristina-Iulia Bucur and Tobias Kuhn and Davide Ceolin and Jacco van Ossenbruggen
- Abstract summary: We analyze the main claims from a sample of scientific articles from all disciplines.
We find that their semantics are more complex than what a straightforward application of formalisms like RDF or OWL can account for.
We show here how the instantiation of the five slots of this super-pattern leads to a strictly defined statement in higher-order logic.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of semantic technologies is gaining significant traction in science
communication with a wide array of applications in disciplines including the
Life Sciences, Computer Science, and the Social Sciences. Languages like RDF,
OWL, and other formalisms based on formal logic are applied to make scientific
knowledge accessible not only to human readers but also to automated systems.
These approaches have mostly focused on the structure of scientific
publications themselves, on the used scientific methods and equipment, or on
the structure of the used datasets. The core claims or hypotheses of scientific
work have only been covered in a shallow manner, such as by linking mentioned
entities to established identifiers. In this research, we therefore want to
find out whether we can use existing semantic formalisms to fully express the
content of high-level scientific claims using formal semantics in a systematic
way. Analyzing the main claims from a sample of scientific articles from all
disciplines, we find that their semantics are more complex than what a
straightforward application of formalisms like RDF or OWL can account for, but
we managed to elicit a clear semantic pattern, which we call the 'super-pattern'.
We show here how the instantiation of the five slots of this super-pattern
leads to a strictly defined statement in higher-order logic. We successfully
applied this super-pattern to an enlarged sample of scientific claims. We show
that knowledge representation experts, when instructed to independently
instantiate the super-pattern with given scientific claims, show a high degree
of consistency and convergence given the complexity of the task and the
subject. These results therefore open the door to expressing high-level
scientific findings in a manner in which they can be automatically interpreted,
which in the longer run can enable automated consistency checking, and much
more.
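As a rough illustration of how the five slots of the super-pattern might be instantiated, here is a minimal sketch. The slot names (context, subject, qualifier, relation, object) and the small qualifier and relation vocabularies are assumptions inferred from the abstract, and the example claim is hypothetical, not taken from the paper.

```python
from dataclasses import dataclass

# Illustrative closed vocabularies for two of the slots (assumed, not
# the paper's actual lists).
QUALIFIERS = {"always", "generally", "mostly", "frequently", "sometimes", "never"}
RELATIONS = {"affects", "causes", "contributes-to", "enables", "inhibits"}

@dataclass(frozen=True)
class SuperPatternClaim:
    context: str    # class restricting where the claim applies
    subject: str    # class of things the claim is about
    qualifier: str  # how often the relation is taken to hold
    relation: str   # relation linking subject instances to object instances
    object: str     # class of things the subject relates to

    def __post_init__(self):
        if self.qualifier not in QUALIFIERS:
            raise ValueError(f"unknown qualifier: {self.qualifier}")
        if self.relation not in RELATIONS:
            raise ValueError(f"unknown relation: {self.relation}")

    def to_logic(self) -> str:
        # Informal rendering of the higher-order-logic reading:
        # "in the given context, instances of the subject class
        # <qualifier> <relation> some instance of the object class".
        return (f"in the context of {self.context}, "
                f"{self.subject} {self.qualifier} "
                f"{self.relation} {self.object}")

claim = SuperPatternClaim(
    context="post-operative care",
    subject="early mobilization",
    qualifier="generally",
    relation="contributes-to",
    object="faster patient recovery",
)
print(claim.to_logic())
```

Constraining the qualifier and relation slots to fixed vocabularies is what makes such instantiations machine-checkable: two independent annotators filling the same five slots can be compared term by term.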
Related papers
- LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery [141.39722070734737]
We propose to enhance the knowledge-driven, abstract reasoning abilities of Large Language Models with the computational strength of simulations.
We introduce Scientific Generative Agent (SGA), a bilevel optimization framework.
We conduct experiments to demonstrate our framework's efficacy in law discovery and molecular design.
arXiv Detail & Related papers (2024-05-16T03:04:10Z)
- SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning [60.14510984576027]
SciGLM is a suite of scientific language models able to conduct college-level scientific reasoning.
We apply a self-reflective instruction annotation framework to generate step-by-step reasoning for unlabelled scientific questions.
We fine-tuned the ChatGLM family of language models with SciInstruct, enhancing their scientific and mathematical reasoning capabilities.
arXiv Detail & Related papers (2024-01-15T20:22:21Z)
- An Interdisciplinary Outlook on Large Language Models for Scientific Research [3.4108358650013573]
We describe the capabilities and constraints of Large Language Models (LLMs) within disparate academic disciplines, aiming to delineate their strengths and limitations with precision.
We examine how LLMs augment scientific inquiry, offering concrete examples such as accelerating literature review by summarizing vast numbers of publications.
We articulate the challenges LLMs face, including their reliance on extensive and sometimes biased datasets, and the potential ethical dilemmas stemming from their use.
arXiv Detail & Related papers (2023-11-03T19:41:09Z)
- Large Language Models for Scientific Synthesis, Inference and Explanation [56.41963802804953]
We show how large language models can perform scientific synthesis, inference, and explanation.
We show that the large language model can augment this "knowledge" by synthesizing from the scientific literature.
This approach has the further advantage that the large language model can explain the machine learning system's predictions.
arXiv Detail & Related papers (2023-10-12T02:17:59Z)
- Large Language Models for Automated Open-domain Scientific Hypotheses Discovery [50.40483334131271]
This work proposes the first dataset for social science academic hypotheses discovery.
Unlike previous settings, the new dataset requires (1) using open-domain data (raw web corpus) as observations; and (2) proposing hypotheses even new to humanity.
A multi-module framework is developed for the task, including three different feedback mechanisms to boost performance.
arXiv Detail & Related papers (2023-09-06T05:19:41Z)
- Modeling Information Change in Science Communication with Semantically Matched Paraphrases [50.67030449927206]
SPICED is the first paraphrase dataset of scientific findings annotated for degree of information change.
SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers.
Models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims.
arXiv Detail & Related papers (2022-10-24T07:44:38Z)
- AI Research Associate for Early-Stage Scientific Discovery [1.6861004263551447]
Artificial intelligence (AI) has been increasingly applied in scientific activities for decades.
We present an AI research associate for early-stage scientific discovery based on a novel minimally-biased physics-based modeling.
arXiv Detail & Related papers (2022-02-02T17:05:52Z)
- Automated Creation and Human-assisted Curation of Computable Scientific Models from Code and Text [2.3746609573239756]
Domain experts cannot gain a complete understanding of the implementation of a scientific model if they are not familiar with the code.
We develop a system for the automated creation and human-assisted curation of scientific models.
We present experimental results obtained using a dataset of code and associated text derived from NASA's Hypersonic Aerodynamics website.
arXiv Detail & Related papers (2022-01-28T17:31:38Z)
- Fact-driven Logical Reasoning for Machine Reading Comprehension [82.58857437343974]
We are motivated to cover both commonsense and temporary knowledge clues hierarchically.
Specifically, we propose a general formalism of knowledge units by extracting backbone constituents of the sentence.
We then construct a supergraph on top of the fact units, allowing for the benefit of sentence-level (relations among fact groups) and entity-level interactions.
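The supergraph of fact units described above can be sketched roughly as follows. The fact-unit format, the entity-sharing rule for entity-level edges, and the toy data are illustrative assumptions, not the paper's actual pipeline.

```python
from itertools import combinations

# Toy fact units: (sentence_id, subject, relation, object) backbone
# constituents extracted per sentence (assumed format).
facts = [
    (0, "alice", "works-at", "acme"),
    (0, "acme", "located-in", "berlin"),
    (1, "alice", "lives-in", "berlin"),
]

entity_edges = set()    # units that mention a common entity
sentence_edges = set()  # units extracted from the same sentence

for (i, f), (j, g) in combinations(enumerate(facts), 2):
    if {f[1], f[3]} & {g[1], g[3]}:  # shared subject/object entity
        entity_edges.add((i, j))
    if f[0] == g[0]:                 # same source sentence
        sentence_edges.add((i, j))

print(sorted(entity_edges))
print(sorted(sentence_edges))
```

The two edge sets give the sentence-level (relations among fact groups) and entity-level interactions that the supergraph is built from.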
arXiv Detail & Related papers (2021-05-21T13:11:13Z)
- Knowledge Elicitation using Deep Metric Learning and Psychometric Testing [15.989397781243225]
We provide a method for efficient hierarchical knowledge elicitation from experts working with high-dimensional data such as images or videos.
The developed models embed the high-dimensional data in a metric space where distances are semantically meaningful, and the data can be organized in a hierarchical structure.
arXiv Detail & Related papers (2020-04-14T08:33:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.