Measuring Sentence-Level and Aspect-Level (Un)certainty in Science
Communications
- URL: http://arxiv.org/abs/2109.14776v1
- Date: Thu, 30 Sep 2021 00:50:51 GMT
- Title: Measuring Sentence-Level and Aspect-Level (Un)certainty in Science
Communications
- Authors: Jiaxin Pei, David Jurgens
- Abstract summary: We introduce a new study of certainty that models both the level and the aspects of certainty in scientific findings.
We show that both the overall certainty and individual aspects can be predicted with pre-trained language models.
- Score: 9.36599317326032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Certainty and uncertainty are fundamental to science communication. Hedges
have widely been used as proxies for uncertainty. However, certainty is a
complex construct, with authors expressing not only the degree but the type and
aspects of uncertainty in order to give the reader a certain impression of what
is known. Here, we introduce a new study of certainty that models both the
level and the aspects of certainty in scientific findings. Using a new dataset
of 2167 annotated scientific findings, we demonstrate that hedges alone account
for only a partial explanation of certainty. We show that both the overall
certainty and individual aspects can be predicted with pre-trained language
models, providing a more complete picture of the author's intended
communication. Downstream analyses on 431K scientific findings from news and
scientific abstracts demonstrate that modeling sentence-level and aspect-level
certainty is meaningful for areas like science communication. Both the model
and datasets used in this paper are released at
https://blablablab.si.umich.edu/projects/certainty/.
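The abstract argues that hedges alone give only a partial picture of certainty. For contrast, the hedge-as-proxy baseline from prior work can be sketched as a simple lexicon count; the lexicon and scoring rule below are illustrative assumptions, not the paper's method or cue list:

```python
import re

# Small illustrative hedge lexicon; hedge cue inventories used in the
# uncertainty literature are far more extensive than this.
HEDGES = {"may", "might", "could", "suggest", "suggests", "possibly",
          "likely", "appears", "seems", "perhaps"}

def hedge_score(sentence: str) -> float:
    """Fraction of tokens that are hedge cues (0.0 = no hedging detected)."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return 0.0
    return sum(t in HEDGES for t in tokens) / len(tokens)

# A hedged finding scores higher than a flatly asserted one:
print(hedge_score("The treatment reduces mortality."))              # 0.0
print(hedge_score("The treatment may possibly reduce mortality."))  # > 0
```

A scorer like this captures the degree of hedging but says nothing about which aspect of a finding (e.g. its extent, source, or conditions) the uncertainty attaches to, which is the gap the paper's aspect-level models are meant to fill.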
Related papers
- Knowing When Not to Answer: Abstention-Aware Scientific Reasoning [2.680633756465714]
In scientific settings, unsupported or uncertain conclusions can be more harmful than abstaining. We study this problem through an abstention-aware verification framework. We evaluate this framework across two scientific benchmarks: SciFact and PubMedQA.
arXiv Detail & Related papers (2026-02-15T15:29:43Z) - SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines [112.78540935201558]
We present a scientific reasoning foundation model that aligns natural language with heterogeneous scientific representations. The model is pretrained on a 206B-token corpus spanning scientific text, pure sequences, and sequence-text pairs, then aligned via SFT on 40M instructions. It supports five capability families, covering up to 103 tasks: (i) faithful translation between text and scientific formats, (ii) text/knowledge extraction, (iii) property prediction, (iv) property classification, (v) unconditional and conditional sequence generation and design.
arXiv Detail & Related papers (2025-09-25T17:52:06Z) - ScienceMeter: Tracking Scientific Knowledge Updates in Language Models [79.33626657942169]
Large Language Models (LLMs) are increasingly used to support scientific research, but their knowledge of scientific advancements can quickly become outdated. We introduce ScienceMeter, a new framework for evaluating scientific knowledge update methods over scientific knowledge spanning the past, present, and future.
arXiv Detail & Related papers (2025-05-30T07:28:20Z) - On the definition and importance of interpretability in scientific machine learning [0.0]
Researchers in the physical sciences seek not just predictive models, but also to uncover the fundamental principles that govern a system of interest. We argue that researchers in equation discovery and symbolic regression tend to conflate the concept of sparsity with interpretability. Our notion of interpretability emphasizes understanding of the mechanism over mathematical sparsity.
arXiv Detail & Related papers (2025-05-16T20:16:14Z) - Measuring and Analyzing Subjective Uncertainty in Scientific Communications [1.3154296174423619]
This work measures and analyzes subjective uncertainty and its impact within scientific communities across different disciplines.
We showed that the level of this type of uncertainty varies significantly across different fields, years of publication and geographical locations.
We also studied the correlation between subjective uncertainty and several metrics, such as the number and gender of authors, the centrality of the field's community, and citation count.
arXiv Detail & Related papers (2025-03-27T03:12:50Z) - Quantum-Like Contextuality in Large Language Models [0.7373617024876724]
This paper provides the first large-scale experimental evidence that quantum-like contextuality arises in natural language.
We construct a linguistic schema modelled over a contextual quantum scenario, instantiate it in the Simple English Wikipedia and extract probability distributions for the instances.
We prove that the contextual instances come from semantically similar words by deriving an equation between degrees of contextuality and Euclidean distances of BERT's embedding vectors.
arXiv Detail & Related papers (2024-12-21T23:46:55Z) - Causal Representation Learning in Temporal Data via Single-Parent Decoding [66.34294989334728]
Scientific research often seeks to understand the causal structure underlying high-level variables in a system.
Scientists typically collect low-level measurements, such as geographically distributed temperature readings.
We propose a differentiable method, Causal Discovery with Single-parent Decoding, that simultaneously learns the underlying latents and a causal graph over them.
arXiv Detail & Related papers (2024-10-09T15:57:50Z) - Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness [106.52630978891054]
We present a taxonomy of uncertainty specific to vision-language AI systems.
We also introduce a new metric, confidence-weighted accuracy, which is well correlated with both accuracy and calibration error.
arXiv Detail & Related papers (2024-07-02T04:23:54Z) - Empirical evaluation of Uncertainty Quantification in
Retrieval-Augmented Language Models for Science [0.0]
This study investigates how uncertainty scores vary when scientific knowledge is incorporated as pretraining and retrieval data.
We observe that an existing RALM finetuned with scientific knowledge as the retrieval data tends to be more confident in generating predictions.
We also find that RALMs are overconfident in their predictions, making inaccurate predictions more confidently than accurate ones.
arXiv Detail & Related papers (2023-11-15T20:42:11Z) - Can Large Language Models Discern Evidence for Scientific Hypotheses? Case Studies in the Social Sciences [3.9985385067438344]
A strong hypothesis is a best guess based on existing evidence and informed by a comprehensive view of relevant literature.
With the exponential increase in the number of scientific articles published annually, manually aggregating and synthesizing evidence related to a given hypothesis is a challenge.
We share a novel dataset for the task of scientific hypothesis evidencing using community-driven annotations of studies in the social sciences.
arXiv Detail & Related papers (2023-09-07T04:15:17Z) - Large Language Models for Automated Open-domain Scientific Hypotheses Discovery [50.40483334131271]
This work proposes the first dataset for social science academic hypotheses discovery.
Unlike previous settings, the new dataset requires (1) using open-domain data (raw web corpus) as observations; and (2) proposing hypotheses even new to humanity.
A multi-module framework is developed for the task, including three different feedback mechanisms to boost performance.
arXiv Detail & Related papers (2023-09-06T05:19:41Z) - UnScientify: Detecting Scientific Uncertainty in Scholarly Full Text [5.318135784473086]
UnScientify is an interactive system designed to detect scientific uncertainty in scholarly full text.
The pipeline for the system includes a combination of pattern matching, complex sentence checking, and authorial reference checking.
UnScientify provides interpretable results, aiding in the comprehension of identified instances of scientific uncertainty in text.
arXiv Detail & Related papers (2023-07-26T15:04:24Z) - Modeling Information Change in Science Communication with Semantically
Matched Paraphrases [50.67030449927206]
SPICED is the first paraphrase dataset of scientific findings annotated for degree of information change.
SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers.
Models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims.
arXiv Detail & Related papers (2022-10-24T07:44:38Z) - SciLander: Mapping the Scientific News Landscape [8.504643390943409]
We introduce SciLander, a method for learning representations of news sources reporting on science-based topics.
We evaluate our method on a novel COVID-19 dataset containing nearly 1M news articles from 500 sources spanning a period of 18 months since the beginning of the pandemic in 2020.
arXiv Detail & Related papers (2022-05-16T20:20:43Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - Semantic Analysis for Automated Evaluation of the Potential Impact of
Research Articles [62.997667081978825]
This paper presents a novel method for vector representation of text meaning based on information theory.
We show how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus.
We show that an informational approach to representing the meaning of a text has offered a way to effectively predict the scientific impact of research papers.
arXiv Detail & Related papers (2021-04-26T20:37:13Z) - The Rediscovery Hypothesis: Language Models Need to Meet Linguistics [8.293055016429863]
We study whether linguistic knowledge is a necessary condition for good performance of modern language models.
We show that language models that are significantly compressed but perform well on their pretraining objectives retain good scores when probed for linguistic structures.
This result supports the rediscovery hypothesis and leads to the second contribution of our paper: an information-theoretic framework that relates language modeling objective with linguistic information.
arXiv Detail & Related papers (2021-03-02T15:57:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.