A Content-Based Novelty Measure for Scholarly Publications: A Proof of
Concept
- URL: http://arxiv.org/abs/2401.03642v2
- Date: Tue, 16 Jan 2024 01:05:59 GMT
- Title: A Content-Based Novelty Measure for Scholarly Publications: A Proof of
Concept
- Authors: Haining Wang
- Abstract summary: We introduce an information-theoretic measure of novelty in scholarly publications.
This measure quantifies the degree of 'surprise' perceived by a language model that represents the word distribution of scholarly discourse.
- Score: 9.148691357200216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Novelty, akin to gene mutation in evolution, opens possibilities for
scholarly advancement. Although peer review remains the gold standard for
evaluating novelty in scholarly communication and resource allocation, the vast
volume of submissions necessitates an automated measure of scholarly novelty.
Adopting a perspective that views novelty as the atypical combination of
existing knowledge, we introduce an information-theoretic measure of novelty in
scholarly publications. This measure quantifies the degree of 'surprise'
perceived by a language model that represents the word distribution of
scholarly discourse. The proposed measure is accompanied by face and construct
validity evidence; the former demonstrates correspondence to scientific common
sense, and the latter is endorsed through alignment with novelty evaluations
from a select panel of domain experts. Additionally, characterized by its
interpretability, fine granularity, and accessibility, this measure addresses
gaps prevalent in existing methods. We believe this measure holds great
potential to benefit editors, stakeholders, and policymakers, and it provides a
reliable lens for examining the relationship between novelty and academic
dynamics such as creativity, interdisciplinarity, and scientific advances.
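The 'surprise' the abstract refers to can be illustrated with the standard information-theoretic notion of surprisal, -log2 p(w), averaged over the words of a text. The sketch below is only an illustration under stated assumptions: the abstract does not specify the language model, so a toy add-one-smoothed unigram model stands in for it, and the function names `train_unigram` and `mean_surprisal` and the example sentences are hypothetical.

```python
import math
from collections import Counter


def train_unigram(corpus_tokens):
    """Return a function giving add-one-smoothed unigram probabilities.

    Assumption: a unigram model is a stand-in for whatever language
    model the paper actually uses; it is not the author's method.
    """
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot reserves mass for unseen words

    def prob(token):
        return (counts[token] + 1) / (total + vocab_size)

    return prob


def mean_surprisal(tokens, prob):
    """Average surprisal in bits per token, where surprisal = -log2 p(token)."""
    return sum(-math.log2(prob(t)) for t in tokens) / len(tokens)


# Hypothetical reference corpus standing in for "scholarly discourse".
reference = "the model learns the distribution of words in the corpus".split()
prob = train_unigram(reference)

typical = "the model learns the words".split()
atypical = "quantum lattice learns the words".split()

print("typical: ", mean_surprisal(typical, prob))
print("atypical:", mean_surprisal(atypical, prob))
```

Under this toy model, the text containing words unseen in the reference corpus receives the higher average surprisal, which is the sense in which an atypical combination of words would register as more 'surprising'.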
Related papers
- Citation Structural Diversity: A Novel and Concise Metric Combining Structure and Semantics for Literature Evaluation [0.562479170374811]
The study examines the influence of the proposed model of citation structural diversity on citation volume and long-term academic impact.
The findings reveal that literature with higher citation structural diversity demonstrates notable advantages in both citation frequency and sustained academic influence.
arXiv Detail & Related papers (2025-01-05T03:24:37Z)
- Discovering emergent connections in quantum physics research via dynamic word embeddings [0.562479170374811]
We introduce a novel approach based on dynamic word embeddings for concept combination prediction.
Unlike knowledge graphs, our method captures implicit relationships between concepts, can be learned in a fully unsupervised manner, and encodes a broader spectrum of information.
Our findings suggest that this representation offers a more flexible and informative way of modeling conceptual relationships in scientific literature.
arXiv Detail & Related papers (2024-11-10T19:45:59Z)
- A Survey on Emergent Language [9.823821010022932]
The paper provides a comprehensive review of 181 scientific publications on emergent language in artificial intelligence.
Its objective is to serve as a reference for researchers interested in or proficient in the field.
arXiv Detail & Related papers (2024-09-04T12:22:05Z)
- Inclusivity in Large Language Models: Personality Traits and Gender Bias in Scientific Abstracts [49.97673761305336]
We evaluate three large language models (LLMs) for their alignment with human narrative styles and potential gender biases.
Our findings indicate that, while these models generally produce text closely resembling human-authored content, variations in stylistic features suggest significant gender biases.
arXiv Detail & Related papers (2024-06-27T19:26:11Z)
- A Literature Review of Literature Reviews in Pattern Analysis and Machine Intelligence [55.33653554387953]
Pattern Analysis and Machine Intelligence (PAMI) has led to numerous literature reviews aimed at collecting and synthesizing fragmented information.
This paper presents a thorough analysis of these literature reviews within the PAMI field.
We try to address three core research questions: (1) What are the prevalent structural and statistical characteristics of PAMI literature reviews; (2) What strategies can researchers employ to efficiently navigate the growing corpus of reviews; and (3) What are the advantages and limitations of AI-generated reviews compared to human-authored ones.
arXiv Detail & Related papers (2024-02-20T11:28:50Z)
- BBScore: A Brownian Bridge Based Metric for Assessing Text Coherence [20.507596002357655]
Coherent texts inherently manifest a sequential and cohesive interplay among sentences.
BBScore is a reference-free metric grounded in Brownian bridge theory for assessing text coherence.
arXiv Detail & Related papers (2023-12-28T08:34:17Z)
- Knowledge Graph Context-Enhanced Diversified Recommendation [53.3142545812349]
This research explores diversified RecSys within the intricate context of knowledge graphs (KG).
Our contributions include introducing two innovative metrics, Entity Coverage and Relation Coverage, which effectively quantify diversity within the KG domain.
In tandem with this, we introduce a novel technique named Conditional Alignment and Uniformity (CAU) which encodes KG item embeddings while preserving contextual integrity.
arXiv Detail & Related papers (2023-10-20T03:18:57Z)
- Exploring and Verbalizing Academic Ideas by Concept Co-occurrence [42.16213986603552]
This study devises a framework based on concept co-occurrence for academic idea inspiration.
We construct evolving concept graphs according to the co-occurrence relationship of concepts from 20 disciplines or topics.
We generate a description of an idea based on a new data structure called co-occurrence citation quintuple.
arXiv Detail & Related papers (2023-06-04T07:01:30Z)
- SciMON: Scientific Inspiration Machines Optimized for Novelty [68.46036589035539]
We explore and enhance the ability of neural language models to generate novel scientific directions grounded in literature.
We take a dramatic departure with a novel setting in which models take background contexts as input.
We present SciMON, a modeling framework that uses retrieval of "inspirations" from past scientific papers.
arXiv Detail & Related papers (2023-05-23T17:12:08Z)
- Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
- What's New? Summarizing Contributions in Scientific Literature [85.95906677964815]
We introduce a new task of disentangled paper summarization, which seeks to generate separate summaries for the paper contributions and the context of the work.
We extend the S2ORC corpus of academic articles by adding disentangled "contribution" and "context" reference labels.
We propose a comprehensive automatic evaluation protocol which reports the relevance, novelty, and disentanglement of generated outputs.
arXiv Detail & Related papers (2020-11-06T02:23:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.