De-jargonizing Science for Journalists with GPT-4: A Pilot Study
- URL: http://arxiv.org/abs/2410.12069v1
- Date: Tue, 15 Oct 2024 21:10:01 GMT
- Authors: Sachita Nishal, Eric Lee, Nicholas Diakopoulos
- Abstract summary: The system achieves fairly high recall in identifying jargon and preserves relative differences in readers' jargon identification.
The findings highlight the potential of generative AI for assisting science reporters, and can inform future work on developing tools to simplify dense documents.
- Score: 3.730699089967391
- Abstract: This study offers an initial evaluation of a human-in-the-loop system leveraging GPT-4 (a large language model or LLM), and Retrieval-Augmented Generation (RAG) to identify and define jargon terms in scientific abstracts, based on readers' self-reported knowledge. The system achieves fairly high recall in identifying jargon and preserves relative differences in readers' jargon identification, suggesting personalization as a feasible use-case for LLMs to support sense-making of complex information. Surprisingly, using only abstracts for context to generate definitions yields slightly more accurate and higher quality definitions than using RAG-based context from the full text of an article. The findings highlight the potential of generative AI for assisting science reporters, and can inform future work on developing tools to simplify dense documents.
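The pipeline the abstract describes can be sketched roughly as below. Everything here is an illustrative assumption, not the authors' implementation: the candidate-term extraction, the stand-in vocabulary, and the prompt wording are all hypothetical; only the overall shape (personalized jargon filtering, then a definition prompt that uses the abstract alone as context) follows the abstract.

```python
import re

# Hypothetical stand-in for common vocabulary every reader is assumed to know.
# In the study, familiarity comes from readers' self-reported annotations.
GENERAL_VOCAB = {"the", "a", "of", "in", "and", "to", "is", "we", "study", "system"}

def extract_candidate_terms(abstract: str) -> list[str]:
    """Naive candidate extraction: unique lowercase word tokens, in order."""
    tokens = re.findall(r"[a-zA-Z-]+", abstract.lower())
    seen, terms = set(), []
    for t in tokens:
        if t not in seen:
            seen.add(t)
            terms.append(t)
    return terms

def identify_jargon(abstract: str, reader_known_terms: set[str]) -> list[str]:
    """Personalized jargon: candidate terms the reader has not marked as known."""
    known = GENERAL_VOCAB | {t.lower() for t in reader_known_terms}
    return [t for t in extract_candidate_terms(abstract) if t not in known]

def build_definition_prompt(term: str, abstract: str) -> str:
    """Abstract-only context, the setting the study found slightly more accurate
    than RAG over the article's full text. Prompt wording is hypothetical."""
    return (
        f"Using only the abstract below, define the term '{term}' "
        f"for a science journalist in one sentence.\n\nAbstract:\n{abstract}"
    )

abstract = "We study retrieval-augmented generation to define jargon in a system."
reader_known = {"retrieval-augmented", "generation"}
print(identify_jargon(abstract, reader_known))  # → ['define', 'jargon']
```

The filtered terms would then each be sent to the LLM with the prompt above; that call is omitted here since it depends on a specific API and model.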
Related papers
- SteLLA: A Structured Grading System Using LLMs with RAG [2.630522349105014]
We present SteLLA (Structured Grading System Using LLMs with RAG) in which a) Retrieval Augmented Generation (RAG) is used to empower LLMs on the ASAG task.
A real-world dataset that contains students' answers in an exam was collected from a college-level Biology course.
Experiments show that our proposed system can achieve substantial agreement with the human grader while providing a breakdown of grades and feedback on all the knowledge points examined in the problem.
arXiv Detail & Related papers (2025-01-15T19:24:48Z) - GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation [108.2008975785364]
Graph Inspired Veracity Extrapolation (GIVE) is a novel reasoning method that merges parametric and non-parametric memories to improve accurate reasoning with minimal external input.
GIVE guides the LLM agent to select the most pertinent expert data (observe), engage in query-specific divergent thinking (reflect), and then synthesize this information to produce the final output (speak)
arXiv Detail & Related papers (2024-10-11T03:05:06Z) - Automating Knowledge Discovery from Scientific Literature via LLMs: A Dual-Agent Approach with Progressive Ontology Prompting [59.97247234955861]
We introduce a novel framework based on large language models (LLMs) that combines a progressive prompting algorithm with a dual-agent system, named LLM-Duo.
Our method identifies 2,421 interventions from 64,177 research articles in the speech-language therapy domain.
arXiv Detail & Related papers (2024-08-20T16:42:23Z) - Do LLMs Dream of Ontologies? [13.776194387957617]
Large Language Models (LLMs) have demonstrated remarkable memorization across diverse natural language processing tasks.
This paper investigates the extent to which general-purpose LLMs correctly reproduce concept identifier (ID)-label associations from publicly available resources.
arXiv Detail & Related papers (2024-01-26T15:10:23Z) - Large Language Models for Scientific Information Extraction: An Empirical Study for Virology [0.0]
We champion the use of structured and semantic content representation of discourse-based scholarly communication.
Inspired by tools like Wikipedia infoboxes or structured Amazon product descriptions, we develop an automated approach to produce structured scholarly contribution summaries.
Our results show that finetuned FLAN-T5 with 1000x fewer parameters than the state-of-the-art GPT-davinci is competitive for the task.
arXiv Detail & Related papers (2024-01-18T15:04:55Z) - Knowledge Graphs and Pre-trained Language Models enhanced Representation Learning for Conversational Recommender Systems [58.561904356651276]
We introduce the Knowledge-Enhanced Entity Representation Learning (KERL) framework to improve the semantic understanding of entities for conversational recommender systems.
KERL uses a knowledge graph and a pre-trained language model to improve the semantic understanding of entities.
KERL achieves state-of-the-art results in both recommendation and response generation tasks.
arXiv Detail & Related papers (2023-12-18T06:41:23Z) - Personalized Jargon Identification for Enhanced Interdisciplinary Communication [22.999616448996303]
Current methods of jargon identification mainly use corpus-level familiarity indicators.
We collect a dataset of over 10K term familiarity annotations from 11 computer science researchers.
We investigate features representing individual, sub-domain, and domain knowledge to predict individual jargon familiarity.
arXiv Detail & Related papers (2023-11-16T00:51:25Z) - Large Language Models for Information Retrieval: A Survey [58.30439850203101]
Information retrieval has evolved from term-based methods to its integration with advanced neural models.
Recent research has sought to leverage large language models (LLMs) to improve IR systems.
We delve into the confluence of LLMs and IR systems, including crucial aspects such as query rewriters, retrievers, rerankers, and readers.
arXiv Detail & Related papers (2023-08-14T12:47:22Z) - Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z) - Large-Scale Text Analysis Using Generative Language Models: A Case Study in Discovering Public Value Expressions in AI Patents [2.246222223318928]
This paper employs a novel approach using a generative language model (GPT-4) to produce labels and rationales for large-scale text analysis.
We collect a database comprising 154,934 patent documents using an advanced Boolean query submitted to InnovationQ+.
We design a framework for identifying and labeling public value expressions in these AI patent sentences.
arXiv Detail & Related papers (2023-05-17T17:18:26Z) - CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.