A Large-Language Model Framework for Relative Timeline Extraction from PubMed Case Reports
- URL: http://arxiv.org/abs/2504.12350v1
- Date: Tue, 15 Apr 2025 20:54:19 GMT
- Title: A Large-Language Model Framework for Relative Timeline Extraction from PubMed Case Reports
- Authors: Jing Wang, Jeremy C Weiss
- Abstract summary: We present a system that transforms case reports into textual time series: structured pairs of textual events and timestamps. This work may serve as a benchmark for leveraging the PMOA corpus for temporal analytics.
- Score: 10.869574822060553
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Timing of clinical events is central to characterization of patient trajectories, enabling analyses such as process tracing, forecasting, and causal reasoning. However, structured electronic health records capture few data elements critical to these tasks, while clinical reports lack temporal localization of events in structured form. We present a system that transforms case reports into textual time series: structured pairs of textual events and timestamps. We contrast manual and large language model (LLM) annotations (n=320 and n=390 respectively) of ten randomly sampled PubMed open-access (PMOA) case reports (N=152,974) and assess inter-LLM agreement (n=3,103; N=93). We find that the LLMs have moderate event recall (O1-preview: 0.80) but high temporal concordance among identified events (O1-preview: 0.95). By establishing the task, annotation, and assessment systems, and by demonstrating high concordance, this work may serve as a benchmark for leveraging the PMOA corpus for temporal analytics.
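The two quantities reported in the abstract, event recall and temporal concordance, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the abstract does not say how events are matched or how concordance is computed, so the exact-string matching, the pairwise-ordering definition of concordance, the function names, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical textual time series: (event text, hours relative to admission).
# Both lists are made-up toy data, not taken from the paper.
reference = [
    ("fever", -48.0),
    ("admission", 0.0),
    ("intubation", 6.0),
    ("antibiotics started", 8.0),
    ("discharge", 240.0),
]
predicted = [
    ("fever", -24.0),
    ("admission", 0.0),
    ("antibiotics started", 12.0),
    ("discharge", 200.0),
]


def event_recall(reference, predicted):
    """Fraction of reference events that also appear in the prediction.

    Assumption: events are matched by exact string equality.
    """
    predicted_events = {event for event, _ in predicted}
    matched = [event for event, _ in reference if event in predicted_events]
    return len(matched) / len(reference)


def temporal_concordance(reference, predicted):
    """Fraction of matched event pairs whose time ordering agrees.

    Assumption: concordance is pairwise ordering agreement; pairs that are
    tied in the reference are skipped as not comparable.
    """
    ref_time = dict(reference)
    pred_time = dict(predicted)
    shared = [event for event in ref_time if event in pred_time]
    concordant = 0
    comparable = 0
    for a, b in combinations(shared, 2):
        ref_diff = ref_time[a] - ref_time[b]
        pred_diff = pred_time[a] - pred_time[b]
        if ref_diff == 0:
            continue
        comparable += 1
        if ref_diff * pred_diff > 0:
            concordant += 1
    return concordant / comparable if comparable else float("nan")


print(event_recall(reference, predicted))          # 0.8
print(temporal_concordance(reference, predicted))  # 1.0
```

In this toy case one reference event ("intubation") is missed, so recall is 4/5, while every pair of matched events keeps its order even though the absolute timestamps differ. That is the pattern the abstract describes: moderate recall with high concordance among the events that are identified.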
Related papers
- Temporal Entailment Pretraining for Clinical Language Models over EHR Data [9.584923572354045]
We introduce a novel temporal entailment pretraining objective for language models in the clinical domain.
Our method formulates EHR segments as temporally ordered sentence pairs and trains the model to determine whether a later state is entailed by, contradictory to, or neutral with respect to an earlier state.
arXiv Detail & Related papers (2025-04-25T07:30:38Z)
- Reconstructing Sepsis Trajectories from Clinical Case Reports using LLMs: the Textual Time Series Corpus for Sepsis [7.734726150561087]
Clinical case reports and discharge summaries may be the most complete and accurate summarization of patient encounters, yet they are finalized, i.e., timestamped after the encounter.
We construct a pipeline to phenotype, extract, and annotate time-localized findings within case reports using large language models.
arXiv Detail & Related papers (2025-04-12T03:07:44Z)
- Unlocking Multimodal Integration in EHRs: A Prompt Learning Framework for Language and Time Series Fusion [27.70300880284899]
Large language models (LLMs) have shown remarkable performance in vision-language tasks, but their application in the medical field remains underexplored.
We introduce ProMedTS, a novel self-supervised multimodal framework that employs prompt-guided learning to unify data types.
We evaluate ProMedTS on disease diagnosis tasks using real-world datasets, and the results demonstrate that our method consistently outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2025-02-19T07:56:48Z)
- Detecting Neurocognitive Disorders through Analyses of Topic Evolution and Cross-modal Consistency in Visual-Stimulated Narratives [84.03001845263]
Early detection of neurocognitive disorders (NCDs) is crucial for timely intervention and disease management.
Traditional narrative analysis often focuses on local indicators in microstructure, such as word usage and syntax.
We propose to investigate specific cognitive and linguistic challenges by analyzing topical shifts, temporal dynamics, and the coherence of narratives over time.
arXiv Detail & Related papers (2025-01-07T12:16:26Z)
- Retrieval of Temporal Event Sequences from Textual Descriptions [0.0]
We introduce TESRBench, a benchmark for temporal event sequence retrieval from textual descriptions.
We propose TPP-Embedding, a novel model for embedding and retrieving event sequences.
TPP-Embedding demonstrates superior performance over baseline models across TESRBench datasets.
arXiv Detail & Related papers (2024-10-17T21:35:55Z)
- GenDFIR: Advancing Cyber Incident Timeline Analysis Through Retrieval Augmented Generation and Large Language Models [0.08192907805418582]
Cyber timeline analysis is crucial in Digital Forensics and Incident Response (DFIR).
Traditional methods rely on structured artefacts, such as logs and metadata, for evidence identification and feature extraction.
This paper introduces GenDFIR, a framework leveraging large language models (LLMs), specifically Llama 3.1 8B in zero-shot mode, integrated with a Retrieval-Augmented Generation (RAG) agent.
arXiv Detail & Related papers (2024-09-04T09:46:33Z)
- Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding [57.62275091656578]
We refer to a complex event composed of many news articles over an extended period as a Temporal Complex Event (TCE).
This paper proposes a novel approach using Large Language Models (LLMs) to systematically extract and analyze the event chain within TCE.
arXiv Detail & Related papers (2024-06-04T16:42:17Z)
- SCTc-TE: A Comprehensive Formulation and Benchmark for Temporal Event Forecasting [63.01035584154509]
We develop a fully automated pipeline and construct a large-scale dataset named MidEast-TE from about 0.6 million news articles.
This dataset focuses on the cooperation and conflict events among countries mainly in the MidEast region from 2015 to 2022.
We propose a novel method LoGo that is able to take advantage of both Local and Global contexts for SCTc-TE forecasting.
arXiv Detail & Related papers (2023-12-02T07:40:21Z)
- CenTime: Event-Conditional Modelling of Censoring in Survival Analysis [49.44664144472712]
We introduce CenTime, a novel approach to survival analysis that directly estimates the time to event.
Our method features an innovative event-conditional censoring mechanism that performs robustly even when uncensored data is scarce.
Our results indicate that CenTime offers state-of-the-art performance in predicting time-to-death while maintaining comparable ranking performance.
arXiv Detail & Related papers (2023-09-07T17:07:33Z)
- Linking Across Data Granularity: Fitting Multivariate Hawkes Processes to Partially Interval-Censored Data [50.63666649894571]
In some applications, timestamps of individual events in some dimensions are unobservable, and only event counts within intervals are known.
In this study, we introduce a novel point process which shares parameter equivalence with the MHP and can effectively model both timestamped and interval-censored data.
We demonstrate the capabilities of the PCMHP using synthetic and real-world datasets.
arXiv Detail & Related papers (2021-11-03T08:25:35Z)
- CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation [107.63407690972139]
Conditional Score-based Diffusion models for Imputation (CSDI) is a novel time series imputation method that utilizes score-based diffusion models conditioned on observed data.
CSDI improves by 40-70% over existing probabilistic imputation methods on popular performance metrics.
In addition, CSDI reduces the error by 5-20% compared to the state-of-the-art deterministic imputation methods.
arXiv Detail & Related papers (2021-07-07T22:20:24Z)
- Temporal Cascade and Structural Modelling of EHRs for Granular Readmission Prediction [10.943928059802174]
We propose a novel model, MEDCAS, to model temporal cascade relationships.
MEDCAS integrates point processes in modelling visit types and time gaps into an attention-based sequence-to-sequence learning model.
Experiments on three real-world EHR datasets have been performed, and the results demonstrate that MEDCAS outperforms state-of-the-art models in both tasks.
arXiv Detail & Related papers (2021-02-04T13:02:04Z)
- MSED: a multi-modal sleep event detection model for clinical sleep analysis [62.997667081978825]
We designed a single deep neural network architecture to jointly detect sleep events in a polysomnogram.
The performance of the model was quantified by F1, precision, and recall scores, and by correlating index values to clinical values.
arXiv Detail & Related papers (2021-01-07T13:08:44Z)
- Clinical Temporal Relation Extraction with Probabilistic Soft Logic Regularization and Global Inference [50.029659413650194]
Existing methods either require expensive feature engineering or are incapable of modeling the global dependencies among the events.
In this paper, we propose a novel method, Clinical Temporal Relation Extraction with Probabilistic Soft Logic Regularization and Global Inference.
arXiv Detail & Related papers (2020-12-16T08:23:03Z)