MIMICause : Defining, identifying and predicting types of causal
relationships between biomedical concepts from clinical notes
- URL: http://arxiv.org/abs/2110.07090v1
- Date: Thu, 14 Oct 2021 00:15:36 GMT
- Title: MIMICause : Defining, identifying and predicting types of causal
relationships between biomedical concepts from clinical notes
- Authors: Vivek Khetan, Md Imbesat Hassan Rizvi, Jessica Huber, Paige Bartusiak,
Bogdan Sacaleanu, Andrew Fano
- Abstract summary: We propose annotation guidelines, develop an annotated corpus and provide baseline scores to identify types and direction of causal relations between a pair of biomedical concepts in clinical notes.
We annotate a total of 2714 de-identified examples sampled from the 2018 n2c2 shared task dataset and train four different language model based architectures.
The high inter-annotator agreement for clinical text shows the quality of our annotation guidelines while the provided baseline F1 score sets the direction for future research towards understanding narratives in clinical texts.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Understanding the causal narratives communicated in clinical notes can help
make strides towards personalized healthcare. In this work, MIMICause, we
propose annotation guidelines, develop an annotated corpus and provide baseline
scores to identify the types and direction of causal relations between a pair of
biomedical concepts in clinical notes, whether communicated implicitly or explicitly,
and whether identified in a single sentence or across multiple sentences.
We annotate a total of 2714 de-identified examples sampled from the 2018 n2c2
shared task dataset and train four different language-model-based
architectures. Annotation based on our guidelines achieved a high
inter-annotator agreement (a Fleiss' kappa score of 0.72), and our model for
identifying causal relations achieved a macro F1 score of 0.56 on the test
data. The high inter-annotator agreement for clinical text shows the quality of
our annotation guidelines while the provided baseline F1 score sets the
direction for future research towards understanding narratives in clinical
texts.
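The two metrics quoted in the abstract, Fleiss' kappa for inter-annotator agreement and macro F1 for classification, can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names, the toy labels and the toy counts below are assumptions made purely for illustration, and the sketch assumes every item was rated by the same number of annotators.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    annotators who assigned item i to category j.

    Assumes every item has the same number of annotators (at least 2) and
    that more than one category is used overall (otherwise 1 - p_e is 0).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Mean observed agreement across items.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Agreement expected by chance, from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in proportions)

    return (p_bar - p_e) / (1 - p_e)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical toy data: 3 annotators, 2 categories, 2 items.
    print(fleiss_kappa([[3, 0], [0, 3]]))
    # Hypothetical gold and predicted labels for 3 examples.
    print(macro_f1(["cause", "cause", "other"], ["cause", "other", "other"]))
```

Macro averaging gives every class equal weight regardless of how often it occurs, so a model that does poorly on rare relation types is penalized more than it would be under a frequency-weighted average.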
Related papers
- MED-COPILOT: A Medical Assistant Powered by GraphRAG and Similar Patient Case Retrieval [12.265116154395434]
We present MED-COPILOT, an interactive clinical decision-support system designed for clinicians and medical trainees. The system builds a structured knowledge graph from WHO and NICE guidelines, applies community-level summarization for efficient retrieval, and maintains a 36,000-case similar-patient database.
arXiv Detail & Related papers (2026-02-28T04:32:03Z) - CNSight: Evaluation of Clinical Note Segmentation Tools [3.673249612734457]
We evaluate rule-based baselines, domain-specific transformer models, and large language models for clinical note segmentation using a curated dataset of 1,000 notes from MIMIC-IV. Our experiments show that large API-based models achieve the best overall performance, with GPT-5-mini reaching a best average F1 of 72.4 across sentence-level and freetext segmentation.
arXiv Detail & Related papers (2025-12-28T05:40:15Z) - Self-Supervised Anatomical Consistency Learning for Vision-Grounded Medical Report Generation [61.350584471060756]
Vision-grounded medical report generation aims to produce clinically accurate descriptions of medical images. We propose Self-Supervised Anatomical Consistency Learning (SS-ACL) to align generated reports with corresponding anatomical regions. SS-ACL constructs a hierarchical anatomical graph inspired by the invariant top-down inclusion structure of human anatomy.
arXiv Detail & Related papers (2025-09-30T08:59:06Z) - Semantic Analysis of SNOMED CT Concept Co-occurrences in Clinical Documentation using MIMIC-IV [0.10499611180329803]
We investigate the relationship between SNOMED CT concept co-occurrence patterns and embedding-based semantic similarity. Our analyses reveal that while co-occurrence and semantic similarity are weakly correlated, embeddings capture clinically meaningful associations.
arXiv Detail & Related papers (2025-09-03T19:25:14Z) - CLI-RAG: A Retrieval-Augmented Framework for Clinically Structured and Context Aware Text Generation with LLMs [0.1578515540930834]
We introduce CLI-RAG (Clinically Informed Retrieval-Augmented Generation), a domain-specific framework for structured and clinically grounded text generation. It incorporates a novel hierarchical chunking strategy that respects clinical document structure and introduces a task-specific dual-stage retrieval mechanism. We apply the system to generate structured progress notes for individual hospital visits using 15 clinical note types from the MIMIC-III dataset.
arXiv Detail & Related papers (2025-07-09T10:13:38Z) - Contrastive Learning with Counterfactual Explanations for Radiology Report Generation [83.30609465252441]
We propose a CounterFactual Explanations-based framework (CoFE) for radiology report generation.
Counterfactual explanations serve as a potent tool for understanding how decisions made by algorithms can be changed by asking "what if" scenarios.
Experiments on two benchmarks demonstrate that leveraging the counterfactual explanations enables CoFE to generate semantically coherent and factually complete reports.
arXiv Detail & Related papers (2024-07-19T17:24:25Z) - RaTEScore: A Metric for Radiology Report Generation [59.37561810438641]
This paper introduces a novel, entity-aware metric named Radiological Report (Text) Evaluation (RaTEScore).
RaTEScore emphasizes crucial medical entities such as diagnostic outcomes and anatomical details, and is robust against complex medical synonyms and sensitive to negation expressions.
Our evaluations demonstrate that RaTEScore aligns more closely with human preference than existing metrics, validated both on established public benchmarks and our newly proposed RaTE-Eval benchmark.
arXiv Detail & Related papers (2024-06-24T17:49:28Z) - Comparing Two Model Designs for Clinical Note Generation; Is an LLM a Useful Evaluator of Consistency? [3.019130210299794]
We analyze two approaches to generate different sections of a SOAP note based on the audio recording of the conversation.
We show that both methods lead to similar ROUGE values and have no difference in terms of the Factuality metric.
arXiv Detail & Related papers (2024-04-09T17:54:10Z) - IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-training [15.04212780946932]
We propose a novel framework named IMITATE to learn the structure information from medical reports with hierarchical vision-language alignment.
The framework derives multi-level visual features from the chest X-ray (CXR) images and separately aligns these features with the descriptive and the conclusive text encoded in the hierarchical medical report.
arXiv Detail & Related papers (2023-10-11T10:12:43Z) - Making the Most Out of the Limited Context Length: Predictive Power
Varies with Clinical Note Type and Note Section [70.37720062263176]
We propose a framework to analyze the sections with high predictive power.
Using MIMIC-III, we show that: 1) predictive power distribution is different between nursing notes and discharge notes and 2) combining different types of notes could improve performance when the context length is large.
arXiv Detail & Related papers (2023-07-13T20:04:05Z) - Development and validation of a natural language processing algorithm to
pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z) - A Meta-Evaluation of Faithfulness Metrics for Long-Form Hospital-Course
Summarization [2.8575516056239576]
Long-form clinical summarization of hospital admissions has real-world significance because of its potential to help both clinicians and patients.
We benchmark faithfulness metrics against fine-grained human annotations for model-generated summaries of a patient's Brief Hospital Course.
arXiv Detail & Related papers (2023-03-07T14:57:06Z) - Classifying Cyber-Risky Clinical Notes by Employing Natural Language
Processing [9.77063694539068]
Recently, some states within the United States of America require patients to have open access to their clinical notes.
This research investigates methods for identifying security/privacy risks within clinical notes.
arXiv Detail & Related papers (2022-03-24T00:36:59Z) - Enriching Unsupervised User Embedding via Medical Concepts [51.17532619610099]
Unsupervised user embedding aims to encode patients into fixed-length vectors without human supervisions.
Medical concepts extracted from the clinical notes contain rich connections between patients and their clinical categories.
We propose a concept-aware unsupervised user embedding that jointly leverages text documents and medical concepts from two clinical corpora.
arXiv Detail & Related papers (2022-03-20T18:54:05Z) - Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z) - The Medical Scribe: Corpus Development and Model Performance Analyses [19.837396601641117]
Motivated by this goal, we developed an annotation scheme to extract relevant clinical concepts.
We used this annotation scheme to label a corpus of about 6k clinical encounters.
This was used to train a state-of-the-art tagging model.
arXiv Detail & Related papers (2020-03-12T03:10:25Z)