Annotating Social Determinants of Health Using Active Learning, and
Characterizing Determinants Using Neural Event Extraction
- URL: http://arxiv.org/abs/2004.05438v2
- Date: Wed, 2 Dec 2020 05:54:50 GMT
- Title: Annotating Social Determinants of Health Using Active Learning, and
Characterizing Determinants Using Neural Event Extraction
- Authors: Kevin Lybarger, Mari Ostendorf, Meliha Yetisgen
- Abstract summary: Social determinants of health (SDOH) affect health outcomes, and knowledge of SDOH can inform clinical decision-making.
This work presents a new corpus with SDOH annotations, a novel active learning framework, and the first extraction results on the new corpus.
- Score: 11.845850292404768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Social determinants of health (SDOH) affect health outcomes, and knowledge of
SDOH can inform clinical decision-making. Automatically extracting SDOH
information from clinical text requires data-driven information extraction
models trained on annotated corpora that are heterogeneous and frequently
include critical SDOH. This work presents a new corpus with SDOH annotations, a
novel active learning framework, and the first extraction results on the new
corpus. The Social History Annotation Corpus (SHAC) includes 4,480 social
history sections with detailed annotation for 12 SDOH characterizing the
status, extent, and temporal information of 18K distinct events. We introduce a
novel active learning framework that selects samples for annotation using a
surrogate text classification task as a proxy for a more complex event
extraction task. The active learning framework successfully increases the
frequency of health risk factors and improves automatic extraction of these
events over undirected annotation. An event extraction model trained on SHAC
achieves high extraction performance for substance use status (0.82-0.93 F1),
employment status (0.81-0.86 F1), and living status type (0.81-0.93 F1) on data
from three institutions.
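The sample-selection idea in the abstract (scoring unlabeled notes with a simpler surrogate classifier, then sending the most informative ones to annotators) can be illustrated with a minimal sketch. It assumes entropy-based uncertainty sampling, a common acquisition strategy that is not necessarily the one used in the paper. The note ids, the three-way label set, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_annotation(
    surrogate_probs: Dict[str, List[float]], batch_size: int
) -> List[str]:
    """Return the ids of the notes whose surrogate predictions are most uncertain."""
    ranked = sorted(
        surrogate_probs,
        key=lambda note_id: entropy(surrogate_probs[note_id]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:batch_size]


# Hypothetical surrogate-classifier outputs per note, e.g. for a
# substance-use status label: P(none), P(current), P(past).
pool = {
    "note_a": [0.98, 0.01, 0.01],  # confident prediction, low entropy
    "note_b": [0.40, 0.35, 0.25],  # near-uniform prediction, high entropy
    "note_c": [0.70, 0.20, 0.10],
}
selected = select_for_annotation(pool, batch_size=2)
print(selected)
```

In this sketch the surrogate classifier stands in for the full event extractor only when ranking notes; the selected notes would still receive the full event annotation.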
Related papers
- Extracting Social Determinants of Health from Pediatric Patient Notes Using Large Language Models: Novel Corpus and Methods [17.83326146480516]
Social determinants of health (SDoH) play a critical role in shaping health outcomes.
We present a novel annotated corpus, the Pediatric Social History Corpus (PedSHAC).
We evaluate the automatic extraction of detailed SDoH representations using fine-tuned and in-context learning methods.
arXiv Detail & Related papers (2024-03-31T23:37:18Z)
- Injecting linguistic knowledge into BERT for Dialogue State Tracking [60.42231674887294]
This paper proposes a method that extracts linguistic knowledge via an unsupervised framework.
We then utilize this knowledge to augment BERT's performance and interpretability in Dialogue State Tracking (DST) tasks.
We benchmark this framework on various DST tasks and observe a notable improvement in accuracy.
arXiv Detail & Related papers (2023-11-27T08:38:42Z)
- Prompt-based Extraction of Social Determinants of Health Using Few-shot Learning [3.418600863629033]
Social determinants of health (SDOH) documented in the electronic health record are being studied to understand how SDOH impacts patient health outcomes.
In this work, we utilize the Social History Annotation Corpus (SHAC), a multi-institutional corpus of de-identified social history sections annotated for SDOH, including substance use, employment, and living status information.
We explore the automatic extraction of SDOH information with SHAC in both standoff and inline annotation formats using GPT-4 in a one-shot prompting setting.
Our prompt-based GPT-4 method achieved an overall 0.652 F1 on the SHAC test set.
arXiv Detail & Related papers (2023-06-12T15:08:25Z)
- Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z)
- Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain.
We annotated a corpus of clinical documents according to 12 types of identifying entities.
We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z)
- The 2022 n2c2/UW Shared Task on Extracting Social Determinants of Health [0.9023847175654602]
The n2c2/UW SDOH Challenge explores the extraction of social determinant of health (SDOH) information from clinical notes.
This paper presents the shared task, data, participating teams, performance results, and considerations for future work.
arXiv Detail & Related papers (2023-01-13T14:20:23Z)
- A Marker-based Neural Network System for Extracting Social Determinants of Health [12.6970199179668]
The impact of social determinants of health (SDoH) on patients' healthcare quality and on health disparities is well known.
Many SDoH items are not coded in structured forms in electronic health records.
We explore a multi-stage pipeline involving named entity recognition (NER), relation classification (RC), and text classification methods to extract SDoH information from clinical notes automatically.
arXiv Detail & Related papers (2022-12-24T18:40:23Z)
- Leveraging Natural Language Processing to Augment Structured Social Determinants of Health Data in the Electronic Health Record [1.7812428873698403]
Social determinants of health (SDOH) impact health outcomes.
Clinical notes often contain more comprehensive SDOH information.
We developed a novel SDOH extractor using a deep learning entity and relation extraction architecture.
arXiv Detail & Related papers (2022-12-14T22:51:49Z)
- ALLSH: Active Learning Guided by Local Sensitivity and Hardness [98.61023158378407]
We propose to retrieve unlabeled samples with a local sensitivity and hardness-aware acquisition function.
Our method achieves consistent gains over the commonly used active learning strategies in various classification tasks.
arXiv Detail & Related papers (2022-05-10T15:39:11Z)
- LifeLonger: A Benchmark for Continual Disease Classification [59.13735398630546]
We introduce LifeLonger, a benchmark for continual disease classification on the MedMNIST collection.
Task and class incremental learning of diseases address the issue of classifying new samples without re-training the models from scratch.
Cross-domain incremental learning addresses the issue of dealing with datasets originating from different institutions while retaining the previously obtained knowledge.
arXiv Detail & Related papers (2022-04-12T12:25:05Z)
- Confident Coreset for Active Learning in Medical Image Analysis [57.436224561482966]
We propose a novel active learning method, confident coreset, which considers both uncertainty and distribution for effectively selecting informative samples.
By comparative experiments on two medical image analysis tasks, we show that our method outperforms other active learning methods.
arXiv Detail & Related papers (2020-04-05T13:46:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.