The 2022 n2c2/UW Shared Task on Extracting Social Determinants of Health
- URL: http://arxiv.org/abs/2301.05571v1
- Date: Fri, 13 Jan 2023 14:20:23 GMT
- Title: The 2022 n2c2/UW Shared Task on Extracting Social Determinants of Health
- Authors: Kevin Lybarger, Meliha Yetisgen, Özlem Uzuner
- Abstract summary: The n2c2/UW SDOH Challenge explores the extraction of social determinants of health (SDOH) information from clinical notes.
This paper presents the shared task, data, participating teams, performance results, and considerations for future work.
- Score: 0.9023847175654602
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Objective: The n2c2/UW SDOH Challenge explores the extraction of social
determinants of health (SDOH) information from clinical notes. The objectives
include the advancement of natural language processing (NLP) information
extraction techniques for SDOH and clinical information more broadly. This
paper presents the shared task, data, participating teams, performance results,
and considerations for future work.
Materials and Methods: The task used the Social History Annotated Corpus
(SHAC), which consists of clinical text with detailed event-based annotations
for SDOH events such as alcohol, drug, tobacco, employment, and living
situation. Each SDOH event is characterized through attributes related to
status, extent, and temporality. The task includes three subtasks related to
information extraction (Subtask A), generalizability (Subtask B), and learning
transfer (Subtask C). In addressing this task, participants utilized a range of
techniques, including rules, knowledge bases, n-grams, word embeddings, and
pretrained language models (LMs).
Results: A total of 15 teams participated, and the top teams utilized
pretrained deep learning LMs. The top team across all subtasks used a
sequence-to-sequence approach, achieving 0.901 F1 for Subtask A, 0.774 F1 for
Subtask B, and 0.889 F1 for Subtask C.
Conclusions: Similar to many NLP tasks and domains, pretrained LMs yielded the
best performance, including generalizability and learning transfer. An error
analysis indicates extraction performance varies by SDOH, with lower
performance achieved for conditions, like substance use and homelessness, that
increase health risks (risk factors) and higher performance achieved for
conditions, like substance abstinence and living with family, that reduce
health risks (protective factors).
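To make the event-based annotation scheme concrete, the sketch below shows one way an SDOH event from SHAC might be represented in Python. The class and field names (Span, Attribute, SDOHEvent, trigger, etc.) are illustrative assumptions rather than the official SHAC schema; only the event types (alcohol, drug, tobacco, employment, living situation) and attribute categories (status, extent, temporality) are taken from the abstract above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative sketch only: class and field names are assumptions,
# not the official SHAC annotation schema.

@dataclass
class Span:
    """A character-offset span in the clinical note text."""
    start: int
    end: int
    text: str

@dataclass
class Attribute:
    """One attribute of an SDOH event (status, extent, or temporality)."""
    category: str                 # e.g. "status", "extent", "temporality"
    value: str                    # normalized label or supporting span text
    span: Optional[Span] = None   # evidence span, if annotated

@dataclass
class SDOHEvent:
    """An event-based annotation: a trigger span plus attribute arguments."""
    event_type: str               # alcohol, drug, tobacco, employment, living situation
    trigger: Span                 # text anchoring the event
    attributes: List[Attribute] = field(default_factory=list)

# Example note fragment: "Patient quit smoking 10 years ago."
note = "Patient quit smoking 10 years ago."
event = SDOHEvent(
    event_type="tobacco",
    trigger=Span(13, 20, note[13:20]),                              # "smoking"
    attributes=[
        Attribute("status", "past", Span(8, 12, note[8:12])),       # "quit"
        Attribute("temporality", "10 years ago", Span(21, 33, note[21:33])),
    ],
)
print(event.event_type, [a.value for a in event.attributes])
```

Under this kind of representation, Subtask A amounts to predicting the full set of SDOHEvent objects for a note, while Subtasks B and C evaluate how well such extractors generalize and transfer to new data.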
Related papers
- D-NLP at SemEval-2024 Task 2: Evaluating Clinical Inference Capabilities of Large Language Models [5.439020425819001]
Large language models (LLMs) have garnered significant attention and widespread usage due to their impressive performance in various tasks.
However, they are not without their own set of challenges, including issues such as hallucinations, factual inconsistencies, and limitations in numerical-quantitative reasoning.
arXiv Detail & Related papers (2024-05-07T10:11:14Z) - Data-CUBE: Data Curriculum for Instruction-based Sentence Representation
Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z) - Task formulation for Extracting Social Determinants of Health from
Clinical Narratives [0.0]
The 2022 n2c2 NLP Challenge posed the task of identifying social determinants of health in clinical narratives.
We present three systems that we developed for the Challenge and discuss the distinctive task formulation used in each of the three systems.
arXiv Detail & Related papers (2023-01-26T20:00:54Z) - "It's a Match!" -- A Benchmark of Task Affinity Scores for Joint
Learning [74.14961250042629]
While the promises of Multi-Task Learning (MTL) are attractive, characterizing the conditions of its success is still an open problem in Deep Learning.
Estimating task affinity for joint learning is a key endeavor.
Recent work suggests that the training conditions themselves have a significant impact on the outcomes of MTL.
Yet, the literature lacks a benchmark to assess the effectiveness of task affinity estimation techniques.
arXiv Detail & Related papers (2023-01-07T15:16:35Z) - A Marker-based Neural Network System for Extracting Social Determinants
of Health [12.6970199179668]
The impact of social determinants of health (SDoH) on patients' healthcare quality and disparity is well known.
Many SDoH items are not coded in structured forms in electronic health records.
We explore a multi-stage pipeline involving named entity recognition (NER), relation classification (RC), and text classification methods to extract SDoH information from clinical notes automatically.
arXiv Detail & Related papers (2022-12-24T18:40:23Z) - KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in
Few-Shot NLP [68.43279384561352]
Existing data augmentation algorithms leverage task-independent rules or fine-tune general-purpose pre-trained language models.
These methods have trivial task-specific knowledge and are limited to yielding low-quality synthetic data for weak baselines in simple tasks.
We propose the Knowledge Mixture Data Augmentation Model (KnowDA): an encoder-decoder LM pretrained on a mixture of diverse NLP tasks.
arXiv Detail & Related papers (2022-06-21T11:34:02Z) - Zero-Shot Information Extraction as a Unified Text-to-Triple Translation [56.01830747416606]
We cast a suite of information extraction tasks into a text-to-triple translation framework.
We formalize the task as a translation between task-specific input text and output triples.
We study the zero-shot performance of this framework on open information extraction.
arXiv Detail & Related papers (2021-09-23T06:54:19Z) - Distribution Matching for Heterogeneous Multi-Task Learning: a
Large-scale Face Study [75.42182503265056]
Multi-Task Learning has emerged as a methodology in which multiple tasks are jointly learned by a shared learning algorithm.
We deal with heterogeneous MTL, simultaneously addressing detection, classification & regression problems.
We build FaceBehaviorNet, the first framework for large-scale face analysis, by jointly learning all facial behavior tasks.
arXiv Detail & Related papers (2021-05-08T22:26:52Z) - Annotating Social Determinants of Health Using Active Learning, and
Characterizing Determinants Using Neural Event Extraction [11.845850292404768]
Social determinants of health (SDOH) affect health outcomes, and knowledge of SDOH can inform clinical decision-making.
This work presents a new corpus with SDOH annotations, a novel active learning framework, and the first extraction results on the new corpus.
arXiv Detail & Related papers (2020-04-11T16:19:02Z) - oLMpics -- On what Language Model Pre-training Captures [84.60594612120173]
We propose eight reasoning tasks, which require operations such as comparison, conjunction, and composition.
A fundamental challenge is to understand whether the performance of an LM on a task should be attributed to the pre-trained representations or to the process of fine-tuning on the task data.
arXiv Detail & Related papers (2019-12-31T12:11:35Z)