KU AIGEN ICL EDI@BC8 Track 3: Advancing Phenotype Named Entity Recognition and Normalization for Dysmorphology Physical Examination Reports
- URL: http://arxiv.org/abs/2501.09744v1
- Date: Thu, 16 Jan 2025 18:53:32 GMT
- Title: KU AIGEN ICL EDI@BC8 Track 3: Advancing Phenotype Named Entity Recognition and Normalization for Dysmorphology Physical Examination Reports
- Authors: Hajung Kim, Chanhwi Kim, Jiwoong Sohn, Tim Beck, Marek Rei, Sunkyu Kim, T Ian Simpson, Joram M Posma, Antoine Lain, Mujeen Sung, Jaewoo Kang
- Abstract summary: The objective of BioCreative8 Track 3 is to extract phenotypic key medical findings embedded within EHR texts and normalize these findings to Human Phenotype Ontology terms. The presence of diverse surface forms in phenotypic findings makes it challenging to accurately normalize them to the correct HPO terms. Our pipeline resulted in an exact extraction and normalization F1 score 2.6% higher than the mean score of all submissions received in response to the challenge.
- Score: 20.19611327520341
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The objective of BioCreative8 Track 3 is to extract phenotypic key medical findings embedded within EHR texts and subsequently normalize these findings to their Human Phenotype Ontology (HPO) terms. However, the presence of diverse surface forms in phenotypic findings makes it challenging to accurately normalize them to the correct HPO terms. To address this challenge, we explored various models for named entity recognition and implemented data augmentation techniques such as synonym marginalization to enhance the normalization step. Our pipeline resulted in an exact extraction and normalization F1 score 2.6% higher than the mean score of all submissions received in response to the challenge. Furthermore, in terms of the normalization F1 score, our approach surpassed the average performance by 1.9%. These findings contribute to the advancement of automated medical data extraction and normalization techniques, showcasing potential pathways for future research and application in the biomedical domain.
Related papers
- EVA: Towards a universal model of the immune system [0.18149976637753015]
We introduce EVA, the first cross-species, multimodal foundation model of immunology and inflammation.
EVA harmonizes transcriptomics data across species, platforms, and resolutions, and integrates histology data to produce rich, unified patient representations.
We introduce a comprehensive evaluation suite of 39 tasks spanning the drug development pipeline.
arXiv Detail & Related papers (2026-02-10T13:51:09Z)
- Investigating the Impact of Histopathological Foundation Models on Regressive Prediction of Homologous Recombination Deficiency [52.50039435394964]
We systematically evaluate foundation models for regression-based tasks.
We extract patch-level features from whole slide images (WSI) using five state-of-the-art foundation models.
Models are trained to predict continuous HRD scores based on these extracted features across breast, endometrial, and lung cancer cohorts.
arXiv Detail & Related papers (2026-01-29T14:06:50Z)
- Enhancing Biomedical Named Entity Recognition using GLiNER-BioMed with Targeted Dictionary-Based Post-processing for BioASQ 2025 task 6 [0.0]
This study evaluates the GLiNER-BioMed model on a BioASQ dataset.
We introduce a targeted dictionary-based post-processing strategy to address common misclassifications.
This work highlights the potential of dictionary-based refinement for pre-trained BioNER models but underscores the critical challenge of overfitting to development data.
arXiv Detail & Related papers (2025-10-03T18:35:04Z)
- MedSeqFT: Sequential Fine-tuning Foundation Models for 3D Medical Image Segmentation [55.37355146924576]
MedSeqFT is a sequential fine-tuning framework for medical image analysis.
It adapts pre-trained models to new tasks while refining their representational capacity.
It consistently outperforms state-of-the-art fine-tuning strategies.
arXiv Detail & Related papers (2025-09-07T15:22:53Z)
- AdaFusion: Prompt-Guided Inference with Adaptive Fusion of Pathology Foundation Models [49.550545038402184]
We propose AdaFusion, a novel prompt-guided inference framework.
Our method compresses and aligns tile-level features from diverse models.
AdaFusion consistently surpasses individual PFMs across both classification and regression tasks.
arXiv Detail & Related papers (2025-08-07T07:09:31Z)
- MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection [30.77558600436759]
Anomaly detection is a crucial task in computer vision, yet collecting real-world defect images is inherently difficult.
We introduce a novel pipeline that generates synthetic anomalies through Math-Physics model guidance.
By incorporating physical modeling of cracks, corrosion, and deformation, our method produces realistic defect masks.
arXiv Detail & Related papers (2025-04-17T14:22:27Z)
- PO3AD: Predicting Point Offsets toward Better 3D Point Cloud Anomaly Detection [26.125202166476875]
Point cloud anomaly detection under the anomaly-free setting poses significant challenges.
We introduce an innovative approach that emphasizes learning point offsets, targeting more informative pseudo-abnormal points.
Our proposed method outperforms existing state-of-the-art approaches, achieving an average enhancement of 9.0% and 1.4% in the AUC-ROC detection metric.
arXiv Detail & Related papers (2024-12-17T07:30:09Z)
- Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation : A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked-out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the one-class classification (OCC) setting.
Our method performs on par with other existing state-of-the-art PA-generation and reconstruction-based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z)
- Deep learning in computed tomography pulmonary angiography imaging: a dual-pronged approach for pulmonary embolism detection [0.0]
The aim of this study is to leverage deep learning techniques to enhance the Computer Assisted Diagnosis (CAD) of Pulmonary Embolism (PE).
Our classification system includes an Attention-Guided Convolutional Neural Network (AG-CNN) that uses local context by employing an attention mechanism.
AG-CNN performs robustly on the FUMPE dataset, achieving an AUROC of 0.927, sensitivity of 0.862, specificity of 0.879, and an F1-score of 0.805 with the Inception-v3 backbone architecture.
arXiv Detail & Related papers (2023-11-09T08:23:44Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- An evaluation of GPT models for phenotype concept recognition [0.4715973318447338]
We examine the performance of the latest Generative Pre-trained Transformer (GPT) models for clinical phenotyping and phenotype annotation.
Our results show that, with an appropriate setup, these models can achieve state-of-the-art performance.
arXiv Detail & Related papers (2023-09-29T12:06:55Z)
- Genetic InfoMax: Exploring Mutual Information Maximization in High-Dimensional Imaging Genetics Studies [50.11449968854487]
Genome-wide association studies (GWAS) are used to identify relationships between genetic variations and specific traits.
Representation learning for imaging genetics is largely under-explored due to the unique challenges posed by GWAS.
We introduce a trans-modal learning framework Genetic InfoMax (GIM) to address the specific challenges of GWAS.
arXiv Detail & Related papers (2023-09-26T03:59:21Z)
- Achieving state-of-the-art performance in the Medical Out-of-Distribution (MOOD) challenge using plausible synthetic anomalies [0.5677301320664404]
Unsupervised anomaly detection, or Out-of-Distribution detection, aims at identifying anomalous samples.
Our method builds upon the self-supervised strategy of training a segmentation network to identify local synthetic anomalies.
Our contributions improve the synthetic anomaly generation process, making synthetic anomalies more heterogeneous.
arXiv Detail & Related papers (2023-08-02T20:16:13Z)
- BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address limitations due to its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z)
- Generative models improve fairness of medical classifiers under distribution shifts [49.10233060774818]
We show that learning realistic augmentations automatically from data is possible in a label-efficient manner using generative models.
We demonstrate that these learned augmentations can surpass heuristic ones by making models more robust and statistically fair in- and out-of-distribution.
arXiv Detail & Related papers (2023-04-18T18:15:38Z)
- Unsupervised EHR-based Phenotyping via Matrix and Tensor Decompositions [0.6875312133832078]
We provide a comprehensive review of low-rank approximation-based approaches for computational phenotyping.
Recent developments have adapted low-rank data approximation methods by incorporating different constraints and regularizations that further facilitate interpretability.
arXiv Detail & Related papers (2022-09-01T09:47:27Z)
- Improved Drug-target Interaction Prediction with Intermolecular Graph Transformer [98.8319016075089]
We propose a novel approach to model intermolecular information with a three-way Transformer-based architecture.
Intermolecular Graph Transformer (IGT) outperforms the second-best state-of-the-art approach by 9.1% and 20.5% on binding activity and binding pose prediction, respectively.
IGT exhibits promising drug screening ability against SARS-CoV-2 by identifying 83.1% of active drugs that have been validated by wet-lab experiments with near-native predicted binding poses.
arXiv Detail & Related papers (2021-10-14T13:28:02Z)
- Interpreting Deep Learning Models for Epileptic Seizure Detection on EEG signals [4.748221780751802]
Deep Learning (DL) is often considered the state of the art for Artificial Intelligence-based medical decision support.
It remains sparsely implemented in clinical practice and poorly trusted by clinicians due to insufficient interpretability of neural network models.
We have tackled this issue by developing interpretable DL models for online detection of epileptic seizures based on EEG signals.
arXiv Detail & Related papers (2020-12-22T11:10:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.