Unsupervised Discovery of Clinical Disease Signatures Using
Probabilistic Independence
- URL: http://arxiv.org/abs/2402.05802v1
- Date: Thu, 8 Feb 2024 16:41:03 GMT
- Title: Unsupervised Discovery of Clinical Disease Signatures Using
Probabilistic Independence
- Authors: Thomas A. Lasko, John M. Still, Thomas Z. Li, Marco Barbero Mota,
William W. Stead, Eric V. Strobl, Bennett A. Landman, Fabien Maldonado
- Abstract summary: Insufficiently precise diagnosis of clinical disease is likely responsible for many treatment failures.
We present an approach to learning these patterns by using probabilistic independence to disentangle the imprint on the medical record of causal latent sources of disease.
- Score: 8.52372042610759
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Insufficiently precise diagnosis of clinical disease is likely responsible
for many treatment failures, even for common conditions and treatments. With a
large enough dataset, it may be possible to use unsupervised machine learning
to define clinical disease patterns more precisely. We present an approach to
learning these patterns by using probabilistic independence to disentangle the
imprint on the medical record of causal latent sources of disease. We inferred
a broad set of 2000 clinical signatures of latent sources from 9195 variables
in 269,099 Electronic Health Records. The learned signatures produced better
discrimination than the original variables in a lung cancer prediction task
unknown to the inference algorithm, predicting 3-year malignancy in patients
with no history of cancer before a solitary lung nodule was discovered. More
importantly, the signatures' greater explanatory power identified pre-nodule
signatures of apparently undiagnosed cancer in many of those patients.
Related papers
- RareAlert: Aligning heterogeneous large language model reasoning for early rare disease risk screening [19.93227904357489]
We present RareAlert, an early screening system that predicts patient-level rare disease risk from routinely available primary-visit information.
RareAlert integrates reasoning generated by ten LLMs, calibrates and weights these signals using machine learning, and distils the aligned reasoning into a single locally deployable model.
arXiv Detail & Related papers (2026-01-26T04:27:16Z)
- An Explainable Hybrid AI Framework for Enhanced Tuberculosis and Symptom Detection [55.35661671061754]
Tuberculosis remains a critical global health issue, particularly in resource-limited and remote areas.
We propose a framework which enhances disease and symptom detection on chest X-rays by integrating two supervised heads and a self-supervised head.
Our model achieves an accuracy of 98.85% for distinguishing between COVID-19, tuberculosis, and normal cases, and a macro-F1 score of 90.09% for multilabel symptom detection.
arXiv Detail & Related papers (2025-10-21T17:18:55Z)
- A Weakly Supervised Transformer to Support Rare Disease Diagnosis from Electronic Health Records: Methods and Applications in Rare Pulmonary Disease [16.112294460618955]
Rare diseases affect an estimated 300-400 million people worldwide.
Computational phenotyping algorithms show promise for rare disease detection.
We propose a weakly supervised, transformer-based framework that combines a small set of gold-standard labels with a large volume of iteratively updated silver-standard labels.
arXiv Detail & Related papers (2025-07-01T23:11:20Z)
- An Agentic System for Rare Disease Diagnosis with Traceable Reasoning [58.78045864541539]
We introduce DeepRare, the first rare disease diagnosis agentic system powered by a large language model (LLM).
DeepRare generates ranked diagnostic hypotheses for rare diseases, each accompanied by a transparent chain of reasoning.
The system demonstrates exceptional diagnostic performance among 2,919 diseases, achieving 100% accuracy for 1013 diseases.
arXiv Detail & Related papers (2025-06-25T13:42:26Z)
- Intercept Cancer: Cancer Pre-Screening with Large Scale Healthcare Foundation Models [17.739985240125733]
We present CATCH-FM, CATch Cancer early with Healthcare Foundation Models.
CATCH-FM identifies high-risk patients for further screening based on historical medical records.
Our analysis demonstrates the robustness of CATCH-FM in various patient distributions.
arXiv Detail & Related papers (2025-05-30T20:31:09Z)
- ColonScopeX: Leveraging Explainable Expert Systems with Multimodal Data for Improved Early Diagnosis of Colorectal Cancer [3.541280502270993]
Colorectal cancer (CRC) ranks as the second leading cause of cancer-related deaths and the third most prevalent malignant tumour worldwide.
Early detection of CRC remains problematic due to its non-specific and often embarrassing symptoms.
We propose ColonScopeX, a machine learning framework utilizing explainable AI (XAI) methodologies to enhance the early detection of CRC and pre-cancerous lesions.
arXiv Detail & Related papers (2025-04-09T20:45:11Z)
- A Knowledge-enhanced Pathology Vision-language Foundation Model for Cancer Diagnosis [58.85247337449624]
We propose a knowledge-enhanced vision-language pre-training approach that integrates disease knowledge into the alignment within hierarchical semantic groups.
KEEP achieves state-of-the-art performance in zero-shot cancer diagnostic tasks.
arXiv Detail & Related papers (2024-12-17T17:45:21Z)
- A Lung Nodule Dataset with Histopathology-based Cancer Type Annotation [12.617587827105496]
This research aims to bridge the gap by providing publicly accessible datasets and reliable tools for medical diagnosis.
We curated a diverse dataset of lung Computed Tomography (CT) images, comprising 330 annotated nodules (nodules are labeled as bounding boxes) from 95 distinct patients.
These promising results demonstrate that the dataset has a feasible application and further facilitate intelligent auxiliary diagnosis.
arXiv Detail & Related papers (2024-06-26T06:39:11Z)
- Boosting Medical Image-based Cancer Detection via Text-guided Supervision from Reports [68.39938936308023]
We propose a novel text-guided learning method to achieve highly accurate cancer detection results.
Our approach can leverage clinical knowledge by large-scale pre-trained VLM to enhance generalization ability.
arXiv Detail & Related papers (2024-05-23T07:03:38Z)
- Expert Uncertainty and Severity Aware Chest X-Ray Classification by Multi-Relationship Graph Learning [48.29204631769816]
We re-extract disease labels from CXR reports to make them more realistic by considering disease severity and uncertainty in classification.
Our experimental results show that models considering disease severity and uncertainty outperform previous state-of-the-art methods.
arXiv Detail & Related papers (2023-09-06T19:19:41Z)
- Diagnosis Uncertain Models For Medical Risk Prediction [80.07192791931533]
We consider a patient risk model which has access to vital signs, lab values, and prior history but does not have access to a patient's diagnosis.
We show that such 'all-cause' risk models have good generalization across diagnoses but have a predictable failure mode.
We propose a fix for this problem by explicitly modeling the uncertainty in risk prediction coming from uncertainty in patient diagnoses.
arXiv Detail & Related papers (2023-06-29T23:36:04Z)
- Deep Reinforcement Learning Framework for Thoracic Diseases Classification via Prior Knowledge Guidance [49.87607548975686]
The scarcity of labeled data for related diseases poses a huge challenge to an accurate diagnosis.
We propose a novel deep reinforcement learning framework, which introduces prior knowledge to direct the learning of diagnostic agents.
Our approach's performance was demonstrated using the well-known NIH ChestX-ray 14 and CheXpert datasets.
arXiv Detail & Related papers (2023-06-02T01:46:31Z)
- Analyzing historical diagnosis code data from NIH N3C and RECOVER Programs using deep learning to determine risk factors for Long Covid [0.5058404769410755]
Post-acute sequelae of SARS-CoV-2 infection (PASC) or Long COVID is an emerging medical condition.
We propose an interpretable deep learning approach to analyze historical diagnosis code data from the National COVID Cohort Collective.
arXiv Detail & Related papers (2022-10-05T18:10:01Z)
- Towards Reliable and Explainable AI Model for Solid Pulmonary Nodule Diagnosis [20.510918720980467]
Lung cancer has the highest mortality rate of deadly cancers in the world.
Computer-aided diagnosis (CAD) systems have been developed to assist radiologists in nodule detection and diagnosis.
Lack of model reliability and interpretability remains a major obstacle for its large-scale clinical application.
arXiv Detail & Related papers (2022-04-08T08:21:00Z)
- Intelligent Sight and Sound: A Chronic Cancer Pain Dataset [74.77784420691937]
This paper introduces the first chronic cancer pain dataset, collected as part of the Intelligent Sight and Sound (ISS) clinical trial.
The data collected to date consists of 29 patients, 509 smartphone videos, 189,999 frames, and self-reported affective and activity pain scores.
Using static images and multi-modal data to predict self-reported pain levels, early models show significant gaps between current methods available to predict pain.
arXiv Detail & Related papers (2022-04-07T22:14:37Z)
- Inheritance-guided Hierarchical Assignment for Clinical Automatic Diagnosis [50.15205065710629]
Clinical diagnosis, which aims to assign diagnosis codes for a patient based on the clinical note, plays an essential role in clinical decision-making.
We propose a novel framework to combine the inheritance-guided hierarchical assignment and co-occurrence graph propagation for clinical automatic diagnosis.
arXiv Detail & Related papers (2021-01-27T13:16:51Z)
- Handling uncertainty using features from pathology: opportunities in primary care data for developing high risk cancer survival methods [0.10499611180329804]
More than 144 000 Australians were diagnosed with cancer in 2019.
The majority will first present to their GP symptomatically, even for cancer for which screening programs exist.
We investigate how past pathology test results can lead to deriving features that can be used to predict cancer outcomes.
arXiv Detail & Related papers (2020-12-17T23:27:13Z)