Related papers: Comprehensive and user-analytics-friendly cancer patient database for physicians and researchers

URL: http://arxiv.org/abs/2302.01337v1
Date: Wed, 1 Feb 2023 20:10:06 GMT
Title: Comprehensive and user-analytics-friendly cancer patient database for physicians and researchers
Authors: Ali Firooz, Avery T. Funkhouser, Julie C. Martin, W. Jeffery Edenfield, Homayoun Valafar, and Anna V. Blenda
Abstract summary: A relational database has been developed integrating the status of cancer-critical gene mutations, serum galectin profiles, and serum and tumor glycomic profiles. Our project provides a framework for an integrated, interactive, and growing database to analyze molecular and clinical patterns across cancer stages and subtypes.
Score: 0.18472148461613155
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Nuanced cancer patient care is needed, as the development and clinical course of cancer are multifactorial, with influences from the general health status of the patient, germline and neoplastic mutations, co-morbidities, and environment. To effectively tailor an individualized treatment to each patient, such multifactorial data must be presented to providers in an easy-to-access and easy-to-analyze fashion. To address this need, a relational database has been developed integrating the status of cancer-critical gene mutations, serum galectin profiles, and serum and tumor glycomic profiles with clinical, demographic, and lifestyle data points of individual cancer patients. The database, as a backend, provides physicians and researchers with a single, easily accessible repository of cancer profiling data to aid and enhance individualized treatment. Our interactive database allows care providers to amalgamate cohorts from these groups to find correlations between different data types, with the possibility of finding "molecular signatures" based upon a combination of genetic mutations, galectin serum levels, glycan compositions, and patient clinical data and lifestyle choices. Our project provides a framework for an integrated, interactive, and growing database to analyze molecular and clinical patterns across cancer stages and subtypes and provides opportunities for increased diagnostic and prognostic power.
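
The abstract describes a relational backend that joins mutation status, galectin levels, and clinical data so that cohorts can be compared. The sketch below illustrates that general idea only. It is not the authors' schema: every table name, column name, and function name is a hypothetical choice, the rows are synthetic placeholders rather than patient data, and SQLite is used purely for convenience.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative
# assumptions, not the schema of the database described in the paper.
SCHEMA = """
CREATE TABLE patient (
    patient_id  INTEGER PRIMARY KEY,
    cancer_type TEXT NOT NULL,
    stage       TEXT,
    smoker      INTEGER
);
CREATE TABLE gene_mutation (
    patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
    gene       TEXT NOT NULL,
    mutated    INTEGER NOT NULL,
    PRIMARY KEY (patient_id, gene)
);
CREATE TABLE galectin_level (
    patient_id      INTEGER NOT NULL REFERENCES patient(patient_id),
    galectin        TEXT NOT NULL,
    serum_ng_per_ml REAL NOT NULL,
    PRIMARY KEY (patient_id, galectin)
);
"""


def build_demo_db():
    """Create an in-memory database filled with synthetic placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO patient VALUES (?, ?, ?, ?)",
        [(1, "breast", "II", 0), (2, "breast", "III", 1), (3, "colon", "II", 0)],
    )
    conn.executemany(
        "INSERT INTO gene_mutation VALUES (?, ?, ?)",
        [(1, "TP53", 1), (2, "TP53", 0), (3, "TP53", 1)],
    )
    conn.executemany(
        "INSERT INTO galectin_level VALUES (?, ?, ?)",
        [(1, "galectin-3", 10.0), (2, "galectin-3", 4.0), (3, "galectin-3", 6.0)],
    )
    conn.commit()
    return conn


def mean_galectin_by_mutation(conn, gene, galectin):
    """Return the mean serum level of one galectin, split by mutation status."""
    rows = conn.execute(
        """
        SELECT m.mutated, AVG(g.serum_ng_per_ml)
        FROM gene_mutation AS m
        JOIN galectin_level AS g ON g.patient_id = m.patient_id
        WHERE m.gene = ? AND g.galectin = ?
        GROUP BY m.mutated
        ORDER BY m.mutated
        """,
        (gene, galectin),
    ).fetchall()
    # Keys are True (mutated) / False (not mutated); values are cohort means.
    return {bool(mutated): mean for mutated, mean in rows}
```

Calling `mean_galectin_by_mutation(build_demo_db(), "TP53", "galectin-3")` groups the synthetic patients into mutated and non-mutated cohorts and averages their serum values. Keeping each data type in its own table keyed by `patient_id` is one way to let new profile types be added without altering the existing tables.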

Related papers

Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using Large Language Models [70.64969663547703]
AdaCVD is an adaptable CVD risk prediction framework built on large language models extensively fine-tuned on over half a million participants from the UK Biobank. It addresses key clinical challenges across three dimensions: it flexibly incorporates comprehensive yet variable patient information; it seamlessly integrates both structured data and unstructured text; and it rapidly adapts to new patient populations using minimal additional data.
arXiv Detail & Related papers (2025-05-30T14:42:02Z)
Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates. Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information. Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals. Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z)
Multivariate Feature Selection and Autoencoder Embeddings of Ovarian Cancer Clinical and Genetic Data [2.973561339858947]
This study explores a data-driven approach to discovering novel clinical and genetic markers in ovarian cancer (OC). In the autoencoder analysis, a clearer pattern emerged when using clinical features and the combination of clinical and genetic data. Key clinical variables (such as type of surgery and neoadjuvant chemotherapy) and certain gene mutations showed strong relevance, along with low-risk genetic factors.
arXiv Detail & Related papers (2025-01-27T09:07:07Z)
Multi-Omic and Quantum Machine Learning Integration for Lung Subtypes Classification [0.0]
The fusion of quantum computing and machine learning holds promise for unraveling complex patterns within multi-omics datasets. We developed a method for finding the best differentiating features between LUAD and LUSC datasets, which has the potential for biomarker discovery.
arXiv Detail & Related papers (2024-10-02T23:16:31Z)
Towards AI-Based Precision Oncology: A Machine Learning Framework for Personalized Counterfactual Treatment Suggestions based on Multi-Omics Data [0.05025737475817938]
We propose a modular machine learning framework designed for personalized counterfactual cancer treatment suggestions. The framework is tailored to address critical challenges inherent in data-driven cancer research. Our method aims to empower clinicians with a reality-centric decision-support tool.
arXiv Detail & Related papers (2024-02-19T14:54:20Z)
Unlocking the Power of Multi-institutional Data: Integrating and Harmonizing Genomic Data Across Institutions [3.5489676012585236]
We introduce the Bridge model to derive integrated features to preserve information beyond common genes. The model consistently excels in predicting patient survival across six cancer types in GENIE BPC data.
arXiv Detail & Related papers (2024-01-30T23:25:05Z)
Cancer-Net PCa-Data: An Open-Source Benchmark Dataset for Prostate Cancer Clinical Decision Support using Synthetic Correlated Diffusion Imaging Data [75.77035221531261]
Cancer-Net PCa-Data is an open-source benchmark dataset of volumetric CDI$^s$ imaging data of PCa patients. Cancer-Net PCa-Data is the first-ever public dataset of CDI$^s$ imaging data for PCa.
arXiv Detail & Related papers (2023-11-20T10:28:52Z)
Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM). Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
A Personalized Diagnostic Generation Framework Based on Multi-source Heterogeneous Data [8.115713756776119]
We propose a framework that combines pathological images and medical reports to generate a personalized diagnosis result for an individual patient. We use nuclei-level image feature similarity and a content-based deep learning method to search for a personalized group of the population with similar pathological characteristics.
arXiv Detail & Related papers (2021-10-26T13:12:52Z)
G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers. We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach. We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)
Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients. We introduce meta-learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks. Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
Trajectories, bifurcations and pseudotime in large clinical datasets: applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values. The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.