Towards Understanding the Survival of Patients with High-Grade
Gastroenteropancreatic Neuroendocrine Neoplasms: An Investigation of Ensemble
Feature Selection in the Prediction of Overall Survival
- URL: http://arxiv.org/abs/2302.10106v1
- Date: Mon, 20 Feb 2023 17:08:03 GMT
- Title: Towards Understanding the Survival of Patients with High-Grade
Gastroenteropancreatic Neuroendocrine Neoplasms: An Investigation of Ensemble
Feature Selection in the Prediction of Overall Survival
- Authors: Anna Jenul, Henning Langen Stokmo, Stefan Schrunner, Mona-Elisabeth
Revheim, Geir Olav Hjortland, Oliver Tomic
- Abstract summary: Ensemble feature selectors allow the user to identify the most informative features for predicting overall survival in datasets with low sample sizes.
While RENT is purely data-driven, UBayFS is capable of integrating expert knowledge a priori in the feature selection process.
Our results demonstrate that both feature selectors allow accurate predictions, and that expert knowledge has a stabilizing effect on the feature set.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Determining the most informative features for predicting the overall survival
of patients diagnosed with high-grade gastroenteropancreatic neuroendocrine
neoplasms is crucial to improve individual treatment plans for patients, as
well as the biological understanding of the disease. Recently developed
ensemble feature selectors like the Repeated Elastic Net Technique for Feature
Selection (RENT) and the User-Guided Bayesian Framework for Feature Selection
(UBayFS) allow the user to identify such features in datasets with low sample
sizes. While RENT is purely data-driven, UBayFS is capable of integrating
expert knowledge a priori in the feature selection process. In this work we
compare both feature selectors on a dataset comprising 63 patients and 134
features from multiple sources, including basic patient characteristics,
baseline blood values, tumor histology, imaging, and treatment information. Our
experiments involve data-driven and expert-driven setups, as well as
combinations of both. We use findings from clinical literature as a source of
expert knowledge. Our results demonstrate that both feature selectors allow
accurate predictions, and that expert knowledge has a stabilizing effect on the
feature set, while the impact on predictive performance is limited. The
features WHO Performance Status, Albumin, Platelets, Ki-67, Tumor Morphology,
Total MTV, Total TLG, and SUVmax are the most stable and predictive features in
our study.
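The ensemble idea described in the abstract can be illustrated with a minimal sketch: fit many elastic net models on random subsets of a small dataset and keep the features that receive a non-zero weight in most of them. This is not the authors' RENT or UBayFS implementation. The function names, the toy data, the linear (rather than survival) objective, and the single selection-frequency cutoff are all assumptions made for illustration; the published methods may apply further criteria.

```python
import random


def soft_threshold(value, threshold):
    """Soft-thresholding operator used for the L1 part of the penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_fit(X, y, alpha=0.1, l1_ratio=0.5, n_iter=100):
    """Linear elastic net via cyclic coordinate descent (no intercept).

    Minimises (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1
              + 0.5*alpha*(1 - l1_ratio)*||w||_2^2.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw because w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (
                col_sq[j] + alpha * (1.0 - l1_ratio)
            )
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def selection_frequencies(X, y, n_models=30, subset_fraction=0.8, seed=0, **fit_kwargs):
    """Fraction of ensemble members that give each feature a non-zero weight."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    k = max(2, int(subset_fraction * n))
    counts = [0] * p
    for _ in range(n_models):
        idx = rng.sample(range(n), k)  # random subset of the training samples
        w = elastic_net_fit([X[i] for i in idx], [y[i] for i in idx], **fit_kwargs)
        for j in range(p):
            if w[j] != 0.0:
                counts[j] += 1
    return [c / n_models for c in counts]


def select_features(frequencies, cutoff=0.9):
    """Keep features selected in at least `cutoff` of the models (hypothetical rule)."""
    return [j for j, f in enumerate(frequencies) if f >= cutoff]


def make_toy_data(n=63, p=10, seed=1):
    """Synthetic data: only features 0 and 3 influence the target."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0.0, 0.1) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    freqs = selection_frequencies(X, y)
    print("selection frequencies:", freqs)
    print("selected features:", select_features(freqs))
```

Raising `cutoff` or `alpha` makes the selection stricter. With few samples, weakly correlated noise features can still pass the cutoff in some runs, which is one reason to look at how stable the selected set is and not only at predictive performance.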
Related papers
- Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven *clinical decision support*.
Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit.
Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z)
- Unleashing the Power of Extra-Tree Feature Selection and Random Forest
Classifier for Improved Survival Prediction in Heart Failure Patients [0.0]
Heart failure is a life-threatening condition that affects millions of people worldwide.
The ability to accurately predict patient survival can aid in early intervention and improve patient outcomes.
arXiv Detail & Related papers (2023-08-09T11:47:28Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic
Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Clinical BioBERT Hyperparameter Optimization using Genetic Algorithm [0.15229257192293197]
Non-clinical factors that affect an individual's health outcomes are collectively referred to as Social Determinants of Health (SDoH).
The majority of SDoH data is recorded in unstructured clinical notes by physicians and practitioners.
Our research focuses on extracting sentences from clinical notes to provide appropriate concepts.
arXiv Detail & Related papers (2023-02-08T01:11:59Z)
- Ensemble feature selection with data-driven thresholding for Alzheimer's
disease biomarker discovery [0.0]
This work develops several data-driven thresholds to automatically identify the relevant features in an ensemble feature selector.
To demonstrate the applicability of these methods to clinical data, they are applied to data from two real-world Alzheimer's disease (AD) studies.
arXiv Detail & Related papers (2022-07-05T05:50:51Z)
- Identifying Stroke Indicators Using Rough Sets [0.7340017786387767]
We propose a novel rough-set-based technique for ranking the importance of the various EHR records in detecting stroke.
Age, average glucose level, heart disease, and hypertension were the most essential attributes for detecting stroke in patients.
arXiv Detail & Related papers (2021-10-19T06:04:48Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- HINT: Hierarchical Interaction Network for Trial Outcome Prediction
Leveraging Web Data [56.53715632642495]
Clinical trials face uncertain outcomes due to issues with efficacy, safety, or problems with patient recruitment.
In this paper, we propose Hierarchical INteraction Network (HINT) for more general, clinical trial outcome predictions.
arXiv Detail & Related papers (2021-02-08T15:09:07Z)
- RENT -- Repeated Elastic Net Technique for Feature Selection [0.46180371154032895]
We present the Repeated Elastic Net Technique (RENT) for Feature Selection.
RENT uses an ensemble of generalized linear models with elastic net regularization, each trained on distinct subsets of the training data.
RENT provides valuable information for model interpretation concerning the identification of objects in the data that are difficult to predict during training.
arXiv Detail & Related papers (2020-09-27T07:55:52Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Hemogram Data as a Tool for Decision-making in COVID-19 Management:
Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of essential services breaking down and health care structures collapsing.
This work describes a machine learning model derived from hemogram exam data from symptomatic patients.
The proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)