Analysis of lifelog data using optimal feature selection based
unsupervised logistic regression (OFS-ULR) for chronic disease classification
- URL: http://arxiv.org/abs/2204.01281v1
- Date: Mon, 4 Apr 2022 07:11:26 GMT
- Title: Analysis of lifelog data using optimal feature selection based
unsupervised logistic regression (OFS-ULR) for chronic disease classification
- Authors: Sadhana Tiwari, Sonali Agarwal
- Abstract summary: Chronic disease classification models are now harnessing the potential of lifelog data to explore better healthcare practices.
This paper constructs an optimal feature selection-based unsupervised logistic regression model (OFS-ULR) to classify chronic diseases.
- Score: 2.3909933791900326
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in pervasive healthcare monitoring systems
generate huge amounts of lifelog data in real time. Chronic
diseases are one of the most serious health challenges in developing and
developed countries. According to WHO, they account for 73% of all deaths and
60% of the global burden of diseases. Chronic disease classification models are
now harnessing the potential of lifelog data to explore better healthcare
practices. This paper constructs an optimal feature selection-based
unsupervised logistic regression model (OFS-ULR) to classify chronic diseases.
Analysing lifelog data is delicate because of its sensitive nature, and
conventional classification models show limited performance on it. New
classifiers for chronic disease classification using lifelog data are
therefore needed. Building a good model depends on pre-processing the
dataset, identifying important features, and then training a learning
algorithm with suitable hyperparameters. The proposed approach improves on
the performance of existing methods
through a series of steps: (i) removing redundant or invalid instances,
(ii) labelling the data using clustering and partitioning the data into
classes, (iii) identifying a suitable subset of features by applying either
domain knowledge or a selection algorithm, (iv) tuning hyperparameters so the
models give the best results, and (v) evaluating performance in a Spark
streaming environment. Two time-series datasets are used in
the experiment to compute the accuracy, recall, precision, and F1-score. The
experimental analysis supports the suitability of the proposed approach
compared with conventional classifiers: the newly constructed model
achieved the highest accuracy and reduced training complexity among all models compared.
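Steps (i) to (iv) of the abstract, plus the four reported metrics, can be illustrated with a minimal sketch. This is not the authors' OFS-ULR implementation: the abstract does not name the clustering or feature selection algorithms, so 2-means clustering, a mean-separation feature score, a small learning-rate grid, and the toy data generator `make_toy_data` are all assumptions chosen for illustration. The Spark streaming evaluation of step (v) is omitted, and the grid is scored on a single held-out split for brevity, where a real study would use separate validation and test splits.

```python
import math
import random


def clean(rows):
    """Step (i): drop rows with missing/NaN values and exact duplicates."""
    seen, out = set(), []
    for r in rows:
        if any(v is None or math.isnan(v) for v in r):
            continue
        if tuple(r) not in seen:
            seen.add(tuple(r))
            out.append(list(r))
    return out


def kmeans_labels(rows, iters=20):
    """Step (ii): 2-means clustering; the cluster index is the pseudo-label."""
    centers = [min(rows, key=sum), max(rows, key=sum)]  # deterministic init
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min((0, 1), key=lambda c: math.dist(r, centers[c])) for r in rows]
        for c in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def select_features(rows, labels, k):
    """Step (iii): keep the k features whose class means differ most, in
    units of the feature's standard deviation (assumes both clusters non-empty)."""
    scores = []
    for j, col in enumerate(zip(*rows)):
        m0 = [v for v, lab in zip(col, labels) if lab == 0]
        m1 = [v for v, lab in zip(col, labels) if lab == 1]
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
        scores.append((abs(sum(m1) / len(m1) - sum(m0) / len(m0)) / std, j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def predict_proba(w, x):
    """Sigmoid of the linear score; the last weight is the bias."""
    z = sum(wj * v for wj, v in zip(w, x + [1.0]))
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))


def fit_logreg(X, y, lr, epochs=200):
    """Logistic regression trained with plain batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, t in zip(X, y):
            p = predict_proba(w, x)
            for j, v in enumerate(x + [1.0]):
                grad[j] += (p - t) * v
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    acc = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": acc, "precision": prec, "recall": rec, "f1": f1}


def run_pipeline(rows, k=2, learning_rates=(0.01, 0.1, 1.0), seed=0):
    rows = clean(rows)
    labels = kmeans_labels(rows)
    keep = select_features(rows, labels, k)
    X = [[r[j] for j in keep] for r in rows]
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(0.7 * len(idx))
    train, held_out = idx[:cut], idx[cut:]
    best = None
    for lr in learning_rates:  # step (iv): tiny hyperparameter grid
        w = fit_logreg([X[i] for i in train], [labels[i] for i in train], lr)
        pred = [int(predict_proba(w, X[i]) >= 0.5) for i in held_out]
        m = metrics([labels[i] for i in held_out], pred)
        if best is None or m["accuracy"] > best[1]["accuracy"]:
            best = (lr, m)
    return keep, best[0], best[1]


def make_toy_data(n=60, seed=1):
    """Hypothetical data: features 0 and 1 separate two groups, feature 2 is noise."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        shift = 0.0 if i < n // 2 else 5.0
        rows.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)])
    rows.append(list(rows[0]))     # a redundant instance
    rows.append([None, 1.0, 2.0])  # an invalid instance
    return rows
```

Calling `run_pipeline(make_toy_data())` returns the selected feature indices, the chosen learning rate, and the metric dictionary. Because the labels come from clustering rather than ground truth, the metrics measure agreement with the pseudo-labels, not diagnostic correctness.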
Related papers
- Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options.
The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z)
- TRIAGE: Characterizing and auditing training data for improved regression [80.11415390605215]
We introduce TRIAGE, a novel data characterization framework tailored to regression tasks and compatible with a broad class of regressors.
TRIAGE utilizes conformal predictive distributions to provide a model-agnostic scoring method, the TRIAGE score.
We show that TRIAGE's characterization is consistent and highlight its utility to improve performance via data sculpting/filtering, in multiple regression settings.
arXiv Detail & Related papers (2023-10-29T10:31:59Z)
- An Evaluation of Machine Learning Approaches for Early Diagnosis of Autism Spectrum Disorder [0.0]
Autistic Spectrum Disorder (ASD) is a neurological disease characterized by difficulties with social interaction, communication, and repetitive activities.
This study employs diverse machine learning methods to identify crucial ASD traits, aiming to enhance and automate the diagnostic process.
arXiv Detail & Related papers (2023-09-20T21:23:37Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Robust self-healing prediction model for high dimensional data [0.685316573653194]
This work proposes a robust self healing (RSH) hybrid prediction model.
It functions by using the data in its entirety by removing errors and inconsistencies from it rather than discarding any data.
The proposed method is compared with some of the existing high performing models and the results are analyzed.
arXiv Detail & Related papers (2022-10-04T17:55:50Z)
- Categorical EHR Imputation with Generative Adversarial Nets [11.171712535005357]
We propose a simple and yet effective approach that is based on previous work on GANs for data imputation.
We show that our imputation approach largely improves the prediction accuracy, compared to more traditional data imputation approaches.
arXiv Detail & Related papers (2021-08-03T18:50:26Z)
- An Explainable Classification Model for Chronic Kidney Disease Patients [0.0]
Chronic Kidney Disease (CKD) is experiencing a globally increasing incidence and high cost to health systems.
The employment of data mining to discover subtle patterns in CKD indicators would contribute to an early diagnosis.
This work develops a classifier model that would support healthcare professionals in the early diagnosis of CKD patients.
arXiv Detail & Related papers (2021-05-21T14:09:43Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Two-step penalised logistic regression for multi-omic data with an application to cardiometabolic syndrome [62.997667081978825]
We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately.
Our approach should be preferred if the goal is to select as many relevant predictors as possible.
Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level.
arXiv Detail & Related papers (2020-08-01T10:36:27Z)
- Temporal Phenotyping using Deep Predictive Clustering of Disease Progression [97.88605060346455]
We develop a deep learning approach for clustering time-series data, where each cluster comprises patients who share similar future outcomes of interest.
Experiments on two real-world datasets show that our model achieves superior clustering performance over state-of-the-art benchmarks.
arXiv Detail & Related papers (2020-06-15T20:48:43Z)
- Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of essential services breakdown and collapse of health care structure.
This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients.
Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.