Related papers: Dynamic feature selection in medical predictive monitoring by reinforcement learning

URL: http://arxiv.org/abs/2405.19729v1
Date: Thu, 30 May 2024 06:21:11 GMT
Title: Dynamic feature selection in medical predictive monitoring by reinforcement learning
Authors: Yutong Chen, Jiandong Gao, Ji Wu
Abstract summary: Many existing feature selection methods fall short in effectively leveraging time-series information. We employ reinforcement learning to optimize a policy under maximum cost restrictions. Our method can seamlessly integrate with non-differentiable prediction models.
Score: 4.967941028121525
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we investigate dynamic feature selection in a multivariate time-series setting, a common occurrence in clinical prediction monitoring where each feature corresponds to a bio-test result. Many existing feature selection methods fall short in effectively leveraging time-series information, primarily because they are designed for static data. Our approach addresses this limitation by enabling the selection of time-varying feature subsets for each patient. Specifically, we employ reinforcement learning to optimize a policy under maximum cost restrictions. The prediction model is subsequently updated using synthetic data generated by the trained policy. Our method can seamlessly integrate with non-differentiable prediction models. We conducted experiments on a sizable clinical dataset encompassing regression and classification tasks. The results demonstrate that our approach outperforms strong feature selection baselines, particularly when subjected to stringent cost limitations. Code will be released once the paper is accepted.
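
Illustrative sketch (not the authors' code, which the abstract says is not yet released): the idea of selecting a time-varying feature subset under a maximum cost can be pictured roughly as below. Everything here is an assumption made for illustration: the budget is applied per time step, a greedy score-ordered rule stands in for the learned reinforcement-learning policy, and the names `select_under_budget`, `rollout`, `policy_scores`, `costs`, and `max_cost` are hypothetical.

```python
import numpy as np


def select_under_budget(scores, costs, max_cost):
    """Pick features in descending score order, skipping any that would exceed the budget.

    Assumption: a greedy rule stands in for the learned policy described in the abstract.
    """
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    mask = np.zeros(len(scores), dtype=bool)
    spent = 0.0
    for j in order:
        if spent + costs[j] <= max_cost:
            mask[j] = True
            spent += costs[j]
    return mask


def rollout(series, policy_scores, costs, max_cost):
    """Apply a per-time-step selection to a (T, d) series.

    Features that are not selected at a time step are left as NaN (not measured).
    Assumption: the cost limit applies independently at each time step.
    """
    observed = np.full(series.shape, np.nan)
    for t in range(series.shape[0]):
        mask = select_under_budget(policy_scores[t], costs, max_cost)
        observed[t, mask] = series[t, mask]
    return observed


# Hypothetical example: 2 time steps, 3 bio-test features with costs 2, 1, 1.
series = np.arange(6.0).reshape(2, 3)
policy_scores = np.array([[0.9, 0.1, 0.5], [0.2, 0.8, 0.3]])
costs = np.array([2.0, 1.0, 1.0])
observed = rollout(series, policy_scores, costs, max_cost=2.0)
```

The masked series produced this way could then be fed to any downstream predictor, including a non-differentiable one, since the selection step does not rely on gradients from the predictor.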

Related papers

Variable Selection Methods for Multivariate, Functional, and Complex Biomedical Data in the AI Age [0.0]
This work proposes new optimization-based variable selection methods for multivariate, functional, and even more general outcomes in metric spaces based on best-subset selection. Our framework applies to several types of regression models, including linear, quantile, or nonparametric additive models, and to a broad range of random responses. Our analysis demonstrates that our proposed methodology outperforms state-of-the-art methods in accuracy and, especially, in speed, achieving several orders of magnitude improvement over competitors.
arXiv Detail & Related papers (2025-01-12T16:33:06Z)
Adapt-$\infty$: Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection [89.42023974249122]
Adapt-$\infty$ is a new multi-way and adaptive data selection approach for Lifelong Instruction Tuning. We construct pseudo-skill clusters by grouping gradient-based sample vectors. We select the best-performing data selector for each skill cluster from a pool of selector experts.
arXiv Detail & Related papers (2024-10-14T15:48:09Z)
Guiding adaptive shrinkage by co-data to improve regression-based prediction and feature selection [0.3867363075280544]
It is widely recognized that complementary data on the features, 'co-data', may improve results. Such co-data are ubiquitous in genomics settings due to the availability of public repositories. We review guided adaptive shrinkage methods: a class of regression-based learners that use co-data to adapt the shrinkage parameters.
arXiv Detail & Related papers (2024-05-08T09:38:11Z)
Sequential Inference of Hospitalization Electronic Health Records Using Probabilistic Models [3.2988476179015005]
In this work we design a probabilistic unsupervised model for multiple arbitrary-length sequences contained in hospitalization Electronic Health Record (EHR) data. The model uses a latent variable structure and captures complex relationships between medications, diagnoses, laboratory tests, and neurological assessments. Inference algorithms are derived that use partial data to infer properties of the complete sequences, including their length and presence of specific values.
arXiv Detail & Related papers (2024-03-27T21:06:26Z)
A Performance-Driven Benchmark for Feature Selection in Tabular Deep Learning [131.2910403490434]
Data scientists typically collect as many features as possible into their datasets, and even engineer new features from existing ones. Existing benchmarks for tabular feature selection consider classical downstream models, toy synthetic datasets, or do not evaluate feature selectors on the basis of downstream performance. We construct a challenging feature selection benchmark evaluated on downstream neural networks including transformers. We also propose an input-gradient-based analogue of Lasso for neural networks that outperforms classical feature selection methods on challenging problems.
arXiv Detail & Related papers (2023-11-10T05:26:10Z)
Towards Free Data Selection with General-Purpose Models [71.92151210413374]
A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets. Current approaches, represented by active learning methods, typically follow a cumbersome pipeline that iterates the time-consuming model training and batch data selection repeatedly. FreeSel bypasses the heavy batch selection process, achieving a significant improvement in efficiency and being 530x faster than existing active learning methods.
arXiv Detail & Related papers (2023-09-29T15:50:14Z)
ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain. Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples. In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
Time Associated Meta Learning for Clinical Prediction [78.99422473394029]
We propose a novel time associated meta learning (TAML) method to make effective predictions at multiple future time points. To address the sparsity problem after task splitting, TAML employs a temporal information sharing strategy to augment the number of positive samples. We demonstrate the effectiveness of TAML on multiple clinical datasets, where it consistently outperforms a range of strong baselines.
arXiv Detail & Related papers (2023-03-05T03:54:54Z)
Flexible variable selection in the presence of missing data [0.0]
We propose a non-parametric variable selection algorithm combined with multiple imputation to develop flexible panels in the presence of missing-at-random data. We show that our proposal has good operating characteristics and results in panels with higher classification and variable selection performance.
arXiv Detail & Related papers (2022-02-25T21:41:03Z)
Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples. Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
Dynamic Instance-Wise Classification in Correlated Feature Spaces [15.351282873821935]
In a typical machine learning setting, the predictions on all test instances are based on a common subset of features discovered during model training. A new method is proposed that sequentially selects the best feature to evaluate for each test instance individually, and stops the selection process to make a prediction once it determines that no further improvement can be achieved with respect to classification accuracy. The effectiveness, generalizability, and scalability of the proposed method are illustrated on a variety of real-world datasets from diverse application domains.
arXiv Detail & Related papers (2021-06-08T20:20:36Z)
Causal Markov Boundaries [0.0]
We show how we can use observational data to improve feature selection and effect estimation. Our paper extends the notion of Markov boundary to treatment-outcome pairs.
arXiv Detail & Related papers (2021-03-12T22:49:10Z)
Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients. We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks. Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.