Learning Curves for Drug Response Prediction in Cancer Cell Lines
- URL: http://arxiv.org/abs/2011.12466v1
- Date: Wed, 25 Nov 2020 01:08:05 GMT
- Title: Learning Curves for Drug Response Prediction in Cancer Cell Lines
- Authors: Alexander Partin (1 and 2), Thomas Brettin (2 and 3), Yvonne A. Evrard
(4), Yitan Zhu (1 and 2), Hyunseung Yoo (1 and 2), Fangfang Xia (1 and 2),
Songhao Jiang (7), Austin Clyde (1 and 7), Maulik Shukla (1 and 2), Michael
Fonstein (5), James H. Doroshow (6), Rick Stevens (3 and 7) ((1) Division of
Data Science and Learning, Argonne National Laboratory, Argonne, IL, USA, (2)
University of Chicago Consortium for Advanced Science and Engineering,
University of Chicago, Chicago, IL, USA, (3) Computing, Environment and Life
Sciences, Argonne National Laboratory, Lemont, IL, USA, (4) Frederick
National Laboratory for Cancer Research, Leidos Biomedical Research, Inc.
Frederick, MD, USA, (5) Biosciences Division, Argonne National Laboratory,
Lemont, IL, USA, (6) Division of Cancer Therapeutics and Diagnosis, National
Cancer Institute, Bethesda, MD, USA, (7) Department of Computer Science, The
University of Chicago, Chicago, IL, USA)
- Abstract summary: We evaluate the data scaling properties of two neural networks (NNs) and two gradient boosting decision tree (GBDT) models trained on four drug screening datasets.
The learning curves are accurately fitted to a power law model, providing a framework for assessing the data scaling behavior of these predictors.
- Score: 29.107984441845673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by the size of cell line drug sensitivity data, researchers have
been developing machine learning (ML) models for predicting drug response to
advance cancer treatment. As drug sensitivity studies continue generating data,
a common question is whether the proposed predictors can further improve the
generalization performance with more training data. We utilize empirical
learning curves for evaluating and comparing the data scaling properties of two
neural networks (NNs) and two gradient boosting decision tree (GBDT) models
trained on four drug screening datasets. The learning curves are accurately
fitted to a power law model, providing a framework for assessing the data
scaling behavior of these predictors. The curves demonstrate that no single
model dominates in terms of prediction performance across all datasets and
training sizes, suggesting that the shape of these curves depends on the unique
model-dataset pair. The multi-input NN (mNN), in which gene expressions and
molecular drug descriptors are input into separate subnetworks, outperforms a
single-input NN (sNN), where the cell and drug features are concatenated for
the input layer. In contrast, a GBDT with hyperparameter tuning exhibits
superior performance as compared with both NNs at the lower range of training
sizes for two of the datasets, whereas the mNN performs better at the higher
range of training sizes. Moreover, the trajectory of the curves suggests that
increasing the sample size is expected to further improve prediction scores of
both NNs. These observations demonstrate the benefit of using learning curves
to evaluate predictors, providing a broader perspective on the overall data
scaling characteristics. The fitted power law curves provide a forward-looking
performance metric and can serve as a co-design tool to guide experimental
biologists and computational scientists in the design of future experiments.
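The power-law fitting described in the abstract can be sketched in a few lines: fit error ≈ a·n^(−b) to measured (training size, validation error) pairs by least squares in log-log space, then extrapolate to a larger sample size. The numbers below are illustrative, not taken from the paper's datasets, and the paper's fit also includes an irreducible-error term, omitted here for simplicity.

```python
import math

# Hypothetical learning-curve data: training-set sizes and validation errors
# (illustrative values only, not from the paper's drug screening datasets).
sizes = [500, 1000, 2000, 4000, 8000, 16000]
errors = [0.40, 0.33, 0.27, 0.22, 0.18, 0.15]

def fit_power_law(sizes, errors):
    """Fit error ~= a * n**(-b) by ordinary least squares in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
            / sum((x - xbar) ** 2 for x in xs)
    intercept = ybar - slope * xbar
    return math.exp(intercept), -slope  # a, b

a, b = fit_power_law(sizes, errors)

# Forward-looking extrapolation: expected error if the training set
# were doubled to 32,000 samples.
predicted = a * 32000 ** (-b)
```

This kind of extrapolation is what makes the fitted curve a co-design tool: it estimates the payoff of collecting more screening data before the experiments are run.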
Related papers
- CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding [62.075029712357]
This work introduces the Cognitive Diffusion Probabilistic Models (CogDPM).
CogDPM features a precision estimation method based on the hierarchical sampling capabilities of diffusion models and weights the guidance with precision weights estimated from the inherent properties of diffusion models.
We apply CogDPM to real-world prediction tasks using the United Kingdom precipitation and surface wind datasets.
arXiv Detail & Related papers (2024-05-03T15:54:50Z) - Graph-enabled Reinforcement Learning for Time Series Forecasting with
Adaptive Intelligence [11.249626785206003]
We propose a novel approach for predicting time-series data using graph neural networks (GNNs) and monitoring with reinforcement learning (RL).
GNNs are able to explicitly incorporate the graph structure of the data into the model, allowing them to capture temporal dependencies in a more natural way.
This approach allows for more accurate predictions in complex temporal structures, such as those found in healthcare, traffic and weather forecasting.
arXiv Detail & Related papers (2023-09-18T22:25:12Z) - Transferability of coVariance Neural Networks and Application to
Interpretable Brain Age Prediction using Anatomical Features [119.45320143101381]
Graph convolutional networks (GCN) leverage topology-driven graph convolutional operations to combine information across the graph for inference tasks.
We have studied GCNs with covariance matrices as graphs in the form of coVariance neural networks (VNNs).
VNNs inherit the scale-free data processing architecture from GCNs and here, we show that VNNs exhibit transferability of performance over datasets whose covariance matrices converge to a limit object.
arXiv Detail & Related papers (2023-05-02T22:15:54Z) - Pipeline-Invariant Representation Learning for Neuroimaging [5.502218439301424]
We evaluate how preprocessing pipeline selection can impact the downstream performance of a supervised learning model.
We propose two pipeline-invariant representation learning methodologies, MPSL and PXL, to improve robustness in classification performance.
These results suggest that our proposed models can be applied to mitigate pipeline-related biases, and to improve prediction robustness in brain-phenotype modeling.
arXiv Detail & Related papers (2022-08-27T02:34:44Z) - Discovering Invariant Rationales for Graph Neural Networks [104.61908788639052]
Intrinsic interpretability of graph neural networks (GNNs) means finding a small subset of the input graph's features that guides the model prediction.
We propose a new strategy of discovering invariant rationale (DIR) to construct intrinsically interpretable GNNs.
arXiv Detail & Related papers (2022-01-30T16:43:40Z) - Towards Open-World Feature Extrapolation: An Inductive Graph Learning
Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z) - Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z) - A robust low data solution: dimension prediction of semiconductor
nanorods [5.389015968413988]
A robust deep neural network-based regression algorithm has been developed for precise prediction of the length, width, and aspect ratio of semiconductor nanorods (NRs).
The deep neural network is further applied to develop a regression model, which performs well on both the original and generated data with similar distributions.
arXiv Detail & Related papers (2020-10-27T07:51:38Z) - Bidirectional Representation Learning from Transformers using Multimodal
Electronic Health Record Data to Predict Depression [11.1492931066686]
We present a temporal deep learning model to perform bidirectional representation learning on EHR sequences to predict depression.
The model yielded the largest increase in precision-recall area under the curve (PRAUC), from 0.70 to 0.76, in depression prediction compared to the best baseline model.
arXiv Detail & Related papers (2020-09-26T17:56:37Z) - Ensemble Transfer Learning for the Prediction of Anti-Cancer Drug
Response [49.86828302591469]
In this paper, we apply transfer learning to the prediction of anti-cancer drug response.
We apply the classic transfer learning framework that trains a prediction model on the source dataset and refines it on the target dataset.
The ensemble transfer learning pipeline is implemented using LightGBM and two deep neural network (DNN) models with different architectures.
arXiv Detail & Related papers (2020-05-13T20:29:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.