Heterogeneous Treatment Effect Estimation using machine learning for
Healthcare application: tutorial and benchmark
- URL: http://arxiv.org/abs/2109.12769v1
- Date: Mon, 27 Sep 2021 02:34:44 GMT
- Title: Heterogeneous Treatment Effect Estimation using machine learning for
Healthcare application: tutorial and benchmark
- Authors: Yaobin Ling, Pulakesh Upadhyaya, Luyao Chen, Xiaoqian Jiang, Yejin Kim
- Abstract summary: Many studies have shown that drug effects are heterogeneous among the population.
Many advanced machine learning models for estimating heterogeneous treatment effects (HTE) have emerged in recent years.
We aim to introduce the HTE methodology to the healthcare area and provide feasibility considerations.
- Score: 8.869515663374248
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Because developing new drugs for target diseases is time-consuming and
expensive, drug repurposing has become a popular topic in the drug development
field. As more health claims data become available, many studies have been
conducted on the data. The real-world data is noisy, sparse, and has many
confounding factors. In addition, many studies have shown that drug effects
are heterogeneous among the population. Many advanced machine learning
models for estimating heterogeneous treatment effects (HTE) have emerged in
recent years and have been applied in the econometrics and machine learning
communities. These studies acknowledge medicine and drug development as the
main application area, but there has been limited translational research from
the HTE methodology to drug development. We aim to introduce the HTE
methodology to the healthcare area and provide feasibility considerations for
translating the methodology, using benchmark experiments on healthcare
administrative claims data. We also use the benchmark experiments to show
how to interpret and evaluate the models when they are applied to healthcare
research. By introducing recent HTE techniques to a broad readership in the
biomedical informatics community, we expect to promote the wide adoption of
causal inference using machine learning. We also expect to demonstrate the
feasibility of HTE estimation for personalized drug effectiveness.
Related papers
- Reddit-Impacts: A Named Entity Recognition Dataset for Analyzing Clinical and Social Effects of Substance Use Derived from Social Media [6.138126219622993]
Substance use disorders (SUDs) are a growing concern globally, necessitating enhanced understanding of the problem and its trends through data-driven research.
Social media are unique and important sources of information about SUDs, particularly since the data in such sources are often generated by people with lived experiences.
In this paper, we introduce Reddit-Impacts, a challenging Named Entity Recognition (NER) dataset curated from subreddits dedicated to discussions on prescription and illicit opioids, as well as medications for opioid use disorder.
The dataset specifically concentrates on the lesser-studied, yet critically important, aspects of substance use--its
arXiv Detail & Related papers (2024-05-09T23:43:57Z)
- Physical formula enhanced multi-task learning for pharmacokinetics prediction [54.13787789006417]
A major challenge for AI-driven drug discovery is the scarcity of high-quality data.
We develop a formula enhanced multi-task learning (PEMAL) method that predicts four key parameters of pharmacokinetics simultaneously.
Our experiments reveal that PEMAL significantly lowers the data demand, compared to typical Graph Neural Networks.
arXiv Detail & Related papers (2024-04-16T07:42:55Z)
- "Hey..! This medicine made me sick": Sentiment Analysis of User-Generated Drug Reviews using Machine Learning Techniques [2.2874754079405535]
This project proposes a drug review classification system that classifies user reviews on a particular drug into different classes, such as positive, negative, and neutral.
The collected data is manually labeled and verified to ensure that the labels are correct.
arXiv Detail & Related papers (2024-04-09T08:42:34Z)
- Learning a Patent-Informed Biomedical Knowledge Graph Reveals Technological Potential of Drug Repositioning Candidates [6.268435617836703]
This study presents a novel protocol to analyse various sources such as pharmaceutical patents and biomedical databases.
We identify drug repositioning candidates with both technological potential and scientific evidence.
Our case study on Alzheimer's disease demonstrates its efficacy and feasibility.
arXiv Detail & Related papers (2023-09-04T02:30:19Z)
- A clustering and graph deep learning-based framework for COVID-19 drug repurposing [0.3359875577705538]
This study presents a novel unsupervised machine learning framework that utilizes a graph-based autoencoder for multi-feature type clustering on heterogeneous drug data.
The dataset consists of 438 drugs, of which 224 are under clinical trials for COVID-19.
Our framework relies on reported drug data, including its pharmacological properties, chemical/physical properties, interaction with the host, and efficacy in different publicly available COVID-19 assays.
arXiv Detail & Related papers (2023-06-24T15:00:47Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- ImDrug: A Benchmark for Deep Imbalanced Learning in AI-aided Drug Discovery [79.08833067391093]
Real-world pharmaceutical datasets often exhibit highly imbalanced distribution.
We introduce ImDrug, a benchmark with an open-source Python library which consists of 4 imbalance settings, 11 AI-ready datasets, 54 learning tasks and 16 baseline algorithms tailored for imbalanced learning.
It provides an accessible and customizable testbed for problems and solutions spanning a broad spectrum of the drug discovery pipeline.
arXiv Detail & Related papers (2022-09-16T13:35:57Z)
- SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
- Deep learning for drug repurposing: methods, databases, and applications [54.08583498324774]
Repurposing existing drugs for new therapies is an attractive solution that accelerates drug development at reduced experimental costs.
In this review, we introduce guidelines on how to utilize deep learning methodologies and tools for drug repurposing.
arXiv Detail & Related papers (2022-02-08T09:42:08Z)
- Machine Learning Applications for Therapeutic Tasks with Genomics Data [49.98249191161107]
We review the literature on machine learning applications for genomics through the lens of therapeutic development.
We identify twenty-two machine learning in genomics applications across the entire therapeutics pipeline.
We pinpoint seven important challenges in this field with opportunities for expansion and impact.
arXiv Detail & Related papers (2021-05-03T21:20:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.