Semi-Supervised Learning for Deep Causal Generative Models
- URL: http://arxiv.org/abs/2403.18717v2
- Date: Fri, 12 Jul 2024 14:13:41 GMT
- Title: Semi-Supervised Learning for Deep Causal Generative Models
- Authors: Yasin Ibrahim, Hermione Warr, Konstantinos Kamnitsas
- Abstract summary: We develop a semi-supervised deep causal generative model that exploits the causal relationships between variables to maximise the use of all available data.
We leverage techniques from causal inference to infer missing values and subsequently generate realistic counterfactuals.
- Score: 2.5847188023177403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Developing models that are capable of answering questions of the form "How would x change if y had been z?" is fundamental to advancing medical image analysis. Training causal generative models that address such counterfactual questions, though, currently requires that all relevant variables have been observed and that the corresponding labels are available in the training data. However, clinical data may not have complete records for all patients, and state-of-the-art causal generative models are unable to take full advantage of this. We thus develop, for the first time, a semi-supervised deep causal generative model that exploits the causal relationships between variables to maximise the use of all available data. We explore this in the setting where each sample is either fully labelled or fully unlabelled, as well as the more clinically realistic case of having different labels missing for each sample. We leverage techniques from causal inference to infer missing values and subsequently generate realistic counterfactuals, even for samples with incomplete labels.
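As a rough illustration of the idea, the sketch below pairs a conditional VAE with a classifier head that imputes a missing label before encoding, so unlabelled samples still contribute to training, and produces counterfactuals by re-decoding under an intervened label. This is a minimal toy in PyTorch, not the authors' architecture; the names `SemiSupervisedCVAE`, `x_dim`, `c_dim`, and `z_dim` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemiSupervisedCVAE(nn.Module):
    """Toy conditional VAE with label imputation (illustrative, not the paper's model)."""
    def __init__(self, x_dim=64, c_dim=2, z_dim=8):
        super().__init__()
        self.classifier = nn.Linear(x_dim, c_dim)       # q(c | x): imputes missing labels
        self.enc = nn.Linear(x_dim + c_dim, 2 * z_dim)  # q(z | x, c)
        self.dec = nn.Linear(z_dim + c_dim, x_dim)      # p(x | z, c)

    def forward(self, x, c=None):
        # c is a one-hot float tensor when the label is observed, None otherwise
        logits = self.classifier(x)
        if c is None:                                   # unlabelled sample: impute the label
            c = F.gumbel_softmax(logits, tau=0.5, hard=True)
        mu, logvar = self.enc(torch.cat([x, c], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterisation trick
        x_hat = self.dec(torch.cat([z, c], -1))
        recon = F.mse_loss(x_hat, x)
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kl, logits   # add cross-entropy on logits when c is observed

    @torch.no_grad()
    def counterfactual(self, x, c_obs, c_cf):
        """Abduct z under the observed label, then decode under the intervened one."""
        mu, _ = self.enc(torch.cat([x, c_obs], -1)).chunk(2, -1)
        return self.dec(torch.cat([mu, c_cf], -1))
```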
Related papers
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- The Rashomon Importance Distribution: Getting RID of Unstable, Single Model-based Variable Importance [16.641794438414745]
Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine.
We propose a new variable importance framework that quantifies the importance of a variable across the set of all good models and is stable across the data distribution.
Our framework is extremely flexible and can be integrated with most existing model classes and global variable importance metrics.
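One way to read this is as permutation importance averaged over an approximate Rashomon set: all models within a small loss tolerance of the best one found. The sketch below assumes user-supplied `fit` and `loss` callables and a simple additive tolerance `eps`; the paper's actual RID estimator, including its stability analysis across the data distribution, is more involved.

```python
import numpy as np

def rashomon_importance(X, y, fit, loss, n_models=200, eps=0.01, seed=0):
    """Permutation importance averaged over an approximate Rashomon set:
    all fitted models whose training loss is within eps of the best found.
    fit(X, y, seed) and loss(model, X, y) are assumed user-supplied callables."""
    rng = np.random.default_rng(seed)
    models = [fit(X, y, seed=s) for s in range(n_models)]   # e.g. random restarts
    losses = np.array([loss(m, X, y) for m in models])
    good = [m for m, l in zip(models, losses) if l <= losses.min() + eps]
    importance = np.zeros(X.shape[1])
    for m in good:
        base = loss(m, X, y)
        for j in range(X.shape[1]):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])            # break feature j
            importance[j] += loss(m, Xp, y) - base          # rise in loss = importance
    return importance / len(good)                           # averaged over all good models
```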
arXiv Detail & Related papers (2023-09-24T23:09:48Z)
- Self-Consuming Generative Models Go MAD [21.056900382589266]
We study what happens when synthetic data is used to train successive generations of generative AI models for imagery, text, and other data types.
Without enough fresh real data in each generation of an autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease.
We term this condition Model Autophagy Disorder (MAD), making analogy to mad cow disease.
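A toy caricature of such an autophagous loop: fit a Gaussian "generator" to its own samples each generation and watch the spread collapse, unless fresh real data is mixed back in. The sample size, generation count, and `fresh_fraction` knob are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def autophagous_loop(n=50, generations=1000, fresh_fraction=0.0):
    """Fit a Gaussian generator to its own output each generation (toy model)."""
    data = rng.normal(0.0, 1.0, size=n)
    for _ in range(generations):
        synthetic = rng.normal(data.mean(), data.std(), size=n)
        n_fresh = int(fresh_fraction * n)                 # fresh real data, if any
        fresh = rng.normal(0.0, 1.0, size=n_fresh)
        data = np.concatenate([synthetic[: n - n_fresh], fresh])
    return data.std()

print("no fresh data :", autophagous_loop(fresh_fraction=0.0))  # spread collapses
print("20% fresh data:", autophagous_loop(fresh_fraction=0.2))  # stays near 1
```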
arXiv Detail & Related papers (2023-07-04T17:59:31Z)
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
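The reweighting idea can be illustrated with a crude scheme that upweights rare (feature, label) cells so one spurious binary feature carries no information about the label; the paper's optimisation-based reweighting over thousands of lexical features is considerably more sophisticated.

```python
import numpy as np

def decorrelating_weights(spurious, labels):
    """Per-example weights that flatten the joint distribution of one spurious
    binary feature and the label (a crude stand-in for the paper's method)."""
    w = np.ones(len(labels), dtype=float)
    for s in np.unique(spurious):
        for y in np.unique(labels):
            cell = (spurious == s) & (labels == y)
            w[cell] = 1.0 / max(cell.mean(), 1e-12)   # upweight rare cells
    return w / w.mean()

# e.g. a token that co-occurs with the positive class 90% of the time
rng = np.random.default_rng(1)
token = rng.integers(0, 2, 1000)
label = np.where(rng.random(1000) < 0.9, token, 1 - token)
weights = decorrelating_weights(token, label)   # use as sample weights in training
```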
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
- Surgical Aggregation: Federated Class-Heterogeneous Learning [4.468858802955592]
We propose surgical aggregation, a federated learning framework for aggregating knowledge from class-heterogeneous datasets.
We evaluate our method using simulated and real-world class-heterogeneous datasets across both independent and identically distributed (iid) and non-iid settings.
arXiv Detail & Related papers (2023-01-17T03:53:29Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
We are given access to a set of expert models and their predictions, alongside some limited information about the dataset used to train them.
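A minimal reading of this setup: weight each expert's prediction, per test instance, by how plausible that instance is under the limited description of the expert's training data. The sketch below assumes that description is a mean and covariance per expert; `experts` and `train_stats` are hypothetical inputs, and the paper's actual procedure may differ.

```python
import numpy as np
from scipy.stats import multivariate_normal

def instance_wise_ensemble(x, experts, train_stats):
    """Combine expert predictions with per-instance weights given by the density
    of x under each expert's training distribution (sketch, not the exact method).
    experts: list of callables expert(x) -> prediction
    train_stats: list of (mean, cov) pairs, one per expert"""
    dens = np.array([multivariate_normal(mean=mu, cov=cov).pdf(x)
                     for mu, cov in train_stats]) + 1e-300
    w = dens / dens.sum()                       # experts "at home" near x dominate
    preds = np.array([expert(x) for expert in experts])
    return w @ preds
```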
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- Can segmentation models be trained with fully synthetically generated data? [0.39577682622066246]
BrainSPADE is a model which combines a synthetic diffusion-based label generator with a semantic image generator.
Our model can produce fully synthetic brain labels on demand, with or without pathology of interest, and then generate a corresponding MRI image in an arbitrary guided style.
Experiments show that brainSPADE synthetic data can be used to train segmentation models with performance comparable to that of models trained on real data.
arXiv Detail & Related papers (2022-09-17T05:24:04Z)
- A Hamiltonian Monte Carlo Model for Imputation and Augmentation of Healthcare Data [0.6719751155411076]
Missing values exist in nearly all clinical studies because data for a variable or question are not collected or not available.
Existing models usually do not consider privacy concerns or do not utilise the inherent correlations across multiple features to impute the missing values.
This work proposes a Bayesian approach to imputing missing values and creating augmented samples in high-dimensional healthcare data.
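To make the mechanics concrete, here is a toy Hamiltonian Monte Carlo sampler that imputes the missing entries of a single record under a joint Gaussian model N(mu, Sigma). The Gaussian assumption and all tuning constants are purely illustrative; the paper's model is richer.

```python
import numpy as np

def hmc_impute(x_obs, obs_mask, mu, Sigma, n_steps=100, eps=0.1, L=10, seed=0):
    """Sample missing entries of one record from p(x_mis | x_obs) under a joint
    Gaussian N(mu, Sigma) with HMC (toy sketch; missing entries of x_obs may be nan)."""
    rng = np.random.default_rng(seed)
    P = np.linalg.inv(Sigma)                    # precision matrix
    mis = ~obs_mask
    x = np.where(obs_mask, x_obs, mu)           # start missing entries at the mean

    def U(q):                                   # negative log density (up to a constant)
        z = x.copy(); z[mis] = q
        d = z - mu
        return 0.5 * d @ P @ d

    def grad_U(q):
        z = x.copy(); z[mis] = q
        return (P @ (z - mu))[mis]

    q = x[mis].copy()
    for _ in range(n_steps):
        q_new, p = q.copy(), rng.normal(size=q.size)
        H0 = U(q_new) + 0.5 * p @ p             # initial Hamiltonian
        p -= 0.5 * eps * grad_U(q_new)          # leapfrog integration
        for step in range(L):
            q_new += eps * p
            if step < L - 1:
                p -= eps * grad_U(q_new)
        p -= 0.5 * eps * grad_U(q_new)
        H1 = U(q_new) + 0.5 * p @ p
        if rng.random() < np.exp(H0 - H1):      # Metropolis accept/reject
            q = q_new
    x[mis] = q
    return x
```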
arXiv Detail & Related papers (2021-03-03T11:57:42Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, the Prototypical Network, a simple yet effective meta-learning method for few-shot image classification.
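The Prototypical Network component reduces to a simple rule: average the support-set embeddings per class to form prototypes, then classify queries by distance to the nearest prototype. The sketch below shows inference only, over embeddings assumed precomputed by some encoder; the episodic training loop and the paper's selection mechanism are omitted.

```python
import torch

def prototypical_predict(support_x, support_y, query_x, n_classes):
    """Nearest-prototype classification over precomputed embeddings."""
    protos = torch.stack([support_x[support_y == c].mean(dim=0)  # class prototype
                          for c in range(n_classes)])
    dists = torch.cdist(query_x, protos)          # Euclidean distance to each prototype
    return (-dists).softmax(dim=-1)               # closer prototype -> higher probability
```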
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
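A toy linear analogue of this experiment: in the overparameterised regime every training set admits many interpolating solutions, and sampling them shows test errors concentrating around a typical value well below the worst case. The dimensions and sampling scheme below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_test = 30, 100, 2000                       # overparameterised: d > n
w_true = rng.normal(size=d) / np.sqrt(d)
X, Xt = rng.normal(size=(n, d)), rng.normal(size=(n_test, d))
y, yt = np.sign(X @ w_true), np.sign(Xt @ w_true)

pinv = np.linalg.pinv(X)
w0 = pinv @ y                                      # minimum-norm interpolator
null_proj = np.eye(d) - pinv @ X                   # projector onto the null space of X

errors = []
for _ in range(2000):
    w = w0 + null_proj @ rng.normal(size=d)        # a random interpolating classifier
    assert np.all(np.sign(X @ w) == y)             # zero training error by construction
    errors.append(np.mean(np.sign(Xt @ w) != yt))  # its test error
errors = np.array(errors)
print(f"typical error {np.median(errors):.3f}, worst {errors.max():.3f}")
```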
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
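The generic self-ensembling objective behind this family of methods combines a supervised loss on labelled data with a consistency penalty on unlabelled data; a minimal PyTorch sketch follows, using Gaussian input noise as the perturbation. The paper's relation-driven variant additionally enforces consistency of inter-sample relations, which is not shown here.

```python
import torch
import torch.nn.functional as F

def semi_supervised_loss(model, x_lab, y_lab, x_unlab, noise=0.1, lam=1.0):
    """Supervised cross-entropy plus a consistency term that penalises
    prediction changes under input perturbation (generic self-ensembling sketch)."""
    sup = F.cross_entropy(model(x_lab), y_lab)
    p1 = model(x_unlab + noise * torch.randn_like(x_unlab)).softmax(dim=-1)
    p2 = model(x_unlab + noise * torch.randn_like(x_unlab)).softmax(dim=-1)
    cons = F.mse_loss(p1, p2)                     # the two predictions should agree
    return sup + lam * cons
```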
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.