Learning Causal Alignment for Reliable Disease Diagnosis
- URL: http://arxiv.org/abs/2310.01766v2
- Date: Fri, 07 Feb 2025 04:15:53 GMT
- Title: Learning Causal Alignment for Reliable Disease Diagnosis
- Authors: Mingzhou Liu, Ching-Wen Lee, Xinwei Sun, Yu Qiao, Yizhou Wang
- Abstract summary: We propose a causality-based alignment framework for aligning the model's decision process with that of experts. We demonstrate the effectiveness of our method on two medical diagnosis applications.
- Score: 32.94918336343018
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Aligning the decision-making process of machine learning algorithms with that of experienced radiologists is crucial for reliable diagnosis. While existing methods have attempted to align their diagnostic behavior with that of radiologists as reflected in the training data, this alignment is primarily associational rather than causal, resulting in pseudo-correlations that may not transfer well. In this paper, we propose a causality-based alignment framework for aligning the model's decision process with that of experts. Specifically, we first employ counterfactual generation to identify the causal chain of model decisions. To align this causal chain with that of experts, we propose a causal alignment loss that constrains the model to focus on the causal factors underlying each decision step in the whole causal chain. To optimize this loss, in which the counterfactual generator enters as an implicit function of the model's parameters, we apply the implicit function theorem together with the conjugate gradient method for efficient gradient estimation. We demonstrate the effectiveness of our method on two medical diagnosis applications, showing faithful alignment with radiologists.
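A generic sketch of the implicit-function-theorem (IFT) plus conjugate-gradient pattern referenced in the abstract is given below. The toy quadratic losses, parameter names (`theta` for the model, `w` for the counterfactual generator), and sizes are illustrative assumptions, not the paper's actual objectives.

```python
# Hedged sketch: IFT + conjugate-gradient (CG) hypergradients for a bilevel loss,
# where w minimizes an inner loss and is treated as an implicit function of theta.
import torch

def flat_grad(output, inputs, create_graph=False):
    grads = torch.autograd.grad(output, inputs, create_graph=create_graph, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def conjugate_gradient(matvec, b, iters=20, tol=1e-10):
    """Solve A x = b using only matrix-vector products (A symmetric positive definite)."""
    x = torch.zeros_like(b)
    r, p = b.clone(), b.clone()
    rs = r @ r
    for _ in range(iters):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x, r = x + alpha * p, r - alpha * Ap
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def ift_hypergradient(outer_loss_fn, inner_loss_fn, theta, w):
    """d(outer loss)/d(theta), with w (minimizer of the inner loss) implicit in theta."""
    L_out = outer_loss_fn(theta, w)
    dLout_dw = flat_grad(L_out, [w])
    dLout_dtheta = flat_grad(L_out, [theta])

    L_in = inner_loss_fn(theta, w)
    dLin_dw = flat_grad(L_in, [w], create_graph=True)

    def hvp(p):  # Hessian of the inner loss w.r.t. w, times p, without forming the Hessian
        return flat_grad(dLin_dw @ p, [w])

    v = conjugate_gradient(hvp, dLout_dw)        # v = H_ww^{-1} dL_out/dw
    mixed = flat_grad(dLin_dw @ v, [theta])      # (d^2 L_in / dtheta dw) v
    return dLout_dtheta - mixed                  # IFT hypergradient

# Toy example: the inner problem drives w toward theta, the outer loss depends on both.
theta = torch.randn(3, requires_grad=True)
w = torch.randn(3, requires_grad=True)
outer = lambda th, ww: ((ww - 1.0) ** 2).sum() + 0.1 * (th ** 2).sum()
inner = lambda th, ww: ((ww - th) ** 2).sum()
print(ift_hypergradient(outer, inner, theta, w))
```

Conjugate gradient needs only Hessian-vector products, so the inner Hessian is never formed or inverted explicitly.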
Related papers
- Causal Disentanglement Hidden Markov Model for Fault Diagnosis [55.90917958154425]
We propose a Causal Disentanglement Hidden Markov model (CDHM) to learn the causality in the bearing fault mechanism.
Specifically, we make full use of the time-series data and progressively disentangle the vibration signal into fault-relevant and fault-irrelevant factors.
To expand the scope of the application, we adopt unsupervised domain adaptation to transfer the learned disentangled representations to other working environments.
arXiv Detail & Related papers (2023-08-06T05:58:45Z) - Discrimination of Radiologists Utilizing Eye-Tracking Technology and Machine Learning: A Case Study [0.9142067094647588]
This study presents a novel discretized feature encoding based on binning fixation data for efficient geometric alignment.
The encoded features of the eye-fixation data are employed by machine learning classifiers to discriminate between faculty and trainee radiologists.
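As a rough illustration of the binning-based encoding described above, the sketch below discretizes normalized fixation coordinates into a spatial histogram and feeds the flattened counts to a standard classifier; the grid size, synthetic fixations, and classifier choice are assumptions, not the study's exact pipeline.

```python
# Hedged sketch: bin eye-fixation coordinates into a fixed grid and classify the histogram.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def encode_fixations(xs, ys, grid=8):
    """Discretize fixation points into a grid x grid histogram and flatten it."""
    hist, _, _ = np.histogram2d(xs, ys, bins=grid, range=[[0, 1], [0, 1]])
    hist = hist / max(hist.sum(), 1)                 # normalize per reading session
    return hist.ravel()

rng = np.random.default_rng(0)
# Toy data: each "session" is a set of normalized fixation coordinates.
X = np.stack([encode_fixations(rng.random(50), rng.random(50)) for _ in range(40)])
y = rng.integers(0, 2, size=40)                      # 0 = trainee, 1 = faculty (toy labels)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict(X[:3]))
```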
arXiv Detail & Related papers (2023-08-04T23:51:47Z) - Wasserstein distributional robustness of neural networks [9.79503506460041]
Deep neural networks are known to be vulnerable to adversarial attacks (AA).
For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified.
We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions.
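One way to make the DRO recasting concrete is the standard first-order sensitivity expansion of the Wasserstein-robust objective, sketched below; the model, data, radius `delta`, and exponent `q` are illustrative, and the snippet is a generic proxy from the W-DRO literature rather than the paper's exact estimator.

```python
# Hedged sketch: first-order proxy for sup_{W_p(Q, P_n) <= delta} E_Q[loss], using the
# well-known expansion  E_{P_n}[loss] + delta * (E_{P_n}[||grad_x loss||^q])^{1/q}.
import torch
import torch.nn as nn

def wdro_first_order(model, x, y, delta=0.1, q=2.0):
    x = x.clone().requires_grad_(True)
    losses = nn.functional.cross_entropy(model(x), y, reduction="none")
    (grad_x,) = torch.autograd.grad(losses.sum(), x)     # per-sample input gradients
    gnorm = grad_x.flatten(1).norm(dim=1)                # ||grad_{x_i} loss_i|| per sample
    penalty = gnorm.pow(q).mean() ** (1.0 / q)
    return losses.mean().item() + delta * penalty.item()

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x, y = torch.randn(16, 1, 28, 28), torch.randint(0, 10, (16,))
print(wdro_first_order(model, x, y))
```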
arXiv Detail & Related papers (2023-06-16T13:41:24Z) - Instrumental Variable Learning for Chest X-ray Classification [52.68170685918908]
We propose an interpretable instrumental variable (IV) learning framework to eliminate the spurious association and obtain accurate causal representation.
Our approach's performance is demonstrated using the MIMIC-CXR, NIH ChestX-ray 14, and CheXpert datasets.
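For readers unfamiliar with instrumental variables, the classical two-stage least-squares toy below illustrates the principle the framework builds on; it is a deliberately simple linear example, not the paper's deep IV model for chest X-rays.

```python
# Hedged sketch: two-stage least squares (2SLS) recovering a causal effect that
# naive regression gets wrong because of an unobserved confounder.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
u = rng.normal(size=n)                              # unobserved confounder
z = rng.normal(size=n)                              # instrument: affects x, not y directly
x = 1.0 * z + 1.0 * u + rng.normal(size=n)          # exposure
y = 2.0 * x + 3.0 * u + rng.normal(size=n)          # outcome; true causal effect = 2

naive = np.polyfit(x, y, 1)[0]                      # biased by the confounder u
x_hat = np.polyfit(z, x, 1)[0] * z                  # stage 1: project x onto the instrument
iv = np.polyfit(x_hat, y, 1)[0]                     # stage 2: regress y on the projection
print(f"naive OLS slope ~ {naive:.2f}, 2SLS slope ~ {iv:.2f} (truth = 2)")
```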
arXiv Detail & Related papers (2023-05-20T03:12:23Z) - Topological Interpretability for Deep-Learning [0.30806551485143496]
Deep learning (DL) models cannot quantify the certainty of their predictions.
This work presents a method to infer prominent features in two DL classification models trained on clinical and non-clinical text.
arXiv Detail & Related papers (2023-05-15T13:38:13Z) - Searching for the Essence of Adversarial Perturbations [73.96215665913797]
We show that adversarial perturbations contain human-recognizable information, which is the key conspirator responsible for a neural network's erroneous prediction.
This concept of human-recognizable information allows us to explain key features related to adversarial perturbations.
arXiv Detail & Related papers (2022-05-30T18:04:57Z) - Efficient and Robust Classification for Sparse Attacks [34.48667992227529]
We consider perturbations bounded by the $\ell_0$-norm, which have been shown to be effective attacks in image recognition, natural language processing, and malware detection.
We propose a novel defense method that consists of "truncation" and "adversarial training".
Motivated by the insights we obtain, we extend these components to neural network classifiers.
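A minimal sketch of the truncation idea follows: dropping the largest-magnitude per-coordinate contributions limits how much an $\ell_0$-bounded perturbation can shift a linear score. The construction is illustrative, omits the adversarial-training component, and is not necessarily the paper's exact operator.

```python
# Hedged sketch: a "truncation" defense against sparse (l0-bounded) perturbations.
import numpy as np

def truncated_score(w, x, k):
    """Linear score w.x with the k largest-|w_i * x_i| terms removed."""
    contrib = w * x
    keep = np.argsort(np.abs(contrib))[: len(contrib) - k]   # discard the top-k magnitudes
    return contrib[keep].sum()

rng = np.random.default_rng(0)
w = rng.normal(size=100)
x = rng.normal(size=100)
x_adv = x.copy()
x_adv[:3] += 50.0 * np.sign(w[:3])          # l0 attack: corrupt only 3 coordinates

print("plain score     :", w @ x, "->", w @ x_adv)                        # swings wildly
print("truncated (k=3) :", truncated_score(w, x, 3), "->", truncated_score(w, x_adv, 3))
```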
arXiv Detail & Related papers (2022-01-23T21:18:17Z) - Probabilistic Bearing Fault Diagnosis Using Gaussian Process with Tailored Feature Extraction [10.064000794573756]
Rolling bearings are subject to various faults due to their long-term operation in harsh environments.
Current deep learning methods perform the bearing fault diagnosis in the form of deterministic classification.
We develop a probabilistic fault diagnosis framework that can account for the uncertainty effect in prediction.
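The sketch below shows the general shape of such a pipeline: simple hand-crafted vibration features fed to a Gaussian-process classifier whose predictive probabilities expose uncertainty. The features and synthetic signals are placeholders, not the paper's tailored feature extraction.

```python
# Hedged sketch: GP classification over toy vibration features, yielding class
# probabilities (uncertainty) instead of a hard deterministic label.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

def features(signal):
    """Simple time-domain statistics often used for bearing signals (illustrative)."""
    return np.array([signal.std(), np.abs(signal).max(),
                     np.mean(signal ** 4) / (signal.std() ** 4 + 1e-12)])  # kurtosis-like

rng = np.random.default_rng(0)
healthy = [rng.normal(0, 1.0, 2048) for _ in range(30)]
faulty = [rng.normal(0, 1.0, 2048) + (rng.random(2048) < 0.01) * 8.0 for _ in range(30)]
X = np.stack([features(s) for s in healthy + faulty])
y = np.array([0] * 30 + [1] * 30)

gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0)).fit(X, y)
print(gpc.predict_proba(X[:3]))      # predictive probabilities express uncertainty
```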
arXiv Detail & Related papers (2021-09-19T18:34:29Z) - Tribrid: Stance Classification with Neural Inconsistency Detection [9.150728831518459]
We study the problem of performing automatic stance classification on social media with neural architectures such as BERT.
We present a new neural architecture where the input also includes automatically generated negated perspectives over a given claim.
The model is jointly trained to make multiple predictions simultaneously, which can be used either to improve the classification of the original perspective or to filter out doubtful predictions.
arXiv Detail & Related papers (2021-09-14T08:13:03Z) - Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z) - On the (Un-)Avoidability of Adversarial Examples [4.822598110892847]
Adversarial examples in deep learning models have caused substantial concern over their reliability.
We provide a framework for determining whether a model's label change under small perturbation is justified.
We prove that our adaptive data-augmentation maintains consistency of 1-nearest neighbor classification under deterministic labels.
arXiv Detail & Related papers (2021-06-24T21:35:25Z) - Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge limiting the widespread adoption of deep learning has been its fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z) - Harmonization with Flow-based Causal Inference [12.739380441313022]
This paper presents a normalizing-flow-based method to perform counterfactual inference upon a structural causal model (SCM) to harmonize medical data.
We evaluate on multiple, large, real-world medical datasets to observe that this method leads to better cross-domain generalization compared to state-of-the-art algorithms.
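The abduction-action-prediction recipe underlying this kind of counterfactual harmonization can be sketched with a toy site-conditional affine mechanism standing in for a learned normalizing flow; the site parameters and data below are invented for illustration.

```python
# Hedged sketch: counterfactual harmonization via abduction -> action -> prediction,
# with a toy invertible mechanism x = mu[site] + sigma[site] * u in place of a flow.
import numpy as np

mu = {"siteA": 2.0, "siteB": -1.0}            # assumed site-specific mechanisms
sigma = {"siteA": 1.5, "siteB": 0.5}

def forward(site, u):                          # mechanism: exogenous noise -> observation
    return mu[site] + sigma[site] * u

def inverse(site, x):                          # abduction: observation -> exogenous noise
    return (x - mu[site]) / sigma[site]

def harmonize(x, observed_site, reference_site):
    u = inverse(observed_site, x)              # 1. abduction: infer subject-specific noise
    return forward(reference_site, u)          # 2. action on the site, 3. prediction

rng = np.random.default_rng(0)
x_siteB = forward("siteB", rng.normal(size=5))
print("raw site-B values    :", np.round(x_siteB, 2))
print("harmonized to site A :", np.round(harmonize(x_siteB, "siteB", "siteA"), 2))
```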
arXiv Detail & Related papers (2021-06-12T19:57:35Z) - A Twin Neural Model for Uplift [59.38563723706796]
Uplift is a particular case of conditional treatment effect modeling.
We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk.
We show our proposed method is competitive with the state-of-the-art in simulation setting and on real data from large scale randomized experiments.
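A generic twin-network layout for uplift is sketched below, with one head per treatment arm and the uplift read off as their difference; plain cross-entropy on the factual arm is used as a stand-in, not the paper's relative-risk-based loss.

```python
# Hedged sketch: a twin network for uplift, uplift(x) = p(y|x,t=1) - p(y|x,t=0).
import torch
import torch.nn as nn

class TwinUplift(nn.Module):
    def __init__(self, d_in, d_hidden=32):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.head_t1 = nn.Linear(d_hidden, 1)   # outcome model under treatment
        self.head_t0 = nn.Linear(d_hidden, 1)   # outcome model under control

    def forward(self, x):
        h = self.shared(x)
        p1 = torch.sigmoid(self.head_t1(h)).squeeze(-1)
        p0 = torch.sigmoid(self.head_t0(h)).squeeze(-1)
        return p1, p0, p1 - p0                   # uplift estimate

x = torch.randn(64, 10)
t = torch.randint(0, 2, (64,)).float()
y = torch.randint(0, 2, (64,)).float()
model = TwinUplift(10)
p1, p0, uplift = model(x)
factual = t * p1 + (1 - t) * p0                  # probability of the observed arm
loss = nn.functional.binary_cross_entropy(factual, y)
loss.backward()
print(uplift[:5])
```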
arXiv Detail & Related papers (2021-05-11T16:02:39Z) - Disentangling Observed Causal Effects from Latent Confounders using Method of Moments [67.27068846108047]
We provide guarantees on identifiability and learnability under mild assumptions.
We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions.
arXiv Detail & Related papers (2021-01-17T07:48:45Z) - Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
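The masking-as-intervention idea can be sketched as follows: compute a saliency map, zero out the most salient pixels, and penalize the model if its confidence in the true class does not drop. Plain gradient saliency and a hinge-style penalty are used below as stand-ins for the paper's salience module and contrastive losses.

```python
# Hedged sketch: saliency-guided pseudo-intervention with a simple confidence-drop penalty.
import torch
import torch.nn as nn

def pseudo_intervention_loss(model, x, y, mask_frac=0.05):
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = nn.functional.cross_entropy(logits, y)
    (sal,) = torch.autograd.grad(ce, x, create_graph=True)
    sal = sal.abs().sum(dim=1, keepdim=True)                     # per-pixel saliency
    k = max(1, int(mask_frac * sal[0].numel()))
    thresh = sal.flatten(1).topk(k, dim=1).values[:, -1].view(-1, 1, 1, 1)
    x_int = x * (sal < thresh).float()                           # zero out salient pixels
    p_true = model(x_int).softmax(dim=1).gather(1, y[:, None]).squeeze(1)
    p_orig = logits.softmax(dim=1).gather(1, y[:, None]).squeeze(1)
    # the intervention should reduce confidence in the true class
    return ce + torch.relu(p_true - p_orig + 0.1).mean()

model = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                      nn.Flatten(), nn.Linear(8 * 28 * 28, 10))
x, y = torch.randn(4, 1, 28, 28), torch.randint(0, 10, (4,))
print(pseudo_intervention_loss(model, x, y))
```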
arXiv Detail & Related papers (2020-12-06T20:30:26Z) - An Uncertainty-Driven GCN Refinement Strategy for Organ Segmentation [53.425900196763756]
We propose a segmentation refinement method based on uncertainty analysis and graph convolutional networks.
We employ the uncertainty levels of the convolutional network in a particular input volume to formulate a semi-supervised graph learning problem.
We show that our method outperforms the state-of-the-art CRF refinement method, improving the Dice score by 1% for the pancreas and 2% for the spleen.
arXiv Detail & Related papers (2020-12-06T18:55:07Z) - Self-Guided Multiple Instance Learning for Weakly Supervised Disease Classification and Localization in Chest Radiographs [22.473965401043717]
We introduce a novel loss function for training convolutional neural networks that increases localization confidence.
We show that the supervision provided within the proposed learning scheme leads to better performance and more precise predictions on prevalent datasets for multiple-instance learning.
arXiv Detail & Related papers (2020-09-30T22:19:40Z) - Neural Networks with Recurrent Generative Feedback [61.90658210112138]
We instantiate this recurrent generative feedback design on convolutional neural networks (CNNs), yielding CNN-F.
In the experiments, CNN-F shows considerably improved adversarial robustness over conventional feedforward CNNs on standard benchmarks.
arXiv Detail & Related papers (2020-07-17T19:32:48Z) - A Critical View of the Structural Causal Model [89.43277111586258]
We show that one can identify the cause and the effect without considering their interaction at all.
We propose a new adversarial training method that mimics the disentangled structure of the causal model.
Our multidimensional method outperforms the literature methods on both synthetic and real world datasets.
arXiv Detail & Related papers (2020-02-23T22:52:28Z)