Human-aligned Deep Learning: Explainability, Causality, and Biological Inspiration
- URL: http://arxiv.org/abs/2504.13717v1
- Date: Fri, 18 Apr 2025 14:40:58 GMT
- Title: Human-aligned Deep Learning: Explainability, Causality, and Biological Inspiration
- Authors: Gianluca Carloni
- Abstract summary: This work aligns deep learning (DL) with human reasoning capabilities and needs to enable more efficient, interpretable, and robust image classification. We approach this from three perspectives: explainability, causality, and biological vision. Overall, our key findings include: (i) simple activation maximization lacks insight for medical imaging DL models; (ii) prototypical-part learning is effective and radiologically aligned; (iii) XAI and causal ML are deeply connected; (iv) weak causal signals can be leveraged without a priori information to improve performance and interpretability.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work aligns deep learning (DL) with human reasoning capabilities and needs to enable more efficient, interpretable, and robust image classification. We approach this from three perspectives: explainability, causality, and biological vision. Introduction and background open this work before diving into operative chapters. First, we assess neural networks' visualization techniques for medical images and validate an explainable-by-design method for breast mass classification. A comprehensive review at the intersection of XAI and causality follows, where we introduce a general scaffold to organize past and future research, laying the groundwork for our second perspective. In the causality direction, we propose novel modules that exploit feature co-occurrence in medical images, leading to more effective and explainable predictions. We further introduce CROCODILE, a general framework that integrates causal concepts, contrastive learning, feature disentanglement, and prior knowledge to enhance generalization. Lastly, we explore biological vision, examining how humans recognize objects, and propose CoCoReco, a connectivity-inspired network with context-aware attention mechanisms. Overall, our key findings include: (i) simple activation maximization lacks insight for medical imaging DL models; (ii) prototypical-part learning is effective and radiologically aligned; (iii) XAI and causal ML are deeply connected; (iv) weak causal signals can be leveraged without a priori information to improve performance and interpretability; (v) our framework generalizes across medical domains and out-of-distribution data; (vi) incorporating biological circuit motifs improves human-aligned recognition. This work contributes toward human-aligned DL and highlights pathways to bridge the gap between research and clinical adoption, with implications for improved trust, diagnostic accuracy, and safe deployment.
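For readers unfamiliar with the term in finding (i), activation maximization searches for an input that drives a chosen unit of a network to a high activation, usually by gradient ascent on the input. The following is a minimal sketch on a toy single layer `tanh(Wx + b)`, not the models or code used in this work; the function names, the unit-ball constraint, and all hyperparameters are assumptions made for illustration.

```python
import numpy as np


def unit_activation(weights, bias, unit, x):
    """Activation of one unit of the toy layer tanh(W x + b)."""
    return float(np.tanh(weights[unit] @ x + bias[unit]))


def activation_maximization(weights, bias, unit, steps=200, lr=0.1, seed=0):
    """Gradient ascent on the input x to maximize a single unit's activation.

    Hypothetical helper for illustration only. The input is projected back
    onto the unit ball after each step so the search stays bounded.
    """
    rng = np.random.default_rng(seed)
    # Start from a small random input.
    x = rng.normal(scale=0.01, size=weights.shape[1])
    for _ in range(steps):
        pre = weights[unit] @ x + bias[unit]
        # d/dx tanh(w.x + b) = (1 - tanh(w.x + b)^2) * w
        grad = (1.0 - np.tanh(pre) ** 2) * weights[unit]
        x = x + lr * grad
        norm = np.linalg.norm(x)
        if norm > 1.0:
            x = x / norm
    return x


if __name__ == "__main__":
    W = np.eye(3, 5)   # toy weights: unit i responds to input dimension i
    b = np.zeros(3)
    x_star = activation_maximization(W, b, unit=1)
    print("preferred input:", np.round(x_star, 3))
    print("activation:", round(unit_activation(W, b, 1, x_star), 3))
```

In this toy case the recovered input simply aligns with the unit's weight vector. For deep image models the same procedure yields a synthetic image, which is the kind of visualization the abstract reports as offering limited insight in medical imaging.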
Related papers
- Knowledge-enhanced Visual-Language Pretraining for Computational Pathology [68.6831438330526]
We consider the problem of visual representation learning for computational pathology, by exploiting large-scale image-text pairs gathered from public resources.
We curate a pathology knowledge tree that consists of 50,470 informative attributes for 4,718 diseases requiring pathology diagnosis from 32 human tissues.
arXiv Detail & Related papers (2024-04-15T17:11:25Z)
- Tokensome: Towards a Genetic Vision-Language GPT for Explainable and Cognitive Karyotyping [7.101960527853822]
Tokensome is a novel vision-language model based on chromosome tokenization for explainable and cognitive karyotyping.
Tokensome elevates the method from the conventional visual perception layer to the cognitive decision-making layer.
arXiv Detail & Related papers (2024-03-17T03:38:50Z)
- MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning [48.97640824497327]
We propose a novel framework leveraging domain-specific medical knowledge as guiding signals to integrate language information into the visual domain through image-text contrastive learning.
Our model includes global contrastive learning with our designed divergence encoder, local token-knowledge-patch alignment contrastive learning, and knowledge-guided category-level contrastive learning with expert knowledge.
Notably, MLIP surpasses state-of-the-art methods even with limited annotated data, highlighting the potential of multimodal pre-training in advancing medical representation learning.
arXiv Detail & Related papers (2024-02-03T05:48:50Z)
- Knowledge Boosting: Rethinking Medical Contrastive Vision-Language Pre-Training [6.582001681307021]
We propose the Knowledge-Boosting Contrastive Vision-Language Pre-training framework (KoBo).
KoBo integrates clinical knowledge into the learning of vision-language semantic consistency.
Experiments validate the effect of our framework on eight tasks including classification, segmentation, retrieval, and semantic relatedness.
arXiv Detail & Related papers (2023-07-14T09:38:22Z)
- Multi-task Collaborative Pre-training and Individual-adaptive-tokens Fine-tuning: A Unified Framework for Brain Representation Learning [3.1453938549636185]
We propose a unified framework that combines Multi-task Collaborative pre-training and Individual-adaptive-Tokens fine-tuning (MCIAT).
The proposed MCIAT achieves state-of-the-art diagnosis performance on the ADHD-200 dataset.
arXiv Detail & Related papers (2023-06-20T08:38:17Z)
- Language Knowledge-Assisted Representation Learning for Skeleton-Based Action Recognition [71.35205097460124]
How humans understand and recognize the actions of others is a complex neuroscientific problem.
LA-GCN is a graph convolution network assisted by knowledge from large-scale language models (LLMs).
arXiv Detail & Related papers (2023-05-21T08:29:16Z)
- Functional2Structural: Cross-Modality Brain Networks Representation Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z)
- Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z)
- Representation Learning for Networks in Biology and Medicine: Advancements, Challenges, and Opportunities [18.434430658837258]
We have witnessed a rapid expansion of representation learning techniques into modeling, analysis, and learning with networks.
In this review, we put forward an observation that long-standing principles of network biology and medicine can provide the conceptual grounding for representation learning.
We synthesize a spectrum of algorithmic approaches that leverage topological features to embed networks into compact vector spaces.
arXiv Detail & Related papers (2021-04-11T00:20:00Z)
- Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)