Related papers: Toward Clinically Grounded Foundation Models in Pathology

URL: http://arxiv.org/abs/2510.23807v3
Date: Thu, 06 Nov 2025 10:01:43 GMT
Title: Toward Clinically Grounded Foundation Models in Pathology
Authors: Hamid R. Tizhoosh
Abstract summary: Foundation models (FMs) have revolutionized computer vision and language processing through large-scale self-supervised and multimodal learning. Recent evaluations reveal fundamental weaknesses: low diagnostic accuracy, poor robustness, geometric instability, heavy computational demands, and concerning safety vulnerabilities. This paper argues that they stem from deeper conceptual mismatches between the assumptions underlying generic foundation modeling in mainstream AI and the intrinsic complexity of human tissue.
Score: 3.202978695204522
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: In non-medical domains, foundation models (FMs) have revolutionized computer vision and language processing through large-scale self-supervised and multimodal learning. Consequently, their rapid adoption in computational pathology was expected to deliver comparable breakthroughs in cancer diagnosis, prognostication, and multimodal retrieval. However, recent systematic evaluations reveal fundamental weaknesses: low diagnostic accuracy, poor robustness, geometric instability, heavy computational demands, and concerning safety vulnerabilities. This short paper examines these shortcomings and argues that they stem from deeper conceptual mismatches between the assumptions underlying generic foundation modeling in mainstream AI and the intrinsic complexity of human tissue. Seven interrelated causes are identified: biological complexity, ineffective self-supervision, overgeneralization, excessive architectural complexity, lack of domain-specific innovation, insufficient data, and a fundamental design flaw related to tissue patch size. These findings suggest that current pathology foundation models remain conceptually misaligned with the nature of tissue morphology and call for a fundamental rethinking of the paradigm itself.

Related papers

A Brain-like Synergistic Core in LLMs Drives Behaviour and Learning [50.68188138112555]
We show that large language models spontaneously develop synergistic cores. We find that areas in middle layers exhibit synergistic processing while early and late layers rely on redundancy. This convergence suggests that synergistic information processing is a fundamental property of intelligence.
arXiv Detail & Related papers (2026-01-11T10:48:35Z)
Anatomy-R1: Enhancing Anatomy Reasoning in Multimodal Large Language Models via Anatomical Similarity Curriculum and Group Diversity Augmentation [52.7583577508452]
Multimodal Large Language Models (MLLMs) have achieved impressive progress in natural image reasoning. Their potential in medical imaging remains underexplored, especially in clinical anatomical surgical images. These challenges limit the effectiveness of conventional Supervised Fine-Tuning strategies.
arXiv Detail & Related papers (2025-12-22T16:06:36Z)
Foundation Models in Biomedical Imaging: Turning Hype into Reality [17.139610489482262]
Foundation models (FMs) are driving a prominent shift in artificial intelligence across different domains, including biomedical imaging. We critically assess the current state-of-the-art, analyzing hype by examining the core capabilities and limitations of FMs in the biomedical domain. We discuss the paramount issues in deployment stemming from trustworthiness, bias, and safety, dissecting the challenges of algorithmic bias, data bias and privacy, and model hallucinations.
arXiv Detail & Related papers (2025-12-17T05:18:43Z)
A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis [82.01597026329158]
We introduce a Correlation-Regulated Alignment Framework for Tissue Synthesis (CRAFTS) for pathology-specific text-to-image synthesis. CRAFTS incorporates a novel alignment mechanism that suppresses semantic drift to ensure biological accuracy. This model generates diverse pathological images spanning 30 cancer types, with quality rigorously validated by objective metrics and pathologist evaluations.
arXiv Detail & Related papers (2025-12-15T10:22:43Z)
Deconstructing Intraocular Pressure: A Non-invasive Multi-Stage Probabilistic Inverse Framework [0.0]
Glaucoma is a leading cause of irreversible blindness driven by elevated intraocular pressure (IOP). We develop a framework to noninvasively estimate unmeasurable variables from sparse, routine data. Our framework achieves excellent agreement with state-of-the-art tonography with precision comparable to direct physical instruments.
arXiv Detail & Related papers (2025-09-17T16:50:23Z)
Anomaly Detection and Generation with Diffusion Models: A Survey [51.61574868316922]
Anomaly detection (AD) plays a pivotal role across diverse domains, including cybersecurity, finance, healthcare, and industrial manufacturing. Recent advancements in deep learning, specifically diffusion models (DMs), have sparked significant interest. This survey aims to guide researchers and practitioners in leveraging DMs for innovative AD solutions across diverse applications.
arXiv Detail & Related papers (2025-06-11T03:29:18Z)
Signal in the Noise: Polysemantic Interference Transfers and Predicts Cross-Model Influence [46.548276232795466]
Polysemanticity is pervasive in language models and remains a major challenge for interpretation and model behavioral control. We map the polysemantic topology of two small models to identify feature pairs that are semantically unrelated yet exhibit interference within models. We intervene at four loci (prompt, token, feature, neuron) and measure induced shifts in the next-token prediction distribution, uncovering polysemantic structures that expose a systematic vulnerability in these models.
arXiv Detail & Related papers (2025-05-16T18:20:42Z)
PyTDC: A multimodal machine learning training, evaluation, and inference platform for biomedical foundation models [59.17570021208177]
PyTDC is a machine-learning platform providing streamlined training, evaluation, and inference software for multimodal biological AI models. This paper discusses the components of PyTDC's architecture and, to our knowledge, the first-of-its-kind case study on the introduced single-cell drug-target nomination ML task.
arXiv Detail & Related papers (2025-05-08T18:15:38Z)
Causal Disentanglement for Robust Long-tail Medical Image Generation [80.15257897500578]
We propose a novel medical image generation framework, which generates independent pathological and structural features. We leverage a diffusion model guided by pathological findings to model pathological features, enabling the generation of diverse counterfactual images.
arXiv Detail & Related papers (2025-04-20T01:54:18Z)
Foundation Models in Computational Pathology: A Review of Challenges, Opportunities, and Impact [0.34826922265324145]
Generative AI "co-pilots" now demonstrate the ability to mine subtle, sub-visual tissue cues across the cellular-to-pathology spectrum. The scale of data has surged dramatically, growing from tens to millions of multi-gigapixel tissue images. We explore the true potential of these innovations and their integration into clinical practice.
arXiv Detail & Related papers (2025-02-12T11:57:11Z)
Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage. Models may behave unreliably due to poorly explored failure modes. Causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z)
Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models [49.95603725998561]
We propose a new paradigm to build robust and interpretable medical image classifiers with natural language concepts. Specifically, we first query clinical concepts from GPT-4, then transform latent image features into explicit concepts with a vision-language model.
arXiv Detail & Related papers (2023-10-04T21:57:09Z)
Multimodal and multicontrast image fusion via deep generative models [3.431015735214097]
We propose a deep learning architecture based on generative models rooted in a modular approach and separable convolutional blocks to fuse multiple 3D neuroimaging modalities on a voxel-wise level. This may be of aid in predicting disease evolution as well as drug response, hence supporting mechanistic understanding and empowering clinical trials.
arXiv Detail & Related papers (2023-03-28T13:31:27Z)
Demystifying Deep Learning Models for Retinal OCT Disease Classification using Explainable AI [0.6117371161379209]
Deep learning techniques are widely and effectively adopted, and this holds equally for retinal Optical Coherence Tomography (OCT). However, their black-box nature prevents medical professionals from completely trusting the results they generate. This paper proposes a self-developed CNN model that is comparatively smaller and simpler, and uses LIME to bring Explainable AI to the study.
arXiv Detail & Related papers (2021-11-06T13:54:07Z)
Data-driven generation of plausible tissue geometries for realistic photoacoustic image synthesis [53.65837038435433]
Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties. We propose a novel approach to PAT data simulation, which we refer to as "learning to simulate". We leverage the concept of Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data to generate plausible tissue geometries.
arXiv Detail & Related papers (2021-03-29T11:30:18Z)
Learning Interpretable Microscopic Features of Tumor by Multi-task Adversarial CNNs To Improve Generalization [1.7371375427784381]
Existing CNN models act as black boxes, giving physicians no assurance that the model uses important diagnostic features. Here we show that our architecture, by learning end-to-end an uncertainty-based weighting combination of multi-task and adversarial losses, is encouraged to focus on pathology features. Our results on breast lymph node tissue show significantly improved generalization in the detection of tumorous tissue, with best average AUC 0.89 (0.01) against the baseline AUC 0.86 (0.005).
arXiv Detail & Related papers (2020-08-04T12:10:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.