PathVG: A New Benchmark and Dataset for Pathology Visual Grounding
- URL: http://arxiv.org/abs/2502.20869v1
- Date: Fri, 28 Feb 2025 09:13:01 GMT
- Title: PathVG: A New Benchmark and Dataset for Pathology Visual Grounding
- Authors: Chunlin Zhong, Shuang Hao, Junhua Wu, Xiaona Chang, Jiwei Jiang, Xiu Nie, He Tang, Xiang Bai
- Abstract summary: We propose a new benchmark called Pathology Visual Grounding (PathVG), which aims to detect regions based on expressions with different attributes. In the experimental study, we found that the biggest challenge was the implicit information underlying the pathological expressions. The proposed method achieves state-of-the-art performance on the PathVG benchmark.
- Score: 45.21597220882424
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: With the rapid development of computational pathology, many AI-assisted diagnostic tasks have emerged. Cellular nuclei segmentation can segment various types of cells for downstream analysis, but it relies on predefined categories and lacks flexibility. Moreover, pathology visual question answering can perform image-level understanding but lacks region-level detection capability. To address this, we propose a new benchmark called Pathology Visual Grounding (PathVG), which aims to detect regions based on expressions with different attributes. To evaluate PathVG, we create a new dataset named RefPath which contains 27,610 images with 33,500 language-grounded boxes. Compared to visual grounding in other domains, PathVG presents pathological images at multi-scale and contains expressions with pathological knowledge. In the experimental study, we found that the biggest challenge was the implicit information underlying the pathological expressions. Based on this, we proposed Pathology Knowledge-enhanced Network (PKNet) as the baseline model for PathVG. PKNet leverages the knowledge-enhancement capabilities of Large Language Models (LLMs) to convert pathological terms with implicit information into explicit visual features, and fuses knowledge features with expression features through the designed Knowledge Fusion Module (KFM). The proposed method achieves state-of-the-art performance on the PathVG benchmark.
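The abstract does not include implementation details; the following is a minimal sketch of the kind of knowledge fusion described for PKNet, assuming cross-attention from expression tokens to LLM-derived knowledge tokens (module name, dimensions, and structure are illustrative, not the authors' code):

```python
import torch
import torch.nn as nn

class KnowledgeFusionSketch(nn.Module):
    """Hypothetical stand-in for a knowledge fusion module: referring-expression
    tokens cross-attend to LLM-expanded pathology knowledge tokens."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, expr_tokens, knowledge_tokens):
        # expr_tokens:      (B, N_expr, dim) features of the referring expression
        # knowledge_tokens: (B, N_know, dim) features of the LLM-expanded knowledge text
        attended, _ = self.cross_attn(expr_tokens, knowledge_tokens, knowledge_tokens)
        x = self.norm1(expr_tokens + attended)    # residual + norm
        return self.norm2(x + self.ffn(x))        # position-wise refinement

# Toy usage: fuse a 12-token expression with 20 knowledge tokens.
expr, know = torch.randn(2, 12, 256), torch.randn(2, 20, 256)
print(KnowledgeFusionSketch()(expr, know).shape)  # torch.Size([2, 12, 256])
```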
Related papers
- PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained feature extractors.
Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E stained histopathology images.
Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
arXiv Detail & Related papers (2025-04-09T14:58:21Z)
- From Pixels to Histopathology: A Graph-Based Framework for Interpretable Whole Slide Image Analysis [81.19923502845441]
We develop a graph-based framework that constructs WSI graph representations.
We build tissue representations (nodes) that follow biological boundaries rather than arbitrary patches.
In our method's final step, we solve the diagnostic task through a graph attention network.
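As a rough illustration of slide-level prediction with a graph attention network over tissue-node features (node features, layer sizes, and pooling are assumptions, not the paper's implementation):

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv, global_mean_pool

class WSIGraphClassifierSketch(nn.Module):
    """Illustrative slide-level classifier: graph attention over tissue-region nodes."""

    def __init__(self, in_dim: int = 512, hidden: int = 128, num_classes: int = 2):
        super().__init__()
        self.gat1 = GATConv(in_dim, hidden, heads=4)   # concatenated heads -> hidden * 4
        self.gat2 = GATConv(hidden * 4, hidden, heads=1)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, batch):
        # x: (num_nodes, in_dim) tissue-region features; edge_index: (2, num_edges) adjacency
        x = torch.relu(self.gat1(x, edge_index))
        x = torch.relu(self.gat2(x, edge_index))
        x = global_mean_pool(x, batch)                 # aggregate nodes into a slide embedding
        return self.head(x)

# Toy usage: 5 tissue nodes connected in a chain, one slide.
x = torch.randn(5, 512)
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3, 3, 4], [1, 0, 2, 1, 3, 2, 4, 3]])
batch = torch.zeros(5, dtype=torch.long)
print(WSIGraphClassifierSketch()(x, edge_index, batch).shape)  # torch.Size([1, 2])
```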
arXiv Detail & Related papers (2025-03-14T20:15:04Z)
- Leveraging Vision-Language Embeddings for Zero-Shot Learning in Histopathology Images [7.048241543461529]
We propose a novel framework called Multi-Resolution Prompt-guided Hybrid Embedding (MR-PHE) to address these challenges in zero-shot histopathology image classification.
We introduce a hybrid embedding strategy that integrates global image embeddings with weighted patch embeddings.
A similarity-based patch weighting mechanism assigns attention-like weights to patches based on their relevance to class embeddings.
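A minimal sketch of such similarity-based patch weighting combined with a global embedding, assuming CLIP-style image and text embeddings (the mixing weight and temperature are illustrative):

```python
import torch
import torch.nn.functional as F

def hybrid_embedding_sketch(global_emb, patch_embs, class_embs, alpha=0.5, temperature=0.07):
    """Rough illustration of a hybrid image embedding: patches are weighted by their
    best cosine similarity to any class text embedding, then blended with the global
    image embedding. `alpha` and `temperature` are assumed values."""
    patch_embs = F.normalize(patch_embs, dim=-1)     # (P, D)
    class_embs = F.normalize(class_embs, dim=-1)     # (C, D)
    sims = patch_embs @ class_embs.T                 # (P, C) cosine similarities
    relevance = sims.max(dim=1).values               # best class match per patch
    weights = torch.softmax(relevance / temperature, dim=0)   # attention-like patch weights
    weighted_patch = (weights.unsqueeze(1) * patch_embs).sum(dim=0)
    hybrid = alpha * F.normalize(global_emb, dim=-1) + (1 - alpha) * weighted_patch
    return F.normalize(hybrid, dim=-1)

# Toy usage: 16 patches, 512-d embeddings, 3 candidate classes.
print(hybrid_embedding_sketch(torch.randn(512), torch.randn(16, 512), torch.randn(3, 512)).shape)
```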
arXiv Detail & Related papers (2025-03-13T12:18:37Z)
- Mind the Gap: Evaluating Patch Embeddings from General-Purpose and Histopathology Foundation Models for Cell Segmentation and Classification [0.20971479389679332]
We implement an encoder-decoder architecture with a consistent decoder and various encoders.
We evaluate instance-level detection, segmentation accuracy, and cell-type classification.
This study provides insights into the comparative strengths and limitations of general-purpose vs. histopathology foundation models.
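An illustrative setup for probing frozen, interchangeable encoders with one shared decoder (the encoder interface and decoder layout are assumptions, not the study's code):

```python
import torch
import torch.nn as nn

class FixedDecoderProbeSketch(nn.Module):
    """Illustration of an encoder-probing setup: a frozen patch encoder feeds a
    shared, trainable decoder; swapping the encoder keeps the decoder fixed."""

    def __init__(self, encoder: nn.Module, enc_dim: int, num_classes: int = 6):
        super().__init__()
        self.encoder = encoder.eval()
        for p in self.encoder.parameters():
            p.requires_grad_(False)                 # encoder stays frozen
        self.decoder = nn.Sequential(               # the shared decoder
            nn.Conv2d(enc_dim, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, num_classes, 1),
        )

    def forward(self, images):
        with torch.no_grad():
            feats = self.encoder(images)            # (B, enc_dim, H', W') feature map
        return self.decoder(feats)

# Toy usage with a stand-in encoder; real runs would plug in pretrained foundation models.
toy_encoder = nn.Conv2d(3, 64, 3, padding=1)
model = FixedDecoderProbeSketch(toy_encoder, enc_dim=64)
print(model(torch.randn(1, 3, 224, 224)).shape)     # torch.Size([1, 6, 224, 224])
```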
arXiv Detail & Related papers (2025-02-04T16:47:00Z)
- Path-RAG: Knowledge-Guided Key Region Retrieval for Open-ended Pathology Visual Question Answering [38.86674352317965]
We propose a novel framework named Path-RAG to retrieve relevant domain knowledge from pathology images.
Our experiments suggest that domain guidance can significantly boost the accuracy of LLaVA-Med from 38% to 47%.
For longer-form question and answer pairs, our model consistently achieves significant improvements of 32.5% in ARCH-Open PubMed and 30.6% in ARCH-Open Books on H&E images.
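A generic sketch of the retrieval step in such a pipeline, assuming precomputed embeddings for the question and candidate knowledge snippets (this is not the authors' Path-RAG implementation):

```python
import torch
import torch.nn.functional as F

def retrieve_knowledge_sketch(query_emb, knowledge_embs, knowledge_texts, top_k=3):
    """Generic retrieval step: rank pathology knowledge snippets by cosine similarity
    to the question embedding and keep the top-k."""
    sims = F.normalize(knowledge_embs, dim=-1) @ F.normalize(query_emb, dim=-1)
    top = torch.topk(sims, k=min(top_k, len(knowledge_texts))).indices
    return [knowledge_texts[i] for i in top.tolist()]

# Toy usage: the retrieved snippets would then be prepended to the VQA prompt.
texts = ["nuclear pleomorphism", "gland formation", "mitotic count", "necrosis"]
snippets = retrieve_knowledge_sketch(torch.randn(256), torch.randn(4, 256), texts)
prompt = "Context: " + "; ".join(snippets) + "\nQuestion: What pattern is visible?"
print(prompt)
```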
arXiv Detail & Related papers (2024-11-26T03:22:01Z)
- Knowledge-enhanced Visual-Language Pretraining for Computational Pathology [68.6831438330526]
We consider the problem of visual representation learning for computational pathology, by exploiting large-scale image-text pairs gathered from public resources.
We curate a pathology knowledge tree that consists of 50,470 informative attributes for 4,718 diseases requiring pathology diagnosis from 32 human tissues.
arXiv Detail & Related papers (2024-04-15T17:11:25Z)
- Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning [64.1316997189396]
We present a novel language-tied self-supervised learning framework, Hierarchical Language-tied Self-Supervision (HLSS) for histopathology images.
Our resulting model achieves state-of-the-art performance on two medical imaging benchmarks, OpenSRH and TCGA datasets.
arXiv Detail & Related papers (2024-03-21T17:58:56Z)
- UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z)
- HistoSegCap: Capsules for Weakly-Supervised Semantic Segmentation of Histological Tissue Type in Whole Slide Images [19.975420988169454]
Digital pathology involves converting physical tissue slides into high-resolution Whole Slide Images (WSIs).
Large histology slides with numerous microscopic fields pose challenges for visual search.
Computer Aided Diagnosis (CAD) systems offer visual assistance in efficiently examining WSIs and identifying diagnostically relevant regions.
arXiv Detail & Related papers (2024-02-16T17:44:11Z)
- The Whole Pathological Slide Classification via Weakly Supervised Learning [7.313528558452559]
We introduce two pathological priors: nuclear disease of cells and spatial correlation of pathological tiles.
We propose a data augmentation method that utilizes stain separation during extractor training.
We then describe the spatial relationships between the tiles using an adjacency matrix.
By integrating these two views, we designed a multi-instance framework for analyzing H&E-stained tissue images.
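A small sketch of building such an adjacency matrix from tile grid coordinates (the neighborhood rule is an assumption for illustration):

```python
import numpy as np

def tile_adjacency_sketch(coords, max_step=1):
    """Illustrative construction of a tile adjacency matrix: two tiles are connected
    if their grid coordinates differ by at most `max_step` in both axes."""
    coords = np.asarray(coords)                                # (N, 2) grid positions of tiles
    diff = np.abs(coords[:, None, :] - coords[None, :, :])     # (N, N, 2) pairwise offsets
    adj = (diff.max(axis=-1) <= max_step).astype(np.float32)
    np.fill_diagonal(adj, 0.0)                                 # no self-loops
    return adj

# Toy usage: four tiles laid out in a 2x2 grid are all mutually adjacent.
print(tile_adjacency_sketch([(0, 0), (0, 1), (1, 0), (1, 1)]))
```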
arXiv Detail & Related papers (2023-07-12T16:14:23Z)
- PathAsst: A Generative Foundation AI Assistant Towards Artificial General Intelligence of Pathology [15.419350834457136]
We present PathAsst, a multimodal generative foundation AI assistant to revolutionize diagnostic and predictive analytics in pathology.
The development of PathAsst involves three pivotal steps: data acquisition, CLIP model adaptation, and the training of PathAsst's multimodal generative capabilities.
The experimental results of PathAsst show the potential of harnessing AI-powered generative foundation models to improve pathology diagnosis and treatment processes.
arXiv Detail & Related papers (2023-05-24T11:55:50Z)
- EBOCA: Evidences for BiOmedical Concepts Association Ontology [55.41644538483948]
This paper proposes EBOCA, an ontology that describes (i) biomedical domain concepts and associations between them, and (ii) evidences supporting these associations.
Test data from a subset of DISNET, together with automatic association extractions from texts, has been transformed into a Knowledge Graph that can be used in real scenarios.
arXiv Detail & Related papers (2022-08-01T18:47:03Z)