Related papers: μ-Bench: A Vision-Language Benchmark for Microscopy Understanding

μ-Bench: A Vision-Language Benchmark for Microscopy Understanding

URL: http://arxiv.org/abs/2407.01791v1
Date: Mon, 1 Jul 2024 20:30:26 GMT
Title: μ-Bench: A Vision-Language Benchmark for Microscopy Understanding
Authors: Alejandro Lozano, Jeffrey Nirschl, James Burgess, Sanket Rajan Gupte, Yuhui Zhang, Alyssa Unell, Serena Yeung-Levy,
Abstract summary: Vision-language models (VLMs) offer a promising solution for large-scale biological image analysis. There is a lack of standardized, diverse, and large-scale vision-language benchmarks to evaluate VLMs. mu-Bench is an expert-curated benchmark encompassing 22 biomedical tasks.
Score: 43.27182445778988
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in microscopy have enabled the rapid generation of terabytes of image data in cell biology and biomedical research. Vision-language models (VLMs) offer a promising solution for large-scale biological image analysis, enhancing researchers' efficiency, identifying new image biomarkers, and accelerating hypothesis generation and scientific discovery. However, there is a lack of standardized, diverse, and large-scale vision-language benchmarks to evaluate VLMs' perception and cognition capabilities in biological image understanding. To address this gap, we introduce {\mu}-Bench, an expert-curated benchmark encompassing 22 biomedical tasks across various scientific disciplines (biology, pathology), microscopy modalities (electron, fluorescence, light), scales (subcellular, cellular, tissue), and organisms in both normal and abnormal states. We evaluate state-of-the-art biomedical, pathology, and general VLMs on {\mu}-Bench and find that: i) current models struggle on all categories, even for basic tasks such as distinguishing microscopy modalities; ii) current specialist models fine-tuned on biomedical data often perform worse than generalist models; iii) fine-tuning in specific microscopy domains can cause catastrophic forgetting, eroding prior biomedical knowledge encoded in their base model. iv) weight interpolation between fine-tuned and pre-trained models offers one solution to forgetting and improves general performance across biomedical tasks. We release {\mu}-Bench under a permissive license to accelerate the research and development of microscopy foundation models.

Related papers

UniBiomed: A Universal Foundation Model for Grounded Biomedical Image Interpretation [8.781512619275208]
We introduce UniBiomed, the first universal foundation model for grounded biomedical image interpretation. UniBiomed is based on a novel integration of Multi-modal Large Language Model (MLLM) and Segment Anything Model (SAM) To develop UniBiomed, we curate a large-scale dataset comprising over 27 million of images, annotations, and text descriptions across ten imaging modalities.
arXiv Detail & Related papers (2025-04-30T05:51:48Z)
CytoFM: The first cytology foundation model [3.591868126270513]
We introduce CytoFM, the first self-supervised foundation model for digital Cytology. We pretrain CytoFM on a diverse collection of datasets to learn robust, transferable representations. Our results demonstrate that CytoFM performs better on two out of three downstream tasks than existing foundation models pretrained on histopathology.
arXiv Detail & Related papers (2025-04-18T01:37:50Z)
Biomedical Foundation Model: A Survey [84.26268124754792]
Foundation models are large-scale pre-trained models that learn from extensive unlabeled datasets. These models can be adapted to various applications such as question answering and visual understanding. This survey explores the potential of foundation models across diverse domains within biomedical fields.
arXiv Detail & Related papers (2025-03-03T22:42:00Z)
Large Language Models for Bioinformatics [58.892165394487414]
This survey focuses on the evolution, classification, and distinguishing features of bioinformatics-specific language models (BioLMs) We explore the wide-ranging applications of BioLMs in critical areas such as disease diagnosis, drug discovery, and vaccine development. We identify key challenges and limitations inherent in BioLMs, including data privacy and security concerns, interpretability issues, biases in training data and model outputs, and domain adaptation complexities.
arXiv Detail & Related papers (2025-01-10T01:43:05Z)
Biology Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models [51.316001071698224]
We introduce Biology-Instructions, the first large-scale multi-omics biological sequences-related instruction-tuning dataset. This dataset can bridge the gap between large language models (LLMs) and complex biological sequences-related tasks. We also develop a strong baseline called ChatMultiOmics with a novel three-stage training pipeline.
arXiv Detail & Related papers (2024-12-26T12:12:23Z)
ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy [3.432992120614117]
We present the largest foundation model for cell microscopy data to date. Compared to a previous published ViT-L/8 MAE, our new model achieves a 60% improvement in linear separability of genetic perturbations.
arXiv Detail & Related papers (2024-11-04T20:09:51Z)
Multimodal Large Language Models for Bioimage Analysis [39.120941702559726]
Multimodal Large Language Models (MLLMs) exhibit strong emergent capacities, such as understanding, analyzing, reasoning, and generalization. With these capabilities, MLLMs hold promise to extract intricate information from biological images and data obtained through various modalities. Development of MLLMs shows increasing promise in serving as intelligent assistants or agents for augmenting human researchers in biology research.
arXiv Detail & Related papers (2024-07-29T08:21:25Z)
Weakly Supervised Set-Consistency Learning Improves Morphological Profiling of Single-Cell Images [0.6491172192043603]
We propose a set-level consistency learning algorithm, Set-DINO, to improve learned representations of perturbation effects in single-cell images. We conduct experiments on a large-scale Optical Pooled Screening dataset with more than 5000 genetic perturbations.
arXiv Detail & Related papers (2024-06-08T00:53:30Z)
Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology [2.7280901660033643]
This work explores the scaling properties of weakly supervised classifiers and self-supervised masked autoencoders (MAEs) Our results show that ViT-based MAEs outperform weakly supervised classifiers on a variety of tasks, achieving as much as a 11.5% relative improvement when recalling known biological relationships curated from public databases. We develop a new channel-agnostic MAE architecture (CA-MAE) that allows for inputting images of different numbers and orders of channels at inference time.
arXiv Detail & Related papers (2024-04-16T02:42:06Z)
An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks. These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems. Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab [67.24684071577211]
The challenge of replicating research results has posed a significant impediment to the field of molecular biology. We first curate a comprehensive multimodal dataset, named ProBio, as an initial step towards this objective. Next, we devise two challenging benchmarks, transparent solution tracking and multimodal action recognition, to emphasize the unique characteristics and difficulties associated with activity understanding in BioLab settings.
arXiv Detail & Related papers (2023-11-01T14:44:01Z)
Causal machine learning for single-cell genomics [94.28105176231739]
We discuss the application of machine learning techniques to single-cell genomics and their challenges. We first present the model that underlies most of current causal approaches to single-cell biology. We then identify open problems in the application of causal approaches to single-cell data.
arXiv Detail & Related papers (2023-10-23T13:35:24Z)
BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address limitations due to its versatility in interpreting different data types. Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z)
BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs [48.376109878173956]
We present PMC-15M, a novel dataset that is two orders of magnitude larger than existing biomedical multimodal datasets. PMC-15M contains 15 million biomedical image-text pairs collected from 4.4 million scientific articles. Based on PMC-15M, we have pretrained BiomedCLIP, a multimodal foundation model, with domain-specific adaptations tailored to biomedical vision-language processing.
arXiv Detail & Related papers (2023-03-02T02:20:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.