Foundation Models for Zero-Shot Segmentation of Scientific Images without AI-Ready Data
- URL: http://arxiv.org/abs/2506.24039v1
- Date: Mon, 30 Jun 2025 16:45:23 GMT
- Title: Foundation Models for Zero-Shot Segmentation of Scientific Images without AI-Ready Data
- Authors: Shubhabrata Mukherjee, Jack Lang, Obeen Kwon, Iryna Zenyuk, Valerie Brogden, Adam Weber, Daniela Ushizima,
- Abstract summary: Zenesis is a comprehensive no-code interactive platform designed to minimize barriers posed by data readiness for scientific images.<n>We develop lightweight multi-modal adaptation techniques that enable zero-shot operation on raw scientific data.<n>Our results demonstrate that Zenesis is a powerful tool for scientific applications, particularly in fields where high-quality annotated datasets are unavailable.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot and prompt-based technologies capitalized on using frequently occurring images to transform visual reasoning tasks, which explains why such technologies struggle with valuable yet scarce scientific image sets. In this work, we propose Zenesis, a comprehensive no-code interactive platform designed to minimize barriers posed by data readiness for scientific images. We develop lightweight multi-modal adaptation techniques that enable zero-shot operation on raw scientific data, along with human-in-the-loop refinement and heuristic-based temporal enhancement options. We demonstrate the performance of our approach through comprehensive comparison and validation on challenging Focused Ion Beam Scanning Electron Microscopy (FIB-SEM) data of catalyst-loaded membranes. Zenesis significantly outperforms baseline methods, achieving an average accuracy of 0.947, an Intersection over Union (IOU) of 0.858, and a Dice score of 0.923 for amorphous catalyst samples and accuracy of 0.987, an IOU of 0.857, and a Dice score of 0.923 for crystalline samples. These results mark a substantial improvement over traditional methods like Otsu thresholding and even advanced models like Segment Anything Model (SAM) when used in isolation. Our results demonstrate that Zenesis is a powerful tool for scientific applications, particularly in fields where high-quality annotated datasets are unavailable, accelerating accurate analysis of experimental imaging.
Related papers
- Physics Informed Generative AI Enabling Labour Free Segmentation For Microscopy Analysis [3.3176565054468714]
This paper introduces a novel framework for labour-free segmentation that successfully bridges the simulation-to-reality gap.<n>We employ a Cycle-Consistent Generative Adversarial Network (CycleGAN) for unpaired image-to-image translation.<n>A U-Net model, trained exclusively on this synthetic data, demonstrated remarkable generalisation when deployed on unseen experimental images.
arXiv Detail & Related papers (2026-02-02T06:36:06Z) - XDen-1K: A Density Field Dataset of Real-World Objects [48.479432547763025]
We introduce XDen-1K, the first dataset designed for real-world physical property estimation.<n>The core of this dataset consists of 1,000 real-world objects across 148 categories.<n>We introduce a novel optimization framework that recovers a high-fidelity volumetric density field of each object from its sparse X-ray views.
arXiv Detail & Related papers (2025-12-11T14:15:42Z) - Zero-Shot Image Anomaly Detection Using Generative Foundation Models [2.241618130319058]
This research explores the use of score-based generative models as foundational tools for semantic anomaly detection.<n>By analyzing Stein score errors, we introduce a novel method for identifying anomalous samples without requiring re-training on each target dataset.<n>Our approach improves over state-of-the-art and relies on training a single model on one dataset -- CelebA -- which we find to be an effective base distribution.
arXiv Detail & Related papers (2025-07-30T13:56:36Z) - Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection [0.3011426942929757]
This study presents a novel image synthesis methodology tailored for construction worker detection.<n>The approach entails generating a collection of 12,000 synthetic images by formulating 3000 different prompts.<n> Evaluation on a real construction image dataset yielded promising results.
arXiv Detail & Related papers (2025-07-17T15:35:27Z) - Crucial-Diff: A Unified Diffusion Model for Crucial Image and Annotation Synthesis in Data-scarce Scenarios [65.97836905826145]
scarcity of data in various scenarios, such as medical, industry and autonomous driving, leads to model overfitting and dataset imbalance.<n>We propose Crucial-Diff, a domain-agnostic framework designed to synthesize crucial samples.<n>Our framework generates diverse, high-quality training data, achieving a pixel-level AP of 83.63% and an F1-MAX of 78.12% on MVTec.
arXiv Detail & Related papers (2025-07-14T04:41:38Z) - HistoART: Histopathology Artifact Detection and Reporting Tool [37.31105955164019]
Whole Slide Imaging (WSI) is widely used to digitize tissue specimens for detailed, high-resolution examination.<n>WSI remains vulnerable to artifacts introduced during slide preparation and scanning.<n>We propose and compare three robust artifact detection approaches for WSIs.
arXiv Detail & Related papers (2025-06-23T17:22:19Z) - Appeal prediction for AI up-scaled Images [45.61706071739717]
We describe our developed dataset, which uses 136 base images and five different up-scaling methods.<n>We evaluate the appeal of the different methods, and the results indicate that Real-ESRGAN and BSRGAN are the best.<n>In addition to this, we evaluate state-of-the-art image appeal and quality models, here none of the models showed a high prediction performance.
arXiv Detail & Related papers (2025-02-19T13:45:24Z) - Efficient Brain Tumor Classification with Lightweight CNN Architecture: A Novel Approach [0.0]
Brain tumor classification using MRI images is critical in medical diagnostics, where early and accurate detection significantly impacts patient outcomes.<n>Recent advancements in deep learning (DL) have shown promise, but many models struggle with balancing accuracy and computational efficiency.<n>We propose a novel model architecture integrating separable convolutions and squeeze and excitation (SE) blocks, designed to enhance feature extraction while maintaining computational efficiency.
arXiv Detail & Related papers (2025-02-01T21:06:42Z) - Merging synthetic and real embryo data for advanced AI predictions [69.07284335967019]
We train two generative models using two datasets-one we created and made publicly available, and one existing public dataset-to generate synthetic embryo images at various cell stages.<n>These were combined with real images to train classification models for embryo cell stage prediction.<n>Our results demonstrate that incorporating synthetic images alongside real data improved classification performance, with the model achieving 97% accuracy compared to 94.5% when trained solely on real data.
arXiv Detail & Related papers (2024-12-02T08:24:49Z) - SMILE-UHURA Challenge -- Small Vessel Segmentation at Mesoscopic Scale from Ultra-High Resolution 7T Magnetic Resonance Angiograms [60.35639972035727]
The lack of publicly available annotated datasets has impeded the development of robust, machine learning-driven segmentation algorithms.
The SMILE-UHURA challenge addresses the gap in publicly available annotated datasets by providing an annotated dataset of Time-of-Flight angiography acquired with 7T MRI.
Dice scores reached up to 0.838 $pm$ 0.066 and 0.716 $pm$ 0.125 on the respective datasets, with an average performance of up to 0.804 $pm$ 0.15.
arXiv Detail & Related papers (2024-11-14T17:06:00Z) - Dataset Distillation for Histopathology Image Classification [46.04496989951066]
We introduce a novel dataset distillation algorithm tailored for histopathology image datasets (Histo-DD)
We conduct a comprehensive evaluation of the effectiveness of the proposed algorithm and the generated histopathology samples in both patch-level and slide-level classification tasks.
arXiv Detail & Related papers (2024-08-19T05:53:38Z) - Distributional Drift Detection in Medical Imaging with Sketching and Fine-Tuned Transformer [2.7552551107566137]
This paper presents an accurate and sensitive approach to detect distributional drift in CT-scan medical images.<n>We developed a robust baseline library model for real-time anomaly detection, allowing for efficient comparison of incoming images.<n>We fine-tuned a pre-trained Vision Transformer model to extract relevant features, using mammography as a case study.
arXiv Detail & Related papers (2024-08-15T23:46:37Z) - Accelerating Domain-Aware Electron Microscopy Analysis Using Deep Learning Models with Synthetic Data and Image-Wide Confidence Scoring [0.0]
We create a physics-based synthetic image and data generator, resulting in a machine learning model that achieves comparable precision (0.86), recall (0.63), F1 scores (0.71), and engineering property predictions (R2=0.82)
Our study demonstrates that synthetic data can eliminate human reliance in ML and provides a means for domain awareness in cases where many feature detections per image are needed.
arXiv Detail & Related papers (2024-08-02T20:15:15Z) - On quantifying and improving realism of images generated with diffusion [50.37578424163951]
We propose a metric, called Image Realism Score (IRS), computed from five statistical measures of a given image.
IRS is easily usable as a measure to classify a given image as real or fake.
We experimentally establish the model- and data-agnostic nature of the proposed IRS by successfully detecting fake images generated by Stable Diffusion Model (SDM), Dalle2, Midjourney and BigGAN.
Our efforts have also led to Gen-100 dataset, which provides 1,000 samples for 100 classes generated by four high-quality models.
arXiv Detail & Related papers (2023-09-26T08:32:55Z) - AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context
Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the celluar graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z) - Enhanced Sharp-GAN For Histopathology Image Synthesis [63.845552349914186]
Histopathology image synthesis aims to address the data shortage issue in training deep learning approaches for accurate cancer detection.
We propose a novel approach that enhances the quality of synthetic images by using nuclei topology and contour regularization.
The proposed approach outperforms Sharp-GAN in all four image quality metrics on two datasets.
arXiv Detail & Related papers (2023-01-24T17:54:01Z) - Early Diagnosis of Retinal Blood Vessel Damage via Deep Learning-Powered
Collective Intelligence Models [0.3670422696827525]
The power of swarm algorithms is used to search for various combinations of convolutional, pooling, and normalization layers to provide the best model for the task.
The best TDCN model achieves an accuracy of 90.3%, AUC ROC of 0.956, and a Cohen score of 0.967.
arXiv Detail & Related papers (2022-10-17T21:38:38Z) - Robust deep learning for eye fundus images: Bridging real and synthetic data for enhancing generalization [0.8599177028761124]
This work compares ten different GAN architectures to generate synthetic eye-fundus images with and without AMD.
StyleGAN2 reached the lowest Frechet Inception Distance (166.17), and clinicians could not accurately differentiate between real and synthetic images.
The accuracy rates were 82.8% for the test set and 81.3% for the STARE dataset, demonstrating the model's generalizability.
arXiv Detail & Related papers (2022-03-25T18:42:20Z) - Vision Transformers for femur fracture classification [59.99241204074268]
The Vision Transformer (ViT) was able to correctly predict 83% of the test images.
Good results were obtained in sub-fractures with the largest and richest dataset ever.
arXiv Detail & Related papers (2021-08-07T10:12:42Z) - Classification of COVID-19 in CT Scans using Multi-Source Transfer
Learning [91.3755431537592]
We propose the use of Multi-Source Transfer Learning to improve upon traditional Transfer Learning for the classification of COVID-19 from CT scans.
With our multi-source fine-tuning approach, our models outperformed baseline models fine-tuned with ImageNet.
Our best performing model was able to achieve an accuracy of 0.893 and a Recall score of 0.897, outperforming its baseline Recall score by 9.3%.
arXiv Detail & Related papers (2020-09-22T11:53:06Z) - SCREENet: A Multi-view Deep Convolutional Neural Network for
Classification of High-resolution Synthetic Mammographic Screening Scans [3.8137985834223502]
We develop and evaluate a multi-view deep learning approach to the analysis of high-resolution synthetic mammograms.
We assess the effect on accuracy of image resolution and training set size.
arXiv Detail & Related papers (2020-09-18T00:12:33Z) - Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.