A Large-Language-Model Assisted Automated Scale Bar Detection and Extraction Framework for Scanning Electron Microscopic Images
- URL: http://arxiv.org/abs/2510.11260v1
- Date: Mon, 13 Oct 2025 10:50:54 GMT
- Title: A Large-Language-Model Assisted Automated Scale Bar Detection and Extraction Framework for Scanning Electron Microscopic Images
- Authors: Yuxuan Chen, Ruotong Yang, Zhengyang Zhang, Mehreen Ahmed, Yanming Wang,
- Abstract summary: We propose a multi-modal and automated scale bar detection and extraction framework. It provides concurrent object detection, text detection and text recognition with a Large Language Model (LLM) agent. The proposed framework significantly enhances the efficiency and accuracy of scale bar detection and extraction in SEM images.
- Score: 11.084738769064701
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Microscopic characterization techniques, such as Scanning Electron Microscopy (SEM), are widely used in scientific research for visualizing and analyzing microstructures. Determining the scale bar is an important first step in accurate SEM analysis; however, it currently relies mainly on manual operations, which are both time-consuming and prone to error. To address this issue, we propose a multi-modal, automated scale bar detection and extraction framework that provides concurrent object detection, text detection, and text recognition with a Large Language Model (LLM) agent. The proposed framework operates in four phases: i) an Automatic Dataset Generation (Auto-DG) model that synthesizes a diverse dataset of SEM images, ensuring robust training and high generalizability of the model; ii) scale bar object detection; iii) information extraction using a hybrid Optical Character Recognition (OCR) system with DenseNet and Convolutional Recurrent Neural Network (CRNN) based algorithms; and iv) an LLM agent that analyzes the results and verifies their accuracy. The proposed model demonstrates strong performance in object detection and accurate localization, with a precision of 100%, a recall of 95.8%, and a mean Average Precision (mAP) of 99.2% at IoU=0.5 and 69.1% at IoU=0.5:0.95. The hybrid OCR system achieved 89% precision, 65% recall, and a 75% F1 score on the Auto-DG dataset, significantly outperforming several mainstream standalone engines and highlighting its reliability for scientific image analysis. The LLM is introduced as a reasoning engine as well as an intelligent assistant that suggests follow-up steps and verifies the results. This automated method, powered by an LLM agent, significantly enhances the efficiency and accuracy of scale bar detection and extraction in SEM images, providing a valuable tool for microscopic analysis and advancing the field of scientific imaging.
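The metrics and the extraction step described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function names, the `(x1, y1, x2, y2)` box format, the unit table, and the regular expression are all assumptions made for illustration. The sketch shows how IoU (the overlap measure behind the mAP thresholds) is computed, how an F1 score follows from precision and recall, and how a recognized scale bar label plus a detected bar width could be turned into a nanometres-per-pixel calibration.

```python
import re

# Conversion factors to nanometres. The supported unit set is an assumption.
UNIT_TO_NM = {"nm": 1.0, "um": 1e3, "µm": 1e3, "μm": 1e3, "mm": 1e6}

# Hypothetical label pattern: a number followed by a length unit, e.g. "500 nm".
LABEL_RE = re.compile(r"\s*(\d+(?:\.\d+)?)\s*(nm|um|µm|μm|mm)\s*")


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def nm_per_pixel(label, bar_width_px):
    """Combine an OCR label such as '500 nm' with the bar width in pixels."""
    match = LABEL_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognized scale bar label: {label!r}")
    if bar_width_px <= 0:
        raise ValueError("bar width must be positive")
    value, unit = float(match.group(1)), match.group(2)
    return value * UNIT_TO_NM[unit] / bar_width_px
```

With these definitions, `f1_score(0.89, 0.65)` evaluates to roughly 0.751, which agrees with the reported 75% F1 score after rounding. A detection counts as correct at IoU=0.5 when `iou(predicted_box, ground_truth_box)` is at least 0.5.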
Related papers
- An Agentic Framework for Autonomous Materials Computation [70.24472585135929]
Large Language Models (LLMs) have emerged as powerful tools for accelerating scientific discovery. Recent advances integrate LLMs into agentic frameworks, enabling retrieval, reasoning, and tool use for complex scientific experiments. Here, we present a domain-specialized agent designed for reliable automation of first-principles materials computations.
arXiv Detail & Related papers (2025-12-22T15:03:57Z)
- Automated Morphological Analysis of Neurons in Fluorescence Microscopy Using YOLOv8 [0.0]
This work presents a pipeline for neuron instance segmentation and measurement based on a high-resolution dataset of stem-cell-derived neurons. The proposed method uses YOLOv8, trained on manually annotated microscopy images. The model achieved high segmentation accuracy, exceeding 97%. The overall accuracy of the extracted morphological measurements reached 75.32%, further supporting the effectiveness of the proposed approach.
arXiv Detail & Related papers (2025-10-22T10:35:08Z)
- Crucial-Diff: A Unified Diffusion Model for Crucial Image and Annotation Synthesis in Data-scarce Scenarios [65.97836905826145]
Scarcity of data in various scenarios, such as medical, industry and autonomous driving, leads to model overfitting and dataset imbalance. We propose Crucial-Diff, a domain-agnostic framework designed to synthesize crucial samples. Our framework generates diverse, high-quality training data, achieving a pixel-level AP of 83.63% and an F1-MAX of 78.12% on MVTec.
arXiv Detail & Related papers (2025-07-14T04:41:38Z)
- AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use [43.82226218242389]
We introduce AutoMat, an end-to-end, agent-assisted pipeline that automatically transforms scanning transmission electron microscopy images into atomic crystal structures. AutoMat combines pattern-adaptive denoising, physics-guided template retrieval, symmetry-aware atomic reconstruction, fast relaxation and property prediction via MatterSim. In large-scale experiments over 450 structure samples, AutoMat substantially outperforms existing multimodal large language models and tools.
arXiv Detail & Related papers (2025-05-19T03:04:50Z)
- Uni-AIMS: AI-Powered Microscopy Image Analysis [28.24402780080126]
We develop a data engine that generates high-quality annotated datasets. We propose a segmentation model capable of robustly detecting both small and large objects. Our solution supports the precise automatic recognition of image scale bars.
arXiv Detail & Related papers (2025-05-11T09:35:53Z)
- Zero-shot Autonomous Microscopy for Scalable and Intelligent Characterization of 2D Materials [41.856704526703595]
Characterization of atomic-scale materials traditionally requires human experts with months to years of specialized training. This bottleneck drives demand for fully autonomous experimentation systems capable of comprehending research objectives without requiring large training datasets. We present ATOMIC, an end-to-end framework that integrates foundation models to enable fully autonomous, zero-shot characterization of 2D materials.
arXiv Detail & Related papers (2025-04-14T14:49:45Z)
- Automated Segmentation and Analysis of Microscopy Images of Laser Powder Bed Fusion Melt Tracks [0.0]
We present an image segmentation neural network that automatically identifies and measures melt track dimensions from a cross-section image.
We use a U-Net architecture to train on a data set of 62 pre-labelled images obtained from different labs, machines, and materials coupled with image augmentation.
arXiv Detail & Related papers (2024-09-26T22:44:00Z)
- Accelerating Domain-Aware Electron Microscopy Analysis Using Deep Learning Models with Synthetic Data and Image-Wide Confidence Scoring [0.0]
We create a physics-based synthetic image and data generator, resulting in a machine learning model that achieves comparable precision (0.86), recall (0.63), F1 scores (0.71), and engineering property predictions (R2=0.82).
Our study demonstrates that synthetic data can eliminate human reliance in ML and provides a means for domain awareness in cases where many feature detections per image are needed.
arXiv Detail & Related papers (2024-08-02T20:15:15Z)
- Automated detection of motion artifacts in brain MR images using deep learning and explainable artificial intelligence [0.0]
This study demonstrates a deep learning model to detect rigid motion in T1-weighted brain images.
The model achieved average precision and recall metrics of 85% and 80% on six motion-simulated retrospective datasets.
This model is part of the ArtifactID tool, aimed at inline automatic detection of Gibbs ringing, wrap-around, and motion artifacts.
arXiv Detail & Related papers (2024-02-13T19:36:23Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning [53.00683059396803]
Mask image model (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for optimal image masking ratio and masking strategy.
Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
arXiv Detail & Related papers (2023-10-06T10:40:46Z)
- Optimizations of Autoencoders for Analysis and Classification of Microscopic In Situ Hybridization Images [68.8204255655161]
We propose a deep-learning framework to detect and classify areas of microscopic images with similar levels of gene expression.
The data we analyze requires an unsupervised learning model for which we employ a type of Artificial Neural Network - Deep Learning Autoencoders.
arXiv Detail & Related papers (2023-04-19T13:45:28Z)
- A parameter refinement method for Ptychography based on Deep Learning concepts [55.41644538483948]
Coarse parametrisation in propagation distance, position errors and partial coherence frequently menaces the experiment viability.
A modern Deep Learning framework is used to correct autonomously the setup incoherences, thus improving the quality of a ptychography reconstruction.
We tested our system on both synthetic datasets and also on real data acquired at the TwinMic beamline of the Elettra synchrotron facility.
arXiv Detail & Related papers (2021-05-18T10:15:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.