Deep learning for fast segmentation and critical dimension metrology & characterization enabling AR/VR design and fabrication
- URL: http://arxiv.org/abs/2409.13951v1
- Date: Fri, 20 Sep 2024 23:54:58 GMT
- Title: Deep learning for fast segmentation and critical dimension metrology & characterization enabling AR/VR design and fabrication
- Authors: Kundan Chaudhary, Subhei Shaar, Raja Muthinti,
- Abstract summary: We report on the fine-tuning of a pre-trained Segment Anything Model (SAM) using a diverse dataset of electron microscopy images.
We employ methods such as low-rank adaptation (LoRA) to reduce training time and enhance the accuracy of ROI extraction.
The model's ability to generalize to unseen images facilitates zero-shot learning and supports a CD extraction model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quantitative analysis of microscopy images is essential in the design and fabrication of components used in augmented reality/virtual reality (AR/VR) modules. However, segmenting regions of interest (ROIs) from these complex images and extracting critical dimensions (CDs) requires novel techniques, such as deep learning models, which are key to actionable decisions on process, material, and device optimization. In this study, we report on the fine-tuning of a pre-trained Segment Anything Model (SAM) using a diverse dataset of electron microscopy images. We employed methods such as low-rank adaptation (LoRA) to reduce training time and enhance the accuracy of ROI extraction. The model's ability to generalize to unseen images facilitates zero-shot learning and supports a CD extraction model that precisely extracts CDs from the segmented ROIs. We demonstrate the accurate extraction of binary images from cross-sectional images of surface relief gratings (SRGs) and Fresnel lenses in both single- and multi-class modes. Furthermore, these binary images are used to identify transition points, aiding in the extraction of relevant CDs. The combined use of the fine-tuned segmentation model and the CD extraction model offers substantial advantages to various industrial applications by enhancing analytical capabilities, shortening the time to data and insights, and optimizing manufacturing processes.
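The abstract describes two pieces: parameter-efficient (LoRA) fine-tuning of a pre-trained SAM on electron microscopy images, and a CD extraction step that reads transition points out of the resulting binary masks. The following Python sketch illustrates the first piece under assumptions not stated in the paper: it uses the Hugging Face `facebook/sam-vit-base` checkpoint, the `peft` library for LoRA, and adapters on the `qkv` projections of SAM's vision encoder; the dummy image, box prompt, target mask, and hyperparameters are placeholders, not the paper's configuration.

```python
# Hedged sketch: LoRA fine-tuning of a pre-trained SAM for ROI segmentation.
# Checkpoint, target modules, prompts, and hyperparameters are illustrative
# assumptions, not the paper's actual setup.
import numpy as np
import torch
from transformers import SamModel, SamProcessor
from peft import LoraConfig, get_peft_model

processor = SamProcessor.from_pretrained("facebook/sam-vit-base")
model = SamModel.from_pretrained("facebook/sam-vit-base")

# Inject low-rank adapters into the attention projections of the ViT image encoder;
# all other pre-trained weights stay frozen, which is what keeps training fast.
lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.1, target_modules=["qkv"])
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)
loss_fn = torch.nn.BCEWithLogitsLoss()

# One illustrative optimisation step on dummy data; in practice this would loop
# over a DataLoader of EM images with ground-truth ROI masks and box prompts.
em_image = np.random.randint(0, 255, (600, 800, 3), dtype=np.uint8)
roi_box = [[120.0, 80.0, 520.0, 340.0]]        # [x0, y0, x1, y1] box prompt
gt_mask = torch.zeros(256, 256)                # low-resolution target ROI mask
gt_mask[60:170, 40:210] = 1.0

inputs = processor(images=em_image, input_boxes=[roi_box], return_tensors="pt")
outputs = model(**inputs, multimask_output=False)
loss = loss_fn(outputs.pred_masks.squeeze(), gt_mask)
loss.backward()
optimizer.step()
```

The second piece, CD extraction from a segmented binary image, can be illustrated with plain NumPy: scan one row of the mask, locate the 0-to-1 and 1-to-0 transition points, and convert pixel distances into widths and pitch using an assumed pixel-size calibration. This is a minimal sketch of the general idea, not the paper's extraction model.

```python
# Hedged sketch: critical dimensions from transition points along one row of a
# binary mask. The nm-per-pixel calibration and single-row sampling are
# illustrative simplifications.
import numpy as np

def critical_dimensions(binary_mask: np.ndarray, row: int, nm_per_pixel: float):
    """Return (feature widths, pitch) in nm measured along one row of a 0/1 mask."""
    line = binary_mask[row].astype(np.int8)
    edges = np.flatnonzero(np.diff(line))      # columns where the mask value changes
    rising = edges[line[edges + 1] == 1]       # background -> feature transitions
    falling = edges[line[edges + 1] == 0]      # feature -> background transitions
    if rising.size == 0 or falling.size == 0:
        return np.array([]), np.array([])
    falling = falling[falling > rising[0]]     # drop a feature clipped at the left edge
    n = min(rising.size, falling.size)
    widths = (falling[:n] - rising[:n]) * nm_per_pixel
    pitch = np.diff(rising) * nm_per_pixel     # spacing between successive feature starts
    return widths, pitch

# Toy grating: two 4-pixel-wide lines, 9 pixels apart, at an assumed 2 nm per pixel.
mask = np.zeros((10, 20), dtype=np.uint8)
mask[:, 3:7] = 1
mask[:, 12:16] = 1
widths, pitch = critical_dimensions(mask, row=5, nm_per_pixel=2.0)
print(widths, pitch)  # [8. 8.] [18.]
```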
Related papers
- Studying Image Diffusion Features for Zero-Shot Video Object Segmentation [9.79891280451409]
This paper investigates the use of large-scale diffusion models for Zero-Shot Video Object Segmentation (ZS-VOS).
We find that diffusion models trained on ImageNet outperform those trained on larger, more diverse datasets for ZS-VOS.
Our approach performs on par with models trained on expensive image segmentation datasets.
arXiv Detail & Related papers (2025-04-07T19:58:25Z) - Adapting Segment Anything Model (SAM) to Experimental Datasets via Fine-Tuning on GAN-based Simulation: A Case Study in Additive Manufacturing [1.8547557605937304]
Segment Anything Model (SAM) is designed for general-purpose image segmentation.
In this work, we explore the application and limitations of SAM for industrial X-ray CT inspection of additive manufacturing components.
We propose a fine-tuning strategy utilizing parameter-efficient techniques, specifically Conv-LoRA, to adapt SAM to material-specific datasets.
arXiv Detail & Related papers (2024-12-16T02:11:19Z) - A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
Deep neural networks have shown great potential for reconstructing high-fidelity images from undersampled measurements.
Our model is based on neural operators, a discretization-agnostic architecture.
Our inference speed is also 1,400x faster than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z) - Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration.
We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
arXiv Detail & Related papers (2024-05-23T09:13:36Z) - ATOMMIC: An Advanced Toolbox for Multitask Medical Imaging Consistency to facilitate Artificial Intelligence applications from acquisition to analysis in Magnetic Resonance Imaging [0.10434396204054465]
ATOMMIC is an open-source toolbox that streamlines AI applications for accelerated MRI reconstruction and analysis.
ATOMMIC implements several tasks using DL networks and enables MultiTask Learning (MTL) to perform related tasks in an integrated fashion, targeting generalization in the MRI domain.
arXiv Detail & Related papers (2024-04-30T16:00:21Z) - OCR is All you need: Importing Multi-Modality into Image-based Defect Detection System [7.1083241462091165]
We introduce an external modality-guided data mining framework, primarily rooted in optical character recognition (OCR), to extract statistical features from images.
A key aspect of our approach is the alignment of external modality features, extracted using a single modality-aware model, with image features encoded by a convolutional neural network.
Our methodology considerably boosts the recall rate of the defect detection model and maintains high robustness even in challenging scenarios.
arXiv Detail & Related papers (2024-03-18T07:41:39Z) - MatSAM: Efficient Extraction of Microstructures of Materials via Visual Large Model [11.130574172301365]
Segment Anything Model (SAM) is a large visual model with powerful deep feature representation and zero-shot generalization capabilities.
In this paper, we propose MatSAM, a general and efficient microstructure extraction solution based on SAM.
A simple yet effective point-based prompt generation strategy is designed, grounded on the distribution and shape of microstructures.
arXiv Detail & Related papers (2024-01-11T03:18:18Z) - Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z) - FAST-AID Brain: Fast and Accurate Segmentation Tool using Artificial Intelligence Developed for Brain [0.8376091455761259]
A novel deep learning method is proposed for fast and accurate segmentation of the human brain into 132 regions.
The proposed model uses an efficient U-Net-like network and benefits from the intersection points of different views and hierarchical relations.
The proposed method can be applied to brain MRI data that include the skull or other artifacts, without preprocessing the images and without a drop in performance.
arXiv Detail & Related papers (2022-08-30T16:06:07Z) - Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model [97.9548609175831]
We resort to plain vision transformers with about 100 million parameters and make the first attempt to propose large vision models customized for remote sensing tasks.
Specifically, to handle the large image size and objects of various orientations in RS images, we propose a new rotated varied-size window attention.
Experiments on detection tasks demonstrate the superiority of our model over all state-of-the-art models, achieving 81.16% mAP on the DOTA-V1.0 dataset.
arXiv Detail & Related papers (2022-08-08T09:08:40Z) - Contrastive Multiview Coding with Electro-optics for SAR Semantic Segmentation [0.6445605125467573]
We propose multi-modal representation learning for SAR semantic segmentation.
Unlike previous studies, our method jointly uses EO imagery, SAR imagery, and a label mask.
Several experiments show that our approach is superior to the existing methods in model performance, sample efficiency, and convergence speed.
arXiv Detail & Related papers (2021-08-31T23:55:41Z) - MOGAN: Morphologic-structure-aware Generative Learning from a Single Image [59.59698650663925]
Recently proposed generative models can complete training from only a single image.
We introduce a MOrphologic-structure-aware Generative Adversarial Network named MOGAN that produces random samples with diverse appearances.
Our approach focuses on internal features including the maintenance of rational structures and variation on appearance.
arXiv Detail & Related papers (2021-03-04T12:45:23Z) - Shared Space Transfer Learning for analyzing multi-site fMRI data [83.41324371491774]
Multi-voxel pattern analysis (MVPA) learns predictive models from task-based functional magnetic resonance imaging (fMRI) data.
MVPA works best with a well-designed feature set and an adequate sample size.
Most fMRI datasets are noisy, high-dimensional, expensive to collect, and with small sample sizes.
This paper proposes the Shared Space Transfer Learning (SSTL) as a novel transfer learning approach.
arXiv Detail & Related papers (2020-10-24T08:50:26Z) - Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.