PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents
- URL: http://arxiv.org/abs/2410.09034v1
- Date: Fri, 11 Oct 2024 17:50:59 GMT
- Title: PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents
- Authors: Xiangyu Yin, Chuqiao Shi, Yimo Han, Yi Jiang
- Abstract summary: In practice, obtaining high-quality ptychographic images requires simultaneous optimization of numerous experimental and algorithmic parameters.
In this work, we develop a framework that leverages large language models (LLMs) to automate data analysis in ptychography.
Our study demonstrates that PEAR's multi-agent design significantly improves the workflow success rate, even with smaller open-weight models.
- Score: 6.6004056020499355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ptychography is an advanced computational imaging technique in X-ray and electron microscopy. It has been widely adopted across scientific research fields, including physics, chemistry, biology, and materials science, as well as in industrial applications such as semiconductor characterization. In practice, obtaining high-quality ptychographic images requires simultaneous optimization of numerous experimental and algorithmic parameters. Traditionally, parameter selection often relies on trial and error, leading to low-throughput workflows and potential human bias. In this work, we develop the "Ptychographic Experiment and Analysis Robot" (PEAR), a framework that leverages large language models (LLMs) to automate data analysis in ptychography. To ensure high robustness and accuracy, PEAR employs multiple LLM agents for tasks including knowledge retrieval, code generation, parameter recommendation, and image reasoning. Our study demonstrates that PEAR's multi-agent design significantly improves the workflow success rate, even with smaller open-weight models such as LLaMA 3.1 8B. PEAR also supports various automation levels and is designed to work with customized local knowledge bases, ensuring flexibility and adaptability across different research environments.
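The abstract names PEAR's agent roles (knowledge retrieval, code generation, parameter recommendation, image reasoning) but not their implementation. A minimal, hypothetical sketch of how such a multi-agent loop could be wired together is shown below; `call_llm`, `reconstruct`, and `run_workflow` are illustrative stand-ins, not PEAR's actual API.

```python
"""Hypothetical sketch of a PEAR-style multi-agent workflow; not the authors'
code. `call_llm` stands in for any chat-completion backend, e.g. a local
open-weight model such as LLaMA 3.1 8B."""


def call_llm(role: str, prompt: str) -> str:
    # Placeholder backend: wire this to a real LLM API for actual use.
    return f"[{role} reply to: {prompt[:40]}...]"


def reconstruct(script: str) -> str:
    # Placeholder for executing a generated ptychographic reconstruction script.
    return f"[image produced by {script[:40]}...]"


def run_workflow(scan_metadata: str, knowledge_base: str, max_rounds: int = 3) -> dict:
    # 1. Knowledge-retrieval agent pulls relevant notes from a local knowledge base.
    context = call_llm("retriever", f"Notes relevant to {scan_metadata}: {knowledge_base}")
    params = image = ""
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        # 2. Parameter-recommendation agent proposes reconstruction settings.
        params = call_llm("recommender", f"Suggest reconstruction parameters given {context}")
        # 3. Code-generation agent writes the reconstruction script.
        script = call_llm("coder", f"Write a reconstruction script using {params}")
        image = reconstruct(script)
        # 4. Image-reasoning agent accepts the result or triggers another round.
        verdict = call_llm("critic", f"Assess {image}; reply 'accept' or critique")
        if verdict.lower().startswith("accept"):
            break
    return {"params": params, "image": image, "rounds": rounds}


print(run_workflow("electron ptychography scan, 128x128 probe positions", "local notes"))
```

Splitting recommendation, coding, and quality assessment across separate agents gives each model a narrow, checkable task, which is consistent with the abstract's finding that the multi-agent design raises success rates even for smaller open-weight models.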
Related papers
- MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability [106.35604230971396]
Recent advancements in Agent techniques enable Large Language Models (LLMs) to autonomously utilize tools for retrieval, planning, and reasoning.
To further enhance the universal search capability of agents, we propose a novel pre-training framework, MaskSearch.
In the pre-training stage, we introduce the Retrieval Augmented Mask Prediction (RAMP) task, where the model learns to leverage search tools to fill masked spans.
After that, the model is trained on downstream tasks to achieve further improvement.
arXiv Detail & Related papers (2025-05-26T17:58:50Z)
- ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows [82.07367406991678]
Large Language Models (LLMs) have extended their impact beyond Natural Language Processing.
Among these, computer-using agents are capable of interacting with operating systems as humans do.
We introduce ScienceBoard, which encompasses a realistic, multi-domain environment featuring dynamic and visually rich scientific software.
arXiv Detail & Related papers (2025-05-26T12:27:27Z)
- A Systematic Literature Review of Parameter-Efficient Fine-Tuning for Large Code Models [2.171120568435925]
Large Language Models (LLMs) for code require significant computational resources for training and fine-tuning.
To address this, the research community has increasingly turned to Parameter-Efficient Fine-Tuning (PEFT).
PEFT enables the adaptation of large models by updating only a small subset of parameters, rather than the entire model.
Our study synthesizes findings from 27 peer-reviewed papers, identifying patterns in configuration strategies and adaptation trade-offs.
arXiv Detail & Related papers (2025-04-29T16:19:25Z)
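As a generic illustration of the PEFT principle summarized above (not code from any surveyed paper), the sketch below freezes a pretrained backbone and trains only a small residual adapter in plain PyTorch; the trainable fraction comes out around one to two percent.

```python
import torch
import torch.nn as nn

# Toy PEFT-style adaptation: freeze the "pretrained" backbone and train only a
# small bottleneck adapter. Real PEFT methods (LoRA, prefix tuning, ...) differ
# in detail but share this freeze-most, train-few principle.
backbone = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
for p in backbone.parameters():
    p.requires_grad = False  # the large model stays fixed

adapter = nn.Sequential(nn.Linear(512, 8), nn.ReLU(), nn.Linear(8, 512))


def forward(x: torch.Tensor) -> torch.Tensor:
    h = backbone(x)
    return h + adapter(h)  # residual adapter on top of frozen features


# Only the adapter's (few) parameters receive gradients and optimizer updates.
opt = torch.optim.AdamW(adapter.parameters(), lr=1e-3)
x, target = torch.randn(4, 512), torch.randn(4, 512)
loss = nn.functional.mse_loss(forward(x), target)
loss.backward()
opt.step()

trainable = sum(p.numel() for p in adapter.parameters())
total = trainable + sum(p.numel() for p in backbone.parameters())
print(f"trainable fraction: {trainable / total:.3%}")
```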
- AiSciVision: A Framework for Specializing Large Multimodal Models in Scientific Image Classification [2.4515373478215343]
We introduce AiSciVision, a framework that specializes Large Multimodal Models (LMMs) into interactive research partners.
Our framework uses two key components: Visual Retrieval-Augmented Generation (VisRAG) and domain-specific tools utilized in an agentic workflow.
We evaluate AiSciVision on three real-world scientific image classification datasets: detecting the presence of aquaculture ponds, eelgrass, and solar panels.
arXiv Detail & Related papers (2024-10-28T19:35:47Z)
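A minimal, hypothetical sketch of the visual retrieval-augmented step named above is given below; the encoder, library format, and prompt are illustrative stand-ins, not the AiSciVision implementation.

```python
import numpy as np

rng = np.random.default_rng(1)


def embed(image: np.ndarray) -> np.ndarray:
    # Stand-in for a visual encoder (a CLIP-style model in practice).
    return image.mean(axis=(0, 1))  # crude per-channel statistics as features


def retrieve_examples(query, library, k: int = 2):
    # Return the k labeled images whose embeddings are closest to the query's.
    q = embed(query)
    ranked = sorted(library, key=lambda item: float(np.linalg.norm(embed(item[0]) - q)))
    return ranked[:k]


# Toy labeled library and query; in practice these are expert-curated images.
library = [
    (rng.random((8, 8, 3)), "aquaculture pond present"),
    (rng.random((8, 8, 3)) * 0.2, "no pond"),
]
query = rng.random((8, 8, 3))
neighbors = retrieve_examples(query, library)

# Retrieved examples go into the LMM prompt alongside the query image, so the
# model classifies by comparison with labeled references rather than from scratch.
prompt = "Classify the query image. Similar labeled examples: " + "; ".join(
    label for _, label in neighbors
)
print(prompt)
```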
- A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
We propose a unified MRI reconstruction model robust to various measurement undersampling patterns and image resolutions.
Our model improves SSIM by 11% and PSNR by 4 dB over a state-of-the-art CNN (End-to-End VarNet), with 600× faster inference than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z)
- MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding [59.41495657570397]
This dataset includes figures such as schematic diagrams, simulated images, macroscopic/microscopic photos, and experimental visualizations.
We developed benchmarks for scientific figure captioning and multiple-choice questions, evaluating six proprietary and over ten open-source models.
The dataset and benchmarks will be released to support further research.
arXiv Detail & Related papers (2024-07-06T00:40:53Z)
- MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [102.41118020705876]
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing.
As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework.
This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z)
- Domain Generalization for Mammographic Image Analysis with Contrastive Learning [62.25104935889111]
Training an effective deep learning model requires large datasets with diverse styles and qualities.
A novel contrastive learning scheme is developed to equip deep learning models with better style generalization capability.
The proposed method has been evaluated extensively and rigorously with mammograms from various vendor style domains and several public datasets.
arXiv Detail & Related papers (2023-04-20T11:40:21Z)
- An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT [80.33783969507458]
The 'Impression' section of a radiology report is a critical basis for communication between radiologists and other physicians.
Recent studies have achieved promising results in automatic impression generation using large-scale medical text data.
However, these models often require substantial amounts of medical text data and generalize poorly.
arXiv Detail & Related papers (2023-04-17T17:13:42Z)
- AI-assisted Automated Workflow for Real-time X-ray Ptychography Data Analysis via Federated Resources [2.682578132719034]
We present an end-to-end automated workflow that uses large-scale remote compute resources and an embedded GPU platform at the edge to enable AI/ML-accelerated real-time analysis of data collected for x-ray ptychography.
arXiv Detail & Related papers (2023-04-09T19:11:04Z)
- SEMPAI: a Self-Enhancing Multi-Photon Artificial Intelligence for prior-informed assessment of muscle function and pathology [48.54269377408277]
We introduce the Self-Enhancing Multi-Photon Artificial Intelligence (SEMPAI), which integrates hypothesis-driven priors into a data-driven Deep Learning approach.
SEMPAI performs joint learning of several tasks to enable prediction for small datasets.
SEMPAI outperforms state-of-the-art biomarkers in six of seven predictive tasks, including those with scarce data.
arXiv Detail & Related papers (2022-10-28T17:03:04Z)
- Microscopy is All You Need [0.0]
We argue that a promising pathway for the development of machine learning methods is via the route of domain-specific deployable algorithms.
This will benefit both fundamental physical studies and serve as a test bed for more complex autonomous systems such as robotics and manufacturing.
arXiv Detail & Related papers (2022-10-12T18:41:40Z)
- A workflow for segmenting soil and plant X-ray CT images with deep learning in Google's Colaboratory [45.99558884106628]
We develop a modular workflow for applying convolutional neural networks to X-ray microCT images.
We show how parameters can be optimized to achieve best results using example scans from walnut leaves, almond flower buds, and a soil aggregate.
arXiv Detail & Related papers (2022-03-18T00:47:32Z)
- CheXstray: Real-time Multi-Modal Data Concordance for Drift Detection in Medical Imaging AI [1.359138408203412]
We build and test a medical imaging AI drift monitoring workflow that tracks data and model drift without contemporaneous ground truth.
Key contributions include a proof-of-concept for medical imaging drift detection using variational autoencoders (VAEs) and domain-specific statistical methods.
This work has important implications for addressing the translation gap related to continuous medical imaging AI model monitoring in dynamic healthcare environments.
arXiv Detail & Related papers (2022-02-06T18:58:35Z)
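A generic, hypothetical sketch of such ground-truth-free drift monitoring (not the CheXstray pipeline itself) tracks an autoencoder's reconstruction error on incoming batches against a trusted reference window:

```python
import numpy as np

rng = np.random.default_rng(0)


def fake_ae(x: np.ndarray) -> np.ndarray:
    # Stand-in "autoencoder": adds noise and clips to the training range [0, 1],
    # so out-of-range (drifted) inputs reconstruct poorly.
    return np.clip(x + rng.normal(0.0, 0.05, x.shape), 0.0, 1.0)


def reconstruction_error(batch: np.ndarray, autoencoder) -> np.ndarray:
    # Per-image mean squared reconstruction error: higher on inputs unlike
    # the data the autoencoder was trained on.
    recon = autoencoder(batch)
    return ((batch - recon) ** 2).mean(axis=(1, 2))


def drift_score(reference_errors: np.ndarray, new_errors: np.ndarray) -> float:
    # How many reference standard deviations the new mean error sits from the
    # reference mean (a crude two-sample statistic; no labels required).
    mu, sigma = reference_errors.mean(), reference_errors.std() + 1e-9
    return float((new_errors.mean() - mu) / sigma)


reference = reconstruction_error(rng.random((64, 32, 32)), fake_ae)
incoming = reconstruction_error(rng.random((16, 32, 32)) + 0.5, fake_ae)  # shifted data
if drift_score(reference, incoming) > 3.0:  # alert threshold is a deployment choice
    print("possible data drift: review inputs before trusting model outputs")
```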
- A parameter refinement method for Ptychography based on Deep Learning concepts [55.41644538483948]
Coarse parametrisation in propagation distance, position errors, and partial coherence frequently threatens experiment viability.
A modern Deep Learning framework is used to correct autonomously the setup incoherences, thus improving the quality of a ptychography reconstruction.
We tested our system on both synthetic datasets and also on real data acquired at the TwinMic beamline of the Elettra synchrotron facility.
arXiv Detail & Related papers (2021-05-18T10:15:17Z)
- Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration [130.89746032163106]
We propose ALOE, a new algorithm for learning conditional and unconditional EBMs for discrete structured data.
We show that the energy function and sampler can be trained efficiently via a new variational form of power iteration.
We present an energy-model-guided fuzzer for software testing that achieves performance comparable to well-engineered fuzzing engines such as libFuzzer.
arXiv Detail & Related papers (2020-11-10T19:31:29Z)