Linking the Dynamic PicoProbe Analytical Electron-Optical Beam Line /
Microscope to Supercomputers
- URL: http://arxiv.org/abs/2308.13701v1
- Date: Fri, 25 Aug 2023 23:07:58 GMT
- Title: Linking the Dynamic PicoProbe Analytical Electron-Optical Beam Line /
Microscope to Supercomputers
- Authors: Alexander Brace, Rafael Vescovi, Ryan Chard, Nickolaus D. Saint,
Arvind Ramanathan, Nestor J. Zaluzec, Ian Foster
- Abstract summary: The Dynamic PicoProbe at Argonne National Laboratory is undergoing upgrades that will enable it to produce up to 100s of GB of data per day.
While this data is highly important for both fundamental science and industrial applications, there is currently limited on-site infrastructure to handle these high-volume data streams.
We address this problem by providing a software architecture capable of supporting large-scale data transfers to the neighboring supercomputers at the Argonne Leadership Computing Facility.
This infrastructure supports expected workloads and also provides domain scientists the ability to reinterrogate data from past experiments to yield additional scientific value and derive new insights.
- Score: 39.52789559084336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Dynamic PicoProbe at Argonne National Laboratory is undergoing upgrades
that will enable it to produce up to 100s of GB of data per day. While this
data is highly important for both fundamental science and industrial
applications, there is currently limited on-site infrastructure to handle these
high-volume data streams. We address this problem by providing a software
architecture capable of supporting large-scale data transfers to the
neighboring supercomputers at the Argonne Leadership Computing Facility. To
prepare for future scientific workflows, we implement two instructive use cases
for hyperspectral and spatiotemporal datasets, which include: (i) off-site data
transfer, (ii) machine learning/artificial intelligence and traditional data
analysis approaches, and (iii) automatic metadata extraction and cataloging of
experimental results. This infrastructure supports expected workloads and also
provides domain scientists the ability to reinterrogate data from past
experiments to yield additional scientific value and derive new insights.
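The three stages named in the abstract (transfer, analysis, metadata cataloging) can be pictured with a minimal sketch. This is not the authors' implementation: every function name here (`transfer`, `analyze`, `catalog`, `run_flow`) is hypothetical, a local file copy stands in for the off-site transfer, and byte statistics stand in for the ML/AI analysis.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def sha256(path: Path) -> str:
    """Hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def transfer(src: Path, dest_dir: Path) -> Path:
    """Assumed stand-in for an off-site transfer: copy, then verify the checksum."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    shutil.copy2(src, dest)
    if sha256(src) != sha256(dest):
        raise IOError(f"checksum mismatch for {src.name}")
    return dest


def analyze(path: Path) -> dict:
    """Assumed stand-in for the analysis stage: simple statistics over raw bytes."""
    data = path.read_bytes()
    mean = sum(data) / len(data) if data else 0.0
    return {"n_bytes": len(data), "mean_value": mean}


def catalog(path: Path, results: dict, catalog_file: Path) -> dict:
    """Extract basic metadata and append one JSON record per dataset."""
    record = {"name": path.name, "sha256": sha256(path), "results": results}
    with catalog_file.open("a") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def run_flow(src: Path, remote_dir: Path, catalog_file: Path) -> dict:
    """Chain the three stages: transfer, analyze, catalog."""
    staged = transfer(src, remote_dir)
    return catalog(staged, analyze(staged), catalog_file)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        sample = root / "scan.bin"  # hypothetical instrument output
        sample.write_bytes(bytes(range(16)))
        print(run_flow(sample, root / "remote", root / "catalog.jsonl"))
```

Because the catalog is an append-only JSON Lines file in this sketch, records from past runs remain available for later re-analysis, which mirrors the abstract's point about reinterrogating data from past experiments.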
Related papers
- Enabling High Data Throughput Reinforcement Learning on GPUs: A Domain Agnostic Framework for Data-Driven Scientific Research [90.91438597133211]
  We introduce WarpSci, a framework designed to overcome crucial system bottlenecks in the application of reinforcement learning.
  We eliminate the need for data transfer between the CPU and GPU, enabling the concurrent execution of thousands of simulations.
  (2024-08-01T21:38:09Z)
- Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future [130.87142103774752]
  This review systematically assesses over seventy open-source autonomous driving datasets.
  It offers insights into various aspects, such as the principles underlying the creation of high-quality datasets.
  It also delves into the scientific and technical challenges that warrant resolution.
  (2023-12-06T10:46:53Z)
- An IoT Cloud and Big Data Architecture for the Maintenance of Home Appliances [0.0722732388409495]
  This work introduces a distributed and scalable platform architecture that can be deployed for efficient big data collection and analytics.
  The proposed system was tested with a case study for Predictive Maintenance of Home Appliances.
  The experimental results demonstrated that the presented system could be advantageous for tackling real-world IoT scenarios in a cost-effective and local approach.
  (2022-10-25T13:25:00Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
  This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
  We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
  (2022-08-16T20:46:08Z)
- The MIT Supercloud Workload Classification Challenge [10.458111248130944]
  In this paper, we present a workload classification challenge based on the MIT Supercloud dataset.
  The goal of this challenge is to foster algorithmic innovations in the analysis of compute workloads.
  (2022-04-12T14:28:04Z)
- Deep Reinforcement Learning Assisted Federated Learning Algorithm for Data Management of IIoT [82.33080550378068]
  The continuous expanded scale of the industrial Internet of Things (IIoT) leads to IIoT equipments generating massive amounts of user data every moment.
  How to manage these time series data in an efficient and safe way in the field of IIoT is still an open issue.
  This paper studies the FL technology applications to manage IIoT equipment data in wireless network environments.
  (2022-02-03T07:12:36Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
  In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
  This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
  (2021-12-22T14:45:37Z)
- Making Invisible Visible: Data-Driven Seismic Inversion with Physics-Informed Data Augmentation [6.079137591620588]
  We develop new physics-informed data augmentation techniques based on convolutional neural networks.
  Specifically, our generative models leverage different physics knowledge (such as governing equations, observable perception, and physics phenomena) to improve the quality of the synthetic data.
  We show that data-driven seismic imaging can be significantly enhanced by using our physics-informed data augmentation techniques.
  (2021-06-22T15:59:44Z)
- Bridge Data Center AI Systems with Edge Computing for Actionable Information Retrieval [0.5652468989804973]
  High data rates at modern synchrotron and X-ray free-electron lasers motivate the use of machine learning methods for data reduction, feature detection, and other purposes.
  We describe here how specialized data center AI systems can be used for this purpose.
  (2021-05-28T16:47:01Z)
- Towards an Interpretable Data-driven Trigger System for High-throughput Physics Facilities [7.939382824995354]
  We introduce a new data-driven approach for designing high-throughput data filtering and trigger systems.
  Our goal is to design a data-driven filtering system with a minimal run-time cost for determining which data event to keep.
  We introduce key insights from interpretable predictive modeling and cost-sensitive learning in order to account for non-local inefficiencies in the current paradigm.
  (2021-04-14T05:01:32Z)