The Open MatSci ML Toolkit: A Flexible Framework for Machine Learning in
Materials Science
- URL: http://arxiv.org/abs/2210.17484v1
- Date: Mon, 31 Oct 2022 17:11:36 GMT
- Authors: Santiago Miret, Kin Long Kelvin Lee, Carmelo Gonzales, Marcel Nassar,
Matthew Spellings
- Abstract summary: The Open MatSci ML Toolkit is a flexible, self-contained, and scalable Python-based framework to apply deep learning models and methods on scientific data.
By publishing and sharing this toolkit with the research community via open-source release, we hope to lower the entry barrier for new machine learning researchers and practitioners who want to get started with the OpenCatalyst dataset.
- Score: 3.577720074630756
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present the Open MatSci ML Toolkit: a flexible, self-contained, and
scalable Python-based framework to apply deep learning models and methods on
scientific data with a specific focus on materials science and the OpenCatalyst
Dataset. Our toolkit provides: 1. A scalable machine learning workflow for
materials science leveraging PyTorch Lightning, which enables seamless scaling
across different computation capabilities (laptop, server, cluster) and
hardware platforms (CPU, GPU, XPU). 2. Deep Graph Library (DGL) support for
rapid graph neural network prototyping and development. By publishing and
sharing this toolkit with the research community via open-source release, we
hope to: 1. Lower the entry barrier for new machine learning researchers and
practitioners that want to get started with the OpenCatalyst dataset, which
presently comprises the largest computational materials science dataset. 2.
Enable the scientific community to apply advanced machine learning tools to
high-impact scientific challenges, such as modeling of materials behavior for
clean energy applications. We demonstrate the capabilities of our framework by
enabling three new equivariant neural network models for multiple OpenCatalyst
tasks and arrive at promising results for compute scaling and model
performance.
Related papers
- OS-ATLAS: A Foundation Action Model for Generalist GUI Agents [55.37173845836839]
OS-Atlas is a foundational GUI action model that excels at GUI grounding and OOD agentic tasks.
We are releasing the largest open-source cross-platform GUI grounding corpus to date, which contains over 13 million GUI elements.
arXiv Detail & Related papers (2024-10-30T17:10:19Z)
- Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models [3.865029260331255]
We present a Meta FAIR release of the Open Materials 2024 (OMat24) large-scale open dataset and an accompanying set of pre-trained models.
OMat24 contains over 110 million density functional theory (DFT) calculations focused on structural and compositional diversity.
Our EquiformerV2 models achieve state-of-the-art performance on the Matbench Discovery leaderboard.
arXiv Detail & Related papers (2024-10-16T17:48:34Z)
- OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models [61.14336781917986]
We introduce OpenR, an open-source framework for enhancing the reasoning capabilities of large language models (LLMs).
OpenR unifies data acquisition, reinforcement learning training, and non-autoregressive decoding into a cohesive software platform.
Our work is the first to provide an open-source framework that explores the core techniques of OpenAI's o1 model with reinforcement learning.
arXiv Detail & Related papers (2024-10-12T23:42:16Z)
- Deep Fast Machine Learning Utils: A Python Library for Streamlined Machine Learning Prototyping [0.0]
The Deep Fast Machine Learning Utils (DFMLU) library provides tools designed to automate and enhance aspects of machine learning processes.
DFMLU offers functionalities that support model development and data handling.
This manuscript presents an overview of DFMLU's functionalities, providing Python examples for each tool.
arXiv Detail & Related papers (2024-09-14T21:39:17Z)
- NNsight and NDIF: Democratizing Access to Foundation Model Internals [48.27939917017487]
NNsight is an open-source Python package with a simple, flexible API that can express interventions on any PyTorch model by building graphs.
NDIF is a collaborative research platform providing researchers access to foundation-scale LLMs via the NNsight API.
arXiv Detail & Related papers (2024-07-18T17:59:01Z)
- VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models [89.63342806812413]
We present an open-source toolkit for evaluating large multi-modality models based on PyTorch.
VLMEvalKit implements over 70 different large multi-modality models, including both proprietary APIs and open-source models.
We host OpenVLM Leaderboard to track the progress of multi-modality learning research.
arXiv Detail & Related papers (2024-07-16T13:06:15Z)
- M$^2$Hub: Unlocking the Potential of Machine Learning for Materials Discovery [26.099381363351668]
We introduce M$^2$Hub, a toolkit for advancing machine learning in materials discovery.
M$^2$Hub will enable easy access to materials discovery tasks, datasets, machine learning methods, evaluations, and benchmark results.
arXiv Detail & Related papers (2023-06-14T23:06:36Z)
- Advancing Reacting Flow Simulations with Data-Driven Models [50.9598607067535]
Key to effective use of machine learning tools in multi-physics problems is to couple them to physical and computer models.
The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems.
arXiv Detail & Related papers (2022-09-05T16:48:34Z)
- Flashlight: Enabling Innovation in Tools for Machine Learning [50.63188263773778]
We introduce Flashlight, an open-source library built to spur innovation in machine learning tools and systems.
We see Flashlight as a tool enabling research that can benefit widely used libraries downstream and bring machine learning and systems researchers closer together.
arXiv Detail & Related papers (2022-01-29T01:03:29Z)
- On the impact of selected modern deep-learning techniques to the performance and celerity of classification models in an experimental high-energy physics use case [0.0]
Deep learning techniques are tested in the context of a classification problem encountered in the domain of high-energy physics.
The advantages are evaluated in terms of both performance metrics and the time required to train and apply the resulting models.
A new wrapper library for PyTorch called LUMIN is presented, which incorporates all of the techniques studied.
arXiv Detail & Related papers (2020-02-03T12:29:59Z)