Related papers: The Open MatSci ML Toolkit: A Flexible Framework for Machine Learning in Materials Science

URL: http://arxiv.org/abs/2210.17484v1
Date: Mon, 31 Oct 2022 17:11:36 GMT
Title: The Open MatSci ML Toolkit: A Flexible Framework for Machine Learning in Materials Science
Authors: Santiago Miret, Kin Long Kelvin Lee, Carmelo Gonzales, Marcel Nassar, Matthew Spellings
Abstract summary: The Open MatSci ML Toolkit is a flexible, self-contained, and scalable Python-based framework for applying deep learning models and methods to scientific data. By publishing and sharing this toolkit with the research community via open-source release, the authors hope to lower the entry barrier for new machine learning researchers and practitioners who want to get started with the OpenCatalyst dataset.
Score: 3.577720074630756
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present the Open MatSci ML Toolkit: a flexible, self-contained, and scalable Python-based framework to apply deep learning models and methods on scientific data with a specific focus on materials science and the OpenCatalyst Dataset. Our toolkit provides: 1. A scalable machine learning workflow for materials science leveraging PyTorch Lightning, which enables seamless scaling across different computation capabilities (laptop, server, cluster) and hardware platforms (CPU, GPU, XPU). 2. Deep Graph Library (DGL) support for rapid graph neural network prototyping and development. By publishing and sharing this toolkit with the research community via open-source release, we hope to: 1. Lower the entry barrier for new machine learning researchers and practitioners that want to get started with the OpenCatalyst dataset, which presently comprises the largest computational materials science dataset. 2. Enable the scientific community to apply advanced machine learning tools to high-impact scientific challenges, such as modeling of materials behavior for clean energy applications. We demonstrate the capabilities of our framework by enabling three new equivariant neural network models for multiple OpenCatalyst tasks and arrive at promising results for compute scaling and model performance.

Related papers

LEMUR Neural Network Dataset: Towards Seamless AutoML [34.04248949660201]
We introduce LEMUR, an open source dataset of neural network models with well-structured code for diverse architectures. LEMUR is primarily designed to enable fine-tuning of large language models for automated machine learning tasks. LEMUR will be released as an open source project under the MIT license upon acceptance of the paper.
arXiv Detail & Related papers (2025-04-14T09:08:00Z)
asanAI: In-Browser, No-Code, Offline-First Machine Learning Toolkit [0.0]
asanAI is an offline-first, open-source, no-code machine learning toolkit designed for users of all skill levels. It allows individuals to design, debug, train, and test ML models directly in a web browser. The toolkit runs on any device with a modern web browser, including smartphones, and ensures user privacy through local computations.
arXiv Detail & Related papers (2025-01-07T12:47:52Z)
Darkit: A User-Friendly Software Toolkit for Spiking Large Language Model [50.37090759139591]
Large language models (LLMs) have been widely applied in various practical applications, typically comprising billions of parameters. The human brain, employing bio-plausible spiking mechanisms, can accomplish the same tasks while significantly reducing energy consumption. We are releasing a software toolkit named DarwinKit (Darkit) to accelerate the adoption of brain-inspired large language models.
arXiv Detail & Related papers (2024-12-20T07:50:08Z)
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents [55.37173845836839]
OS-Atlas is a foundational GUI action model that excels at GUI grounding and OOD agentic tasks. We are releasing the largest open-source cross-platform GUI grounding corpus to date, which contains over 13 million GUI elements.
arXiv Detail & Related papers (2024-10-30T17:10:19Z)
Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models [3.865029260331255]
We present a Meta FAIR release of the Open Materials 2024 (OMat24) large-scale open dataset and an accompanying set of pre-trained models. OMat24 contains over 110 million density functional theory (DFT) calculations focused on structural and compositional diversity. Our EquiformerV2 models achieve state-of-the-art performance on the Matbench Discovery leaderboard.
arXiv Detail & Related papers (2024-10-16T17:48:34Z)
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models [61.14336781917986]
We introduce OpenR, an open-source framework for enhancing the reasoning capabilities of large language models (LLMs). OpenR unifies data acquisition, reinforcement learning training, and non-autoregressive decoding into a cohesive software platform. Our work is the first to provide an open-source framework that explores the core techniques of OpenAI's o1 model with reinforcement learning.
arXiv Detail & Related papers (2024-10-12T23:42:16Z)
Deep Fast Machine Learning Utils: A Python Library for Streamlined Machine Learning Prototyping [0.0]
The Deep Fast Machine Learning Utils (DFMLU) library provides tools designed to automate and enhance aspects of machine learning processes. DFMLU offers functionalities that support model development and data handling. This manuscript presents an overview of DFMLU's functionalities, providing Python examples for each tool.
arXiv Detail & Related papers (2024-09-14T21:39:17Z)
NNsight and NDIF: Democratizing Access to Foundation Model Internals [48.27939917017487]
NNsight is an open-source Python package with a simple, flexible API that can express interventions on any PyTorch model by building graphs. NDIF is a collaborative research platform providing researchers access to foundation-scale LLMs via the NNsight API.
arXiv Detail & Related papers (2024-07-18T17:59:01Z)
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models [89.63342806812413]
We present an open-source toolkit for evaluating large multi-modality models based on PyTorch. VLMEvalKit implements over 70 different large multi-modality models, including both proprietary APIs and open-source models. We host OpenVLM Leaderboard to track the progress of multi-modality learning research.
arXiv Detail & Related papers (2024-07-16T13:06:15Z)
M$^2$Hub: Unlocking the Potential of Machine Learning for Materials Discovery [26.099381363351668]
We introduce M$^2$Hub, a toolkit for advancing machine learning in materials discovery. M$^2$Hub will enable easy access to materials discovery tasks, datasets, machine learning methods, evaluations, and benchmark results.
arXiv Detail & Related papers (2023-06-14T23:06:36Z)
Advancing Reacting Flow Simulations with Data-Driven Models [50.9598607067535]
Key to effective use of machine learning tools in multi-physics problems is to couple them to physical and computer models. The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems.
arXiv Detail & Related papers (2022-09-05T16:48:34Z)
Flashlight: Enabling Innovation in Tools for Machine Learning [50.63188263773778]
We introduce Flashlight, an open-source library built to spur innovation in machine learning tools and systems. We see Flashlight as a tool enabling research that can benefit widely used libraries downstream and bring machine learning and systems researchers closer together.
arXiv Detail & Related papers (2022-01-29T01:03:29Z)
On the impact of selected modern deep-learning techniques to the performance and celerity of classification models in an experimental high-energy physics use case [0.0]
Deep learning techniques are tested in the context of a classification problem encountered in the domain of high-energy physics. The advantages are evaluated in terms of both performance metrics and the time required to train and apply the resulting models. A new wrapper library for PyTorch called LUMIN is presented, which incorporates all of the techniques studied.
arXiv Detail & Related papers (2020-02-03T12:29:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.