Transport Novelty Distance: A Distributional Metric for Evaluating Material Generative Models
- URL: http://arxiv.org/abs/2512.09514v1
- Date: Wed, 10 Dec 2025 10:38:58 GMT
- Title: Transport Novelty Distance: A Distributional Metric for Evaluating Material Generative Models
- Authors: Paul Hagemann, Simon Müller, Janine George, Philipp Benner
- Abstract summary: We introduce the Transport Novelty Distance (TNovD) to judge generative models used for materials discovery jointly by the quality and novelty of the generated materials. Based on ideas from Optimal Transport theory, TNovD uses a coupling between the features of the training and generated sets, which is refined into a quality and memorization regime by a threshold. We evaluate our proposed metric on typical toy experiments relevant for crystal structure prediction, including memorization, noise injection and lattice deformations.
- Score: 2.5779675962411654
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in generative machine learning have opened new possibilities for the discovery and design of novel materials. However, as these models become more sophisticated, the need for rigorous and meaningful evaluation metrics has grown. Existing evaluation approaches often fail to capture both the quality and novelty of generated structures, limiting our ability to assess true generative performance. In this paper, we introduce the Transport Novelty Distance (TNovD) to judge generative models used for materials discovery jointly by the quality and novelty of the generated materials. Based on ideas from Optimal Transport theory, TNovD uses a coupling between the features of the training and generated sets, which is refined into a quality and memorization regime by a threshold. The features are generated from crystal structures using a graph neural network that is trained to distinguish between materials, their augmented counterparts, and differently sized supercells using contrastive learning. We evaluate our proposed metric on typical toy experiments relevant for crystal structure prediction, including memorization, noise injection and lattice deformations. Additionally, we validate the TNovD on the MP20 validation set and the WBM substitution dataset, demonstrating that it is capable of detecting both memorized and low-quality material data. We also benchmark the performance of several popular material generative models. While introduced for materials, our TNovD framework is domain-agnostic and can be adapted for other areas, such as images and molecules.
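The abstract describes a coupling between training and generated features that a threshold splits into two regimes. The following is only a minimal sketch of that idea, not the paper's TNovD formula, which this page does not give. It assumes equally sized sets with uniform weights (so the optimal transport coupling reduces to a one-to-one assignment), a Euclidean cost, and plain feature arrays in place of the learned graph-neural-network features. The names `coupling_regimes` and `tau` and the returned keys are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def coupling_regimes(train_feats, gen_feats, tau):
    """Couple generated to training features, then split matches at tau.

    Illustrative only: the statistics returned here are assumptions,
    not the quantities defined in the TNovD paper.
    """
    # Pairwise Euclidean cost between generated (rows) and training (cols).
    cost = cdist(gen_feats, train_feats)
    # For equal-size sets with uniform weights, the optimal transport
    # coupling is a permutation, found here by linear assignment.
    rows, cols = linear_sum_assignment(cost)
    matched = cost[rows, cols]
    # Matches closer than tau sit suspiciously near a training point
    # (memorization-like); the rest are judged by how far they moved.
    near = matched < tau
    far_mean = float(matched[~near].mean()) if (~near).any() else 0.0
    return {
        "near_fraction": float(near.mean()),
        "far_mean_distance": far_mean,
        "matched": matched,
    }
```

With three training points and three generated points, two of which nearly copy a training point, `coupling_regimes(train, gen, tau=1.0)` reports a `near_fraction` of 2/3. How the two regimes are combined into a single score is left open here, because the source does not specify it.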
Related papers
- LeMat-GenBench: A Unified Evaluation Framework for Crystal Generative Models [39.63407613127808]
We introduce LeMat-GenBench, a unified benchmark for generative models of crystalline materials. We release an open-source evaluation suite and a public leaderboard on Hugging Face, and benchmark 12 recent generative models.
arXiv Detail & Related papers (2025-12-04T08:25:16Z)
- Design Topological Materials by Reinforcement Fine-Tuned Generative Model [4.529476797684622]
Topological insulators (TIs) and topological crystalline insulators (TCIs) are materials with unconventional electronic properties. We focus on the generation of new topological materials through a generative model. We apply reinforcement fine-tuning to a pre-trained generative model, thereby aligning the model's objectives with our material design goals.
arXiv Detail & Related papers (2025-04-17T16:05:24Z)
- MaskTerial: A Foundation Model for Automated 2D Material Flake Detection [48.73213960205105]
We present a deep learning model, called MaskTerial, that uses an instance segmentation network to reliably identify 2D material flakes. The model is extensively pre-trained using a synthetic data generator that generates realistic microscopy images from unlabeled data. We demonstrate significant improvements over existing techniques in the detection of low-contrast materials such as hexagonal boron nitride.
arXiv Detail & Related papers (2024-12-12T15:01:39Z)
- A Generative Machine Learning Model for Material Microstructure 3D Reconstruction and Performance Evaluation [4.169915659794567]
The dimensional extension from 2D to 3D is viewed as a highly challenging inverse problem from the current technological perspective.
A novel generative model that integrates the multiscale properties of U-net with the generative capabilities of a GAN has been proposed.
The model's accuracy is further improved by combining the image regularization loss with the Wasserstein distance loss.
arXiv Detail & Related papers (2024-02-24T13:42:34Z)
- Exploring Precision and Recall to assess the quality and diversity of LLMs [82.21278402856079]
We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral.
This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora.
arXiv Detail & Related papers (2024-02-16T13:53:26Z)
- Scalable Diffusion for Materials Generation [99.71001883652211]
We develop a unified crystal representation that can represent any crystal structure (UniMat).
UniMat can generate high fidelity crystal structures from larger and more complex chemical systems.
We propose additional metrics for evaluating generative models of materials.
arXiv Detail & Related papers (2023-10-18T15:49:39Z)
- StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
- Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation [64.0476282000118]
Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality.
It is currently difficult to compare the performance of these VCE methods as the evaluation procedures largely vary and often boil down to visual inspection of individual examples and small scale user studies.
We propose a framework for systematic, quantitative evaluation of the VCE methods and a minimal set of metrics to be used.
arXiv Detail & Related papers (2023-08-11T12:22:37Z)
- Graph Contrastive Learning for Materials [6.667711415870472]
We introduce CrystalCLR, a framework for contrastive learning of representations with crystal graph neural networks.
With the addition of a novel loss function, our framework is able to learn representations competitive with engineered fingerprinting methods.
We also demonstrate that via model finetuning, contrastive pretraining can improve the performance of graph neural networks for prediction of material properties.
arXiv Detail & Related papers (2022-11-24T04:15:47Z)
- A Binded VAE for Inorganic Material Generation [0.0]
We develop an original Binded-VAE model dedicated to the generation of discrete datasets with high sparsity.
We show on a real issue of rubber compound design that the proposed approach outperforms the standard generative models.
arXiv Detail & Related papers (2021-12-17T15:24:28Z)
- Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe that the span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.