XxaCT-NN: Structure Agnostic Multimodal Learning for Materials Science
- URL: http://arxiv.org/abs/2507.01054v1
- Date: Fri, 27 Jun 2025 21:45:56 GMT
- Title: XxaCT-NN: Structure Agnostic Multimodal Learning for Materials Science
- Authors: Jithendaraa Subramanian, Linda Hung, Daniel Schweigert, Santosh Suram, Weike Ye,
- Abstract summary: We propose a scalable framework that learns directly from elemental composition and X-ray diffraction (XRD). Our architecture integrates modality-specific encoders with a cross-attention fusion module and is trained on the 5-million-sample Alexandria dataset. Our results establish a path toward structure-free, experimentally grounded foundation models for materials science.
- Score: 0.27185251060695437
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advances in materials discovery have been driven by structure-based models, particularly those using crystal graphs. While effective for computational datasets, these models are impractical for real-world applications where atomic structures are often unknown or difficult to obtain. We propose a scalable multimodal framework that learns directly from elemental composition and X-ray diffraction (XRD) -- two of the most readily available modalities in experimental workflows -- without requiring crystal structure input. Our architecture integrates modality-specific encoders with a cross-attention fusion module and is trained on the 5-million-sample Alexandria dataset. We present masked XRD modeling (MXM), and apply MXM and contrastive alignment as self-supervised pretraining strategies. Pretraining yields faster convergence (up to 4.2x speedup) and improves both accuracy and representation quality. We further demonstrate that multimodal performance scales more favorably with dataset size than unimodal baselines, with gains compounding at larger data regimes. Our results establish a path toward structure-free, experimentally grounded foundation models for materials science.
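The abstract describes modality-specific encoders fused by cross-attention. A minimal sketch of that fusion step is below, in NumPy, with purely hypothetical dimensions and function names (the paper does not publish this code): a single composition token attends over a sequence of XRD patch tokens via scaled dot-product attention.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, keys, values):
    """Single-head scaled dot-product cross-attention.

    query: (1, d) composition embedding; keys/values: (T, d) XRD patch tokens.
    Returns a (1, d) fused embedding: a weighted mix of XRD tokens, with
    weights set by their similarity to the composition token.
    """
    d = query.shape[-1]
    scores = query @ keys.T / np.sqrt(d)   # (1, T) similarity scores
    weights = softmax(scores)              # attention distribution over XRD tokens
    return weights @ values                # (1, d)

rng = np.random.default_rng(0)
comp_tok = rng.normal(size=(1, 8))    # hypothetical composition embedding, d=8
xrd_tok = rng.normal(size=(16, 8))    # 16 hypothetical XRD patch tokens
fused = cross_attention(comp_tok, xrd_tok, xrd_tok)
print(fused.shape)  # (1, 8)
```

In a full model this would be one multi-head attention layer among several, with learned projections for queries, keys, and values; the sketch only shows the core attention mechanics the fusion module relies on.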
Related papers
- evoxels: A differentiable physics framework for voxel-based microstructure simulations [41.94295877935867]
The differentiable physics framework evoxels is based on a fully Pythonic, unified voxel-based approach that integrates segmented 3D microscopy data, physical simulations, inverse modeling, and machine learning.
arXiv Detail & Related papers (2025-07-29T12:29:15Z) - DONUT: Physics-aware Machine Learning for Real-time X-ray Nanodiffraction Analysis [5.889405057118457]
We introduce DONUT, a physics-aware neural network designed for the rapid and automated analysis of nanobeam diffraction data. By incorporating a differentiable geometric diffraction model directly into its architecture, DONUT learns to predict crystal lattice strain and orientation in real time. We demonstrate experimentally that DONUT accurately extracts all features within the data over 200 times more efficiently than conventional fitting methods.
arXiv Detail & Related papers (2025-07-18T16:10:39Z) - Dynamic Acoustic Model Architecture Optimization in Training for ASR [51.21112094223223]
DMAO is an architecture optimization framework that employs a grow-and-drop strategy to automatically reallocate parameters during training. We evaluate DMAO through experiments with CTC on the LibriSpeech, TED-LIUM-v2, and Switchboard datasets.
arXiv Detail & Related papers (2025-06-16T07:47:34Z) - Spectra-to-Structure and Structure-to-Spectra Inference Across the Periodic Table [60.78615287040791]
XAStruct is a learning framework capable of both predicting XAS spectra from crystal structures and inferring local structural descriptors from XAS input. XAStruct is trained on a large-scale dataset spanning over 70 elements across the periodic table.
arXiv Detail & Related papers (2025-06-13T15:58:05Z) - PolyMicros: Bootstrapping a Foundation Model for Polycrystalline Material Structure [2.030250820529959]
We introduce a novel machine learning approach for learning from hyper-sparse, complex spatial data in scientific domains. Our core contribution is a physics-driven data augmentation scheme that leverages an ensemble of local generative models. We utilize this framework to construct PolyMicros, the first Foundation Model for polycrystalline materials.
arXiv Detail & Related papers (2025-05-22T16:12:20Z) - UniGenX: Unified Generation of Sequence and Structure with Autoregressive Diffusion [61.690978792873196]
Existing approaches rely on either autoregressive sequence models or diffusion models. We propose UniGenX, a unified framework that combines autoregressive next-token prediction with conditional diffusion models. We validate the effectiveness of UniGenX on material and small molecule generation tasks.
arXiv Detail & Related papers (2025-03-09T16:43:07Z) - A new framework for X-ray absorption spectroscopy data analysis based on machine learning: XASDAML [3.26781102547109]
XASDAML is a flexible, machine-learning-based framework that integrates the entire data-processing workflow. It supports comprehensive statistical analysis, leveraging methods such as principal component analysis and clustering. The versatility and effectiveness of XASDAML are exemplified by its application to a copper dataset.
arXiv Detail & Related papers (2025-02-23T17:50:04Z) - UniMat: Unifying Materials Embeddings through Multi-modal Learning [0.0]
We evaluate multi-modal learning techniques (alignment and fusion) for unifying some of the most important modalities in materials science.
We show that structure graph modality can be enhanced by aligning with XRD patterns.
We also show that aligning and fusing more experimentally accessible data formats, such as XRD patterns and compositions, can create more robust joint embeddings.
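Contrastive alignment of two modalities, as described for XRD patterns and compositions above (and used as a pretraining objective in the main paper), is commonly implemented as a symmetric InfoNCE loss. The sketch below is a generic illustration with hypothetical dimensions, not code from either paper: matching (composition_i, XRD_i) pairs in a batch are positives and every other pairing is a negative.

```python
import numpy as np

def info_nce(z_a, z_b, tau=0.1):
    """Symmetric InfoNCE loss between two batches of embeddings.

    z_a, z_b: (B, d) embeddings from two modality encoders, where row i of
    each comes from the same material. tau is the softmax temperature.
    """
    # L2-normalize so the logits are scaled cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / tau            # (B, B) similarity matrix
    idx = np.arange(len(z_a))

    def nll(l):
        # Cross-entropy with the diagonal (the matching pair) as the target.
        log_p = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_p[idx, idx].mean()

    return 0.5 * (nll(logits) + nll(logits.T))

rng = np.random.default_rng(0)
comp_emb = rng.normal(size=(8, 16))                   # composition embeddings
xrd_emb = comp_emb + 0.05 * rng.normal(size=(8, 16))  # near-aligned XRD embeddings
print(info_nce(comp_emb, xrd_emb) < info_nce(comp_emb, rng.normal(size=(8, 16))))
```

Minimizing this loss pulls each material's two embeddings together while pushing apart embeddings of different materials, which is what makes the resulting joint embedding space usable across modalities.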
arXiv Detail & Related papers (2024-11-13T14:55:08Z) - Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts [54.529880848937104]
We develop a unified MLLM with the MoE architecture, named Uni-MoE, that can handle a wide array of modalities.
Specifically, it features modality-specific encoders with connectors for a unified multimodal representation.
We evaluate the instruction-tuned Uni-MoE on a comprehensive set of multimodal datasets.
arXiv Detail & Related papers (2024-05-18T12:16:01Z) - Scalable Diffusion for Materials Generation [99.71001883652211]
We develop a unified crystal representation (UniMat) that can represent any crystal structure.
UniMat can generate high fidelity crystal structures from larger and more complex chemical systems.
We propose additional metrics for evaluating generative models of materials.
arXiv Detail & Related papers (2023-10-18T15:49:39Z) - LANISTR: Multimodal Learning from Structured and Unstructured Data [33.73687295669768]
LANISTR is an attention-based framework to learn from LANguage, Image, and STRuctured data.
In particular, we introduce a new similarity-based multimodal masking loss that enables it to learn cross-modal relations from large-scale multimodal data with missing modalities.
arXiv Detail & Related papers (2023-05-26T00:50:09Z) - Revealing the Invisible with Model and Data Shrinking for Composite-database Micro-expression Recognition [49.463864096615254]
We analyze the influence of learning complexity, including the input complexity and model complexity.
We propose a recurrent convolutional network (RCN) to explore the shallower-architecture and lower-resolution input data.
We develop three parameter-free modules to integrate with RCN without increasing any learnable parameters.
arXiv Detail & Related papers (2020-06-17T06:19:24Z) - Intelligent multiscale simulation based on process-guided composite database [0.0]
We present an integrated data-driven modeling framework based on process modeling, material homogenization, and machine learning.
We are interested in the injection-molded short fiber reinforced composites, which have been identified as key material systems in automotive, aerospace, and electronics industries.
arXiv Detail & Related papers (2020-03-20T20:39:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.