Reward driven discovery of the optimal microstructure representations with invariant variational autoencoders
- URL: http://arxiv.org/abs/2510.00243v1
- Date: Tue, 30 Sep 2025 20:15:42 GMT
- Title: Reward driven discovery of the optimal microstructure representations with invariant variational autoencoders
- Authors: Boris N. Slautin, Kamyar Barakati, Hiroshi Funakubo, Maxim A. Ziatdinov, Vladimir V. Shvartsman, Doru C. Lupascu, Sergei V. Kalinin
- Abstract summary: Variational Autoencoders (VAEs) provide a powerful means of constructing such low-dimensional representations. VAEs are often optimized through trial-and-error and empirical analysis. We investigated reward-based strategies for evaluating latent space representations.
- Score: 0.015295722752489374
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Microscopy techniques generate vast amounts of complex image data that in principle can be used to discover simpler, interpretable, and parsimonious forms to reveal the underlying physical structures, such as elementary building blocks in molecular systems or order parameters and phases in crystalline materials. Variational Autoencoders (VAEs) provide a powerful means of constructing such low-dimensional representations, but their performance heavily depends on multiple non-myopic design choices, which are often optimized through trial-and-error and empirical analysis. To enable automated and unbiased optimization of VAE workflows, we investigated reward-based strategies for evaluating latent space representations. Using Piezoresponse Force Microscopy data as a model system, we examined multiple policies and reward functions that can serve as a foundation for automated optimization. Our analysis shows that approximating the latent space with Gaussian Mixture Models (GMM) and Bayesian Gaussian Mixture Models (BGMM) provides a strong basis for constructing reward functions capable of estimating model efficiency and guiding the search for optimal parsimonious representations.
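The abstract's idea of scoring a latent space by approximating it with a GMM or BGMM can be illustrated with a minimal sketch. This is not the authors' reward function: the BIC-based score, the `weight_floor` threshold, the helper names, and the synthetic 2D "latent" points below are all assumptions made for illustration, using scikit-learn's `GaussianMixture` and `BayesianGaussianMixture`.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture, GaussianMixture


def gmm_reward(latents, max_components=6, seed=0):
    """Hypothetical reward: the negative of the lowest BIC over GMM fits.

    Returns (reward, number of components of the best-scoring fit).
    A compact, well-clustered latent space should score higher.
    """
    best_bic, best_k = np.inf, 0
    for k in range(1, max_components + 1):
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(latents)
        bic = gmm.bic(latents)
        if bic < best_bic:
            best_bic, best_k = bic, k
    return -best_bic, best_k


def bgmm_effective_components(latents, max_components=6, weight_floor=0.05, seed=0):
    """Count BGMM components whose mixture weight exceeds weight_floor.

    weight_floor is an assumed cutoff, not a value from the paper.
    """
    bgmm = BayesianGaussianMixture(
        n_components=max_components, random_state=seed, max_iter=500
    ).fit(latents)
    return int(np.sum(bgmm.weights_ > weight_floor))


# Synthetic stand-ins for encoded latent vectors (assumption: 2D latents).
rng = np.random.default_rng(0)
centers = np.array([[-4.0, 0.0], [4.0, 0.0], [0.0, 6.0]])
clustered = np.vstack([c + 0.3 * rng.standard_normal((200, 2)) for c in centers])
diffuse = rng.uniform(-6.0, 6.0, size=(600, 2))

reward_clustered, k_clustered = gmm_reward(clustered)
reward_diffuse, k_diffuse = gmm_reward(diffuse)
n_effective = bgmm_effective_components(clustered)

print("clustered latent space:", reward_clustered, k_clustered, n_effective)
print("diffuse latent space:  ", reward_diffuse, k_diffuse)
```

In a workflow of the kind the abstract describes, such a score would be computed for each candidate VAE configuration and used to rank them; which score best reflects a parsimonious representation is the question the paper itself investigates.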
Related papers
- Generative Multi-Objective Bayesian Optimization with Scalable Batch Evaluations for Sample-Efficient De Novo Molecular Design [1.8517039579627974]
This work introduces an alternative, modular "generate-then-optimize" framework for de novo molecular design/discovery. We benchmark the framework against state-of-the-art latent-space and discrete molecular optimization methods. Specifically, in a case study related to sustainable energy storage, we show that our approach quickly uncovers novel, diverse, and high-performing organic (quinone-based) cathode materials.
arXiv Detail & Related papers (2025-12-19T14:59:27Z) - Statistically controllable microstructure reconstruction framework for heterogeneous materials using sliced-Wasserstein metric and neural networks [6.011061228715799]
Heterogeneous porous materials play a crucial role in various engineering systems. We propose a statistically controllable microstructure reconstruction framework that integrates neural networks with the sliced-Wasserstein metric. Our method can perform controllable reconstruction tasks even with small sample sizes.
arXiv Detail & Related papers (2025-11-18T09:02:09Z) - SAM$^{*}$: Task-Adaptive SAM with Physics-Guided Rewards [0.5805874695844994]
Image segmentation is a critical task in microscopy, essential for accurately analyzing and interpreting complex visual data. Here, we introduce a reward function-based optimization to fine-tune foundational models. We demonstrate the effectiveness of this approach in microscopy imaging, where precise segmentation is crucial for analyzing cellular structures, material interfaces, and nanoscale features.
arXiv Detail & Related papers (2025-09-08T13:51:20Z) - High-Fidelity Scientific Simulation Surrogates via Adaptive Implicit Neural Representations [51.90920900332569]
Implicit neural representations (INRs) offer a compact and continuous framework for modeling spatially structured data. Recent approaches address this by introducing additional features along rigid geometric structures. We propose a simple yet effective alternative: Feature-Adaptive INR (FA-INR).
arXiv Detail & Related papers (2025-06-07T16:45:17Z) - Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z) - Physics-based reward driven image analysis in microscopy [5.581609660066545]
We present a methodology based on the concept of a Reward Function to optimize image analysis dynamically.
The Reward Function is engineered to closely align with the experimental objectives and broader context.
We extend the reward function approach towards the identification of partially-disordered regions, creating a physics-driven reward function and action space of high-dimensional clustering.
arXiv Detail & Related papers (2024-04-22T12:55:04Z) - Diffusion Model for Data-Driven Black-Box Optimization [54.25693582870226]
We focus on diffusion models, a powerful generative AI technology, and investigate their potential for black-box optimization.
We study two practical types of labels: 1) noisy measurements of a real-valued reward function and 2) human preference based on pairwise comparisons.
Our proposed method reformulates the design optimization problem into a conditional sampling problem, which allows us to leverage the power of diffusion models.
arXiv Detail & Related papers (2024-03-20T00:41:12Z) - Ensemble Kalman Filtering Meets Gaussian Process SSM for Non-Mean-Field and Online Inference [47.460898983429374]
We introduce an ensemble Kalman filter (EnKF) into the non-mean-field (NMF) variational inference framework to approximate the posterior distribution of the latent states.
This novel marriage between EnKF and GPSSM not only eliminates the need for extensive parameterization in learning variational distributions, but also enables an interpretable, closed-form approximation of the evidence lower bound (ELBO).
We demonstrate that the resulting EnKF-aided online algorithm embodies a principled objective function by ensuring data-fitting accuracy while incorporating model regularizations to mitigate overfitting.
arXiv Detail & Related papers (2022-10-19T19:04:45Z) - A Pareto-optimal compositional energy-based model for sampling and optimization of protein sequences [55.25331349436895]
Deep generative models have emerged as a popular machine learning-based approach for inverse problems in the life sciences.
These problems often require sampling new designs that satisfy multiple properties of interest in addition to learning the data distribution.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2021-06-26T02:17:23Z) - Scalable Gaussian Processes for Data-Driven Design using Big Data with Categorical Factors [14.337297795182181]
Gaussian processes (GP) have difficulties in accommodating big datasets, categorical inputs, and multiple responses.
We propose a GP model that utilizes latent variables and functions obtained through variational inference to address the aforementioned challenges simultaneously.
Our approach is demonstrated for machine learning of ternary oxide materials and topology optimization of a multiscale compliant mechanism.
arXiv Detail & Related papers (2020-10-05T20:03:13Z) - Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)