Deep Kernel Methods Learn Better: From Cards to Process Optimization
- URL: http://arxiv.org/abs/2303.14554v2
- Date: Tue, 19 Sep 2023 13:53:34 GMT
- Title: Deep Kernel Methods Learn Better: From Cards to Process Optimization
- Authors: Mani Valleti, Rama K. Vasudevan, Maxim A. Ziatdinov, Sergei V. Kalinin
- Abstract summary: We show that DKL with active learning can produce a more compact and smooth latent space.
We demonstrate this behavior using a simple cards data set and extend it to the optimization of domain-generated trajectories in physical systems.
- Score: 0.7587345054583298
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability of deep learning methods to perform classification and regression
tasks relies heavily on their capacity to uncover manifolds in high-dimensional
data spaces and project them into low-dimensional representation spaces. In
this study, we investigate the structure and character of the manifolds
generated by classical variational autoencoder (VAE) approaches and deep kernel
learning (DKL). In the former case, the structure of the latent space is
determined by the properties of the input data alone, while in the latter, the
latent manifold forms as a result of an active learning process that balances
the data distribution and target functionalities. We show that DKL with active
learning can produce a more compact and smooth latent space which is more
conducive to optimization compared to previously reported methods, such as the
VAE. We demonstrate this behavior using a simple cards data set and extend it
to the optimization of domain-generated trajectories in physical systems. Our
findings suggest that latent manifolds constructed through active learning have
a more beneficial structure for optimization problems, especially in
feature-rich target-poor scenarios that are common in domain sciences, such as
materials synthesis, energy storage, and molecular discovery. The jupyter
notebooks that encapsulate the complete analysis accompany the article.
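To make the approach concrete, below is a minimal sketch of deep kernel learning with an uncertainty-driven active learning step, written against GPyTorch. It is an illustrative sketch, not the authors' implementation (the accompanying Jupyter notebooks contain the actual analysis); the names `FeatureExtractor`, `DKLModel`, and `active_learning_step`, and the use of maximum predictive variance as the acquisition rule, are assumptions made here for brevity.

```python
import torch
import gpytorch


class FeatureExtractor(torch.nn.Sequential):
    """Small network projecting high-dimensional inputs into a low-dimensional latent space."""
    def __init__(self, in_dim, latent_dim=2):
        super().__init__(
            torch.nn.Linear(in_dim, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, latent_dim),
        )


class DKLModel(gpytorch.models.ExactGP):
    """Deep kernel learning: a GP whose kernel operates on learned features."""
    def __init__(self, train_x, train_y, likelihood, extractor):
        super().__init__(train_x, train_y, likelihood)
        self.extractor = extractor
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        z = self.extractor(x)  # latent embedding, shaped jointly with the GP objective
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(z), self.covar_module(z)
        )


def active_learning_step(train_x, train_y, pool_x, n_epochs=200):
    """Retrain the DKL model on the measured data and return the index of the
    pool point with the largest predictive variance (a purely exploratory rule)."""
    likelihood = gpytorch.likelihoods.GaussianLikelihood()
    model = DKLModel(train_x, train_y, likelihood, FeatureExtractor(train_x.shape[-1]))
    mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

    model.train(); likelihood.train()
    for _ in range(n_epochs):
        optimizer.zero_grad()
        loss = -mll(model(train_x), train_y)
        loss.backward()
        optimizer.step()

    model.eval(); likelihood.eval()
    with torch.no_grad(), gpytorch.settings.fast_pred_var():
        pred = likelihood(model(pool_x))
    return int(pred.variance.argmax())
```

In the setting described in the abstract, the acquisition function would also weigh the target functionality (for example, via expected improvement), which is what lets the active learning process balance the data distribution against the property being optimized.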
Related papers
- Pullback Flow Matching on Data Manifolds [10.187244125099479]
Pullback Flow Matching (PFM) is a framework for generative modeling on data manifolds.
We demonstrate PFM's effectiveness through applications to synthetic data, data dynamics, and protein sequence data, generating novel proteins with specific properties.
This method shows strong potential for drug discovery and materials science, where generating novel samples with specific properties is of great interest.
arXiv Detail & Related papers (2024-10-06T16:41:26Z)
- Understanding active learning of molecular docking and its applications [0.6554326244334868]
We investigate how active learning methodologies effectively predict docking scores using only 2D structures.
Our findings suggest that surrogate models tend to memorize structural patterns prevalent in compounds with high docking scores.
Our comprehensive analysis underscores the reliability and potential applicability of active learning methodologies in virtual screening campaigns.
arXiv Detail & Related papers (2024-06-14T05:43:42Z)
- Scalable manifold learning by uniform landmark sampling and constrained locally linear embedding [0.6144680854063939]
We propose a scalable manifold learning (scML) method that can manipulate large-scale and high-dimensional data in an efficient manner.
We empirically validated the effectiveness of scML on synthetic datasets and real-world benchmarks of different types.
scML scales well with increasing data sizes and embedding dimensions, and exhibits promising performance in preserving the global structure.
arXiv Detail & Related papers (2024-01-02T08:43:06Z)
- A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction [66.21060114843202]
We propose a more general heat kernel based manifold embedding method that we call heat geodesic embeddings.
Results show that our method outperforms existing state of the art in preserving ground truth manifold distances.
We also showcase our method on single cell RNA-sequencing datasets with both continuum and cluster structure.
arXiv Detail & Related papers (2023-05-30T13:58:50Z)
- Optimization of a Hydrodynamic Computational Reservoir through Evolution [58.720142291102135]
We interface with a model of a hydrodynamic system, under development by a startup, as a computational reservoir.
We optimized the readout times and how inputs are mapped to the wave amplitude or frequency using an evolutionary search algorithm.
Applying evolutionary methods to this reservoir system substantially improved separability on an XNOR task, in comparison to implementations with hand-selected parameters.
arXiv Detail & Related papers (2023-04-20T19:15:02Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Joint Embedding Self-Supervised Learning in the Kernel Regime [21.80241600638596]
Self-supervised learning (SSL) produces useful representations of data without access to any labels for classifying the data.
We extend this framework to incorporate algorithms based on kernel methods where embeddings are constructed by linear maps acting on the feature space of a kernel.
We analyze our kernel model on small datasets to identify common features of self-supervised learning algorithms and gain theoretical insights into their performance on downstream tasks.
arXiv Detail & Related papers (2022-09-29T15:53:19Z)
- Measuring dissimilarity with diffeomorphism invariance [94.02751799024684]
We introduce DID, a pairwise dissimilarity measure applicable to a wide range of data spaces.
We prove that DID enjoys properties which make it relevant for theoretical study and practical use.
arXiv Detail & Related papers (2022-02-11T13:51:30Z)
- High-Dimensional Bayesian Optimisation with Variational Autoencoders and Deep Metric Learning [119.91679702854499]
We introduce a method based on deep metric learning to perform Bayesian optimisation over high-dimensional, structured input spaces.
We achieve such an inductive bias using just 1% of the available labelled data.
As an empirical contribution, we present state-of-the-art results on real-world high-dimensional black-box optimisation problems.
arXiv Detail & Related papers (2021-06-07T13:35:47Z)
- Characterizing the Latent Space of Molecular Deep Generative Models with Persistent Homology Metrics [21.95240820041655]
Variational Autoencoders (VAEs) are generative models in which encoder-decoder network pairs are trained to reconstruct training data distributions.
We propose a method for measuring how well the latent space of deep generative models is able to encode structural and chemical features.
arXiv Detail & Related papers (2020-10-18T13:33:02Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
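As an illustration of the entry immediately above, the following sketch aggregates a variable-size set of features by computing an entropic optimal transport plan against a trainable reference. It is a rough PyTorch toy under stated assumptions, not the paper's implementation; `sinkhorn_plan`, `OTPooling`, and the hyperparameters (regularization `eps`, reference size `n_ref`) are hypothetical choices.

```python
import torch


def sinkhorn_plan(cost, eps=0.1, n_iters=50):
    """Entropic-regularized optimal transport plan between uniform marginals (Sinkhorn iterations)."""
    n, p = cost.shape
    cost = cost / cost.max().clamp_min(1e-8)   # rescale costs for numerical stability
    K = torch.exp(-cost / eps)                 # Gibbs kernel
    a = torch.full((n,), 1.0 / n)              # uniform weights on the set elements
    b = torch.full((p,), 1.0 / p)              # uniform weights on the reference atoms
    v = torch.ones(p)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.t() @ u)
    return u[:, None] * K * v[None, :]         # transport plan of shape (n, p)


class OTPooling(torch.nn.Module):
    """Aggregate a variable-size set of feature vectors into a fixed-size embedding
    by transporting the set onto a trainable reference."""

    def __init__(self, dim, n_ref=8):
        super().__init__()
        self.reference = torch.nn.Parameter(torch.randn(n_ref, dim))

    def forward(self, x):                                 # x: (n, dim); n may differ between sets
        cost = torch.cdist(x, self.reference) ** 2        # squared Euclidean cost matrix
        plan = sinkhorn_plan(cost)                        # differentiable, so the reference trains
        pooled = plan.t() @ x                             # (n_ref, dim): OT-weighted aggregation
        return pooled.flatten()                           # fixed-size vector of length n_ref * dim
```

Because the Sinkhorn iterations are ordinary differentiable tensor operations, the reference receives gradients and can be trained end-to-end with whatever model consumes the pooled embedding; for example, `OTPooling(dim=16)(torch.randn(37, 16))` yields a fixed-length vector of size 128 regardless of the set size.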
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.