Algebraic Machine Learning with an Application to Chemistry
- URL: http://arxiv.org/abs/2205.05795v4
- Date: Thu, 22 Feb 2024 14:55:01 GMT
- Title: Algebraic Machine Learning with an Application to Chemistry
- Authors: Ezzeddine El Sai, Parker Gara, Markus J. Pflaum
- Abstract summary: We develop a machine learning pipeline that captures fine-grained geometric information without relying on smoothness assumptions.
In particular, we propose a heuristic for numerically detecting points lying near the singular locus of the underlying variety.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As datasets used in scientific applications become more complex, studying the
geometry and topology of data has become an increasingly prevalent part of the
data analysis process. This can be seen, for example, in the growing interest
in topological tools such as persistent homology. However, on the one hand,
topological tools are inherently limited to providing only coarse information
about the underlying space of the data. On the other hand, more geometric
approaches rely predominantly on the manifold hypothesis, which asserts that
the underlying space is a smooth manifold. This assumption fails for many
physical models where the underlying space contains singularities.
In this paper we develop a machine learning pipeline that captures fine-grained
geometric information without having to rely on any smoothness assumptions. Our
approach involves working within the scope of algebraic geometry and algebraic
varieties instead of differential geometry and smooth manifolds. In the setting
of the variety hypothesis, the learning problem is to find the underlying
variety from sample data. We cast this learning problem as a Maximum A
Posteriori optimization problem which we solve in terms of an eigenvalue
computation. Having found the underlying variety, we explore the use of
Gröbner bases and numerical methods to reveal information about its geometry.
In particular, we propose a heuristic for numerically detecting points lying
near the singular locus of the underlying variety.
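The eigenvalue formulation and the singular-point heuristic mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Maximum A Posteriori objective is replaced here by plain least squares (the unit-norm coefficient vector minimizing the mean squared polynomial value on the samples is the eigenvector of the monomial moment matrix with the smallest eigenvalue), and the "small gradient norm" criterion is an assumed stand-in for the paper's heuristic. All function names and the nodal cubic test curve y^2 = x^2 (x + 1) are hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def monomial_exponents(n_vars, degree):
    """Exponent tuples of all monomials in n_vars variables of total degree <= degree."""
    return [e for e in itertools.product(range(degree + 1), repeat=n_vars)
            if sum(e) <= degree]


def monomial_features(points, exponents):
    """Evaluate every monomial at every point; shape (n_points, n_monomials)."""
    points = np.asarray(points, dtype=float)
    return np.stack([np.prod(points ** np.array(e), axis=1) for e in exponents],
                    axis=1)


def fit_vanishing_polynomial(points, degree):
    """Least-squares fit (assumption: not the paper's MAP objective).

    Returns the exponents, the unit-norm coefficient vector, and the smallest
    eigenvalue of the moment matrix (the mean squared residual on the samples).
    """
    points = np.asarray(points, dtype=float)
    exponents = monomial_exponents(points.shape[1], degree)
    V = monomial_features(points, exponents)
    moment = V.T @ V / len(points)
    eigvals, eigvecs = np.linalg.eigh(moment)  # eigenvalues in ascending order
    return exponents, eigvecs[:, 0], eigvals[0]


def gradient_norms(points, exponents, coeffs):
    """Norm of the gradient of the fitted polynomial at each sample point."""
    points = np.asarray(points, dtype=float)
    grad = np.zeros_like(points)
    for e, c in zip(exponents, coeffs):
        for j in range(points.shape[1]):
            if e[j] == 0:
                continue  # monomial does not depend on variable j
            d = list(e)
            d[j] -= 1
            grad[:, j] += c * e[j] * np.prod(points ** np.array(d), axis=1)
    return np.linalg.norm(grad, axis=1)


# Noise-free samples from the nodal cubic y^2 = x^2 (x + 1), singular at the origin.
t = np.linspace(-1.5, 1.5, 401)
pts = np.column_stack([t**2 - 1, t * (t**2 - 1)])

exponents, coeffs, residual = fit_vanishing_polynomial(pts, degree=3)
gnorm = gradient_norms(pts, exponents, coeffs)
candidate = pts[np.argmin(gnorm)]

print("smallest eigenvalue (mean squared residual):", residual)
print("sample with smallest gradient norm:", candidate)
```

On noisy data the smallest eigenvalue is no longer close to zero, and both the polynomial degree and any threshold on the gradient norm would need to be chosen with care; the sketch leaves both choices open.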
Related papers
- Persistent de Rham-Hodge Laplacians in Eulerian representation for manifold topological learning [7.0103981121698355]
We introduce persistent de Rham-Hodge Laplacian, or persistent Hodge Laplacian, for manifold topological learning.
Our PHLs are constructed in the Eulerian representation via structure-preserving Cartesian grids.
As a proof-of-principle application, we consider the prediction of protein-ligand binding affinities with two benchmark datasets.
arXiv Detail & Related papers (2024-08-01T01:15:52Z)
- Disentangled Representation Learning with the Gromov-Monge Gap [65.73194652234848]
Learning disentangled representations from unlabelled data is a fundamental challenge in machine learning.
We introduce a novel approach to disentangled representation learning based on quadratic optimal transport.
We demonstrate the effectiveness of our approach for quantifying disentanglement across four standard benchmarks.
arXiv Detail & Related papers (2024-07-10T16:51:32Z)
- Improving embedding of graphs with missing data by soft manifolds [51.425411400683565]
The reliability of graph embeddings depends on how much the geometry of the continuous space matches the graph structure.
We introduce a new class of manifolds, named soft manifolds, that can address this situation.
Using soft manifolds for graph embedding, we can provide continuous spaces to pursue any task in data analysis over complex datasets.
arXiv Detail & Related papers (2023-11-29T12:48:33Z)
- Unraveling the Single Tangent Space Fallacy: An Analysis and Clarification for Applying Riemannian Geometry in Robot Learning [6.253089330116833]
Handling geometric constraints effectively requires the incorporation of tools from differential geometry into the formulation of machine learning methods.
Recent adoption in robot learning has been largely characterized by a mathematically flawed simplification.
This paper provides a theoretical elucidation of various misconceptions surrounding this approach and offers experimental evidence of its shortcomings.
arXiv Detail & Related papers (2023-10-11T21:16:01Z)
- Neural Latent Geometry Search: Product Manifold Inference via Gromov-Hausdorff-Informed Bayesian Optimization [21.97865037637575]
We mathematically define this novel formulation and call it neural latent geometry search (NLGS).
We propose a novel notion of distance between candidate latent geometries based on the Gromov-Hausdorff distance from metric geometry.
We then design a graph search space based on the notion of smoothness between latent geometries and employ the calculated distances as an additional inductive bias.
arXiv Detail & Related papers (2023-09-09T14:29:22Z)
- Symmetry-Informed Geometric Representation for Molecules, Proteins, and Crystalline Materials [66.14337835284628]
We propose a platform, coined Geom3D, which enables benchmarking the effectiveness of geometric strategies.
Geom3D contains 16 advanced symmetry-informed geometric representation models and 14 geometric pretraining methods over 46 diverse datasets.
arXiv Detail & Related papers (2023-06-15T05:37:25Z)
- Exploring Data Geometry for Continual Learning [64.4358878435983]
We study continual learning from a novel perspective by exploring data geometry for the non-stationary stream of data.
Our method dynamically expands the geometry of the underlying space to match growing geometric structures induced by new data.
Experiments show that our method achieves better performance than baseline methods designed in Euclidean space.
arXiv Detail & Related papers (2023-04-08T06:35:25Z)
- Study of Manifold Geometry using Multiscale Non-Negative Kernel Graphs [32.40622753355266]
We propose a framework to study the geometric structure of the data.
We make use of our recently introduced non-negative kernel (NNK) regression graphs to estimate the point density, intrinsic dimension, and linearity (curvature) of the data manifold.
arXiv Detail & Related papers (2022-10-31T17:01:17Z)
- GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning [172.36214872466707]
We focus on solving geometric problems, which requires a comprehensive understanding of textual descriptions, visual diagrams, and theorem knowledge.
We propose a Geometric Question Answering dataset GeoQA, containing 5,010 geometric problems with corresponding annotated programs.
arXiv Detail & Related papers (2021-05-30T12:34:17Z)
- Quadric hypersurface intersection for manifold learning in feature space [52.83976795260532]
We propose a manifold learning technique suitable for moderately high dimension and large datasets.
The technique is learned from the training data in the form of an intersection of quadric hypersurfaces.
At test time, this manifold can be used to introduce an outlier score for arbitrary new points.
arXiv Detail & Related papers (2021-02-11T18:52:08Z)
- Computational Analysis of Deformable Manifolds: from Geometric Modelling to Deep Learning [0.0]
We will show that the diversity of non-flat spaces provides a rich area of study.
We will explore geometric methods for shape processing and data analysis.
arXiv Detail & Related papers (2020-09-03T16:50:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.