Optimal radial basis for density-based atomic representations
- URL: http://arxiv.org/abs/2105.08717v1
- Date: Tue, 18 May 2021 17:57:08 GMT
- Title: Optimal radial basis for density-based atomic representations
- Authors: Alexander Goscinski, Félix Musil, Sergey Pozdnyakov, and Michele Ceriotti
- Abstract summary: We discuss how to build an adaptive, optimal numerical basis that is chosen to represent most efficiently the structural diversity of the dataset at hand.
For each training dataset, this optimal basis is unique, and can be computed at no additional cost with respect to the primitive basis.
We demonstrate that this construction yields representations that are accurate and computationally efficient.
- Score: 58.720142291102135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The input of almost every machine learning algorithm targeting the properties
of matter at the atomic scale involves a transformation of the list of
Cartesian atomic coordinates into a more symmetric representation. Many of
the most popular representations can be seen as an expansion of the
symmetrized correlations of the atom density, and differ mainly by the choice
of basis. Here we discuss how to build an adaptive, optimal numerical basis
that is chosen to represent most efficiently the structural diversity of the
dataset at hand. For each training dataset, this optimal basis is unique, and
can be computed at no additional cost with respect to the primitive basis by
approximating it with splines. We demonstrate that this construction yields
representations that are accurate and computationally efficient, presenting
examples that involve both molecular and condensed-phase machine-learning
models.
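One way to picture a basis that represents a dataset "most efficiently" is a principal-component contraction of the primitive-basis expansion coefficients. The sketch below is only an illustration under assumptions not stated in the abstract: the array `coeffs`, the function name `optimal_contraction`, the synthetic data, and the use of an uncentered covariance are all hypothetical choices, not the authors' implementation.

```python
import numpy as np

def optimal_contraction(coeffs, n_opt):
    """Return the n_opt directions of largest variance in the primitive basis.

    coeffs: (n_samples, n_prim) expansion coefficients (hypothetical input).
    Returns (U, variances), with U of shape (n_prim, n_opt).
    """
    # Uncentered covariance of the coefficients (an assumption of this sketch).
    cov = coeffs.T @ coeffs / coeffs.shape[0]
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the n_opt eigenvectors with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_opt]
    return eigvecs[:, order], eigvals[order]

# Synthetic stand-in for a dataset: 200 environments, 8 primitive functions,
# but only 3 independent directions of variation.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 8))
coeffs = latent @ mixing

U, variances = optimal_contraction(coeffs, n_opt=3)
contracted = coeffs @ U            # coefficients in the contracted basis
reconstructed = contracted @ U.T   # mapped back to the primitive basis
residual = np.linalg.norm(coeffs - reconstructed) / np.linalg.norm(coeffs)
```

In this toy setting three contracted functions reproduce the eight primitive coefficients up to numerical precision, because the synthetic data has only three independent directions; real datasets would show a gradual decay of the retained variance instead.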
Related papers
- Quantization of Large Language Models with an Overdetermined Basis [73.79368761182998]
We introduce an algorithm for data quantization based on the principles of Kashin representation.
Our findings demonstrate that Kashin Quantization achieves competitive or superior quality in model performance.
arXiv Detail & Related papers (2024-04-15T12:38:46Z)
- Spectral Estimators for Structured Generalized Linear Models via Approximate Message Passing [28.91482208876914]
We consider the problem of parameter estimation in a high-dimensional generalized linear model.
Despite their wide use, a rigorous performance characterization, as well as a principled way to preprocess the data, are available only for unstructured designs.
arXiv Detail & Related papers (2023-08-28T11:49:23Z)
- Wigner kernels: body-ordered equivariant machine learning without a basis [0.0]
We propose a novel density-based method which involves computing "Wigner kernels".
Wigner kernels are fully equivariant and body-ordered kernels that can be computed iteratively with a cost that is independent of the radial-chemical basis.
We present several examples of the accuracy of models based on Wigner kernels in chemical applications.
arXiv Detail & Related papers (2023-03-07T18:34:55Z)
- Isotropic Gaussian Processes on Finite Spaces of Graphs [71.26737403006778]
We propose a principled way to define Gaussian process priors on various sets of unweighted graphs.
We go further to consider sets of equivalence classes of unweighted graphs and define the appropriate versions of priors thereon.
Inspired by applications in chemistry, we illustrate the proposed techniques on a real molecular property prediction task in the small data regime.
arXiv Detail & Related papers (2022-11-03T10:18:17Z)
- Tensor-reduced atomic density representations [0.0]
Graph neural networks escape the growth of descriptor size with the number of chemical elements by mapping chemical element information into a fixed-dimensional space in a learnable way.
We recast this approach as tensor factorisation by exploiting the tensor structure of standard neighbour density based descriptors.
In doing so, we form compact tensor-reduced representations whose size does not depend on the number of chemical elements.
arXiv Detail & Related papers (2022-10-02T01:08:50Z)
- A smooth basis for atomistic machine learning [0.0]
We investigate the basis that results from the solution of the Laplacian eigenvalue problem within a sphere around the atom of interest.
We show that this generates the smoothest possible basis of a given size within the sphere.
We consider several unsupervised metrics of the quality of a basis for a given dataset, and show that the Laplacian eigenstate basis has a performance that is much better than some widely used basis sets.
arXiv Detail & Related papers (2022-09-05T13:00:51Z)
- Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation [58.80806716024701]
We study the global structure of attention scores computed using dot-product based self-attention.
We find that most of the variation among attention scores lies in a low-dimensional eigenspace.
We propose to compute scores only for a partial subset of token pairs, and use them to estimate scores for the remaining pairs.
arXiv Detail & Related papers (2021-06-16T14:38:42Z)
- Building powerful and equivariant graph neural networks with structural message-passing [74.93169425144755]
We propose a powerful and equivariant message-passing framework based on two ideas.
First, we propagate a one-hot encoding of the nodes, in addition to the features, in order to learn a local context matrix around each node.
Second, we propose methods for the parametrization of the message and update functions that ensure permutation equivariance.
arXiv Detail & Related papers (2020-06-26T17:15:16Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)