Learning to extrapolate using continued fractions: Predicting the
critical temperature of superconductor materials
- URL: http://arxiv.org/abs/2012.03774v3
- Date: Mon, 10 Jul 2023 06:38:22 GMT
- Title: Learning to extrapolate using continued fractions: Predicting the
critical temperature of superconductor materials
- Authors: Pablo Moscato, Mohammad Nazmul Haque, Kevin Huang, Julia Sloan, Jon C.
de Oliveira
- Abstract summary: In the field of Artificial Intelligence (AI) and Machine Learning (ML), the approximation of unknown target functions $y=f(\mathbf{x})$ is a common objective.
We refer to $S$ as the training set and aim to identify a low-complexity mathematical model that can effectively approximate this target function for new instances $\mathbf{x}$.
- Score: 5.905364646955811
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the field of Artificial Intelligence (AI) and Machine Learning (ML), the
approximation of unknown target functions $y=f(\mathbf{x})$ using limited
instances $S=\{(\mathbf{x}^{(i)},y^{(i)})\}$, where $\mathbf{x}^{(i)} \in D$ and
$D$ represents the domain of interest, is a common objective. We refer to $S$
as the training set and aim to identify a low-complexity mathematical model
that can effectively approximate this target function for new instances
$\mathbf{x}$. Consequently, the model's generalization ability is evaluated on
a separate set $T=\{\mathbf{x}^{(j)}\} \subset D$, where $T \neq S$, frequently
with $T \cap S = \emptyset$, to assess its performance beyond the training set.
However, certain applications require accurate approximation not only within
the original domain $D$ but also in an extended domain $D'$ that encompasses
$D$. This becomes particularly relevant in scenarios involving the design of
new structures, where minimizing errors in approximations is crucial. For
example, when developing new materials through data-driven approaches, the
AI/ML system can provide valuable insights to guide the design process by
serving as a surrogate function. Consequently, the learned model can be
employed to facilitate the design of new laboratory experiments. In this paper,
we propose a method for multivariate regression based on iterative fitting of a
continued fraction, incorporating additive spline models. We compare the
performance of our method with established techniques, including AdaBoost,
Kernel Ridge, Linear Regression, Lasso Lars, Linear Support Vector Regression,
Multi-Layer Perceptrons, Random Forests, Stochastic Gradient Descent, and
XGBoost. To evaluate these methods, we focus on an important problem in the
field: predicting the critical temperature of superconductors based on
physical-chemical characteristics.
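To make the idea concrete, here is a minimal sketch of regression via a truncated continued fraction, fitted term by term on residuals. The paper builds each term from additive spline models; this sketch uses plain linear least squares for every term for brevity, and all names (`fit_cf`, `predict_cf`, `depth`, `eps`) are illustrative rather than taken from the authors' code.

```python
import numpy as np

def fit_cf(X, y, depth=3, eps=1e-6):
    """Greedily fit f(x) ~ g0(x) + 1/(g1(x) + 1/(g2(x) + ...)),
    where each g_i is a linear model fitted by least squares."""
    X1 = np.c_[X, np.ones(len(X))]          # design matrix with a bias column
    terms, target = [], y.astype(float)
    for _ in range(depth):
        w, *_ = np.linalg.lstsq(X1, target, rcond=None)
        terms.append(w)
        resid = target - X1 @ w
        # the next level of the fraction approximates the reciprocal residual
        target = 1.0 / np.where(np.abs(resid) < eps, eps, resid)
    return terms

def predict_cf(terms, X, eps=1e-6):
    """Evaluate the continued fraction from the innermost term outward."""
    X1 = np.c_[X, np.ones(len(X))]
    val = X1 @ terms[-1]
    for w in reversed(terms[:-1]):
        val = X1 @ w + 1.0 / np.where(np.abs(val) < eps, eps, val)
    return val
```

A toy usage mirroring the extrapolation setting $D' \supset D$: train on one slice of the domain and score predictions on the held-out slice (synthetic data here, not the superconductivity dataset):

```python
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(500, 3))
y = 1.0 / (1.0 + X[:, 0] ** 2) + 0.1 * X[:, 1]   # rational-looking target
train = X[:, 0] < 1.0                             # D: left slice; D' adds the rest
terms = fit_cf(X[train], y[train], depth=3)
pred = predict_cf(terms, X[~train])
print("extrapolation RMSE:", np.sqrt(np.mean((pred - y[~train]) ** 2)))
```

Because each level only refits a simple model to transformed residuals, the depth of the fraction controls model complexity, in the same spirit as the paper's low-complexity objective.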
Related papers
- Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations [40.77319247558742]
We study the computational complexity of learning a target function $f_*:\mathbb{R}^d\to\mathbb{R}$ with additive structure.
We prove that a large subset of $f_*$ can be efficiently learned by gradient training of a two-layer neural network.
arXiv Detail & Related papers (2024-06-17T17:59:17Z)
- Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*(\langle\boldsymbol{x},\boldsymbol{\theta}\rangle)$ under isotropic Gaussian data.
We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ of arbitrary link function with a sample and runtime complexity of $n \asymp T \asymp C(q) \cdot d$.
arXiv Detail & Related papers (2024-06-03T17:56:58Z)
- Agnostic Active Learning of Single Index Models with Linear Sample Complexity [27.065175036001246]
We study active learning methods for single index models of the form $F(\mathbf{x}) = f(\langle \mathbf{w}, \mathbf{x}\rangle)$.
arXiv Detail & Related papers (2024-05-15T13:11:28Z)
- Agnostically Learning Multi-index Models with Queries [54.290489524576756]
We study the power of query access for the task of agnostic learning under the Gaussian distribution.
We show that query access gives significant runtime improvements over random examples for agnostically learning MIMs.
arXiv Detail & Related papers (2023-12-27T15:50:47Z)
- Regret-Optimal Federated Transfer Learning for Kernel Regression with Applications in American Option Pricing [8.723136784230906]
We propose an optimal iterative scheme for federated transfer learning, where a central planner has access to datasets.
Our objective is to minimize the cumulative deviation of the generated parameters $\{\theta_i(t)\}_{t=0}^{T}$ across all $T$ iterations.
By leveraging symmetries within the regret-optimal algorithm, we develop a nearly regret-optimal algorithm that runs with $\mathcal{O}(Np^2)$ fewer elementary operations.
arXiv Detail & Related papers (2023-09-08T19:17:03Z)
- High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation [89.21686761957383]
We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a two-layer network.
Our results demonstrate that even one step can lead to a considerable advantage over random features.
arXiv Detail & Related papers (2022-05-03T12:09:59Z)
- Randomized Exploration for Reinforcement Learning with General Value Function Approximation [122.70803181751135]
We propose a model-free reinforcement learning algorithm inspired by the popular randomized least squares value iteration (RLSVI) algorithm.
Our algorithm drives exploration by simply perturbing the training data with judiciously chosen i.i.d. scalar noises.
We complement the theory with an empirical evaluation across known difficult exploration tasks.
arXiv Detail & Related papers (2021-06-15T02:23:07Z)
- Fast Approximate Multi-output Gaussian Processes [6.6174748514131165]
Training with the proposed approach requires computing only an $N \times n$ eigenfunction matrix and an $n \times n$ inverse, where $n$ is a selected number of eigenvalues.
The proposed method can regress over multiple outputs, estimate the derivative of the regressor of any order, and learn the correlations between them.
arXiv Detail & Related papers (2020-08-22T14:34:45Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance; and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- Statistical-Query Lower Bounds via Functional Gradients [19.5924910463796]
We show that any statistical-query algorithm with tolerance $n^{-(1/\epsilon)^b}$ must use at least $2^{n^c \epsilon}$ queries for some constant $b$.
Our results rule out general (as opposed to correlational) SQ learning algorithms, which is unusual for real-valued learning problems.
arXiv Detail & Related papers (2020-06-29T05:15:32Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2})+\epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)