Bayesian Optimization of Catalysts With In-context Learning
- URL: http://arxiv.org/abs/2304.05341v1
- Date: Tue, 11 Apr 2023 17:00:35 GMT
- Title: Bayesian Optimization of Catalysts With In-context Learning
- Authors: Mayk Caldas Ramos, Shane S. Michtavy, Marc D. Porosoff, Andrew D.
White
- Abstract summary: Large language models (LLMs) are able to do accurate classification with zero or only a few examples.
We show a prompting system that enables regression with uncertainty for in-context learning with frozen LLMs.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) are able to do accurate classification with zero
or only a few examples (in-context learning). We show a prompting system that
enables regression with uncertainty for in-context learning with frozen LLM
(GPT-3, GPT-3.5, and GPT-4) models, allowing predictions without features or
architecture tuning. By incorporating uncertainty, our approach enables
Bayesian optimization for catalyst or molecule optimization using natural
language, eliminating the need for training or simulation. Here, we performed
the optimization using the synthesis procedure of catalysts to predict
properties. Working with natural language mitigates difficulty synthesizability
since the literal synthesis procedure is the model's input. We showed that
in-context learning could improve past a model context window (maximum number
of tokens the model can process at once) as data is gathered via example
selection, allowing the model to scale better. Although our method does not
outperform all baselines, it requires zero training, feature selection, and
minimal computing while maintaining satisfactory performance. We also find
Gaussian Process Regression on text embeddings is strong at Bayesian
optimization. The code is available in our GitHub repository:
https://github.com/ur-whitelab/BO-LIFT
Related papers
- Low-Redundant Optimization for Large Language Model Alignment [126.34547428473968]
Large language models (LLMs) are still struggling in aligning with human preference in complex tasks and scenarios.
We propose a low-redundant alignment method named textbfALLO, focusing on optimizing the most related neurons with the most useful supervised signals.
Experimental results on 10 datasets have shown the effectiveness of ALLO.
arXiv Detail & Related papers (2024-06-18T13:34:40Z) - LP++: A Surprisingly Strong Linear Probe for Few-Shot CLIP [20.86307407685542]
Linear Probe (LP) has been often reported as a weak baseline for few-shot CLIP adaptation.
In this work, we examine from convex-optimization perspectives a generalization of the standard LP baseline.
Our image-language objective function, along with these non-trivial optimization insights and ingredients, yields, surprisingly, highly competitive few-shot CLIP performances.
arXiv Detail & Related papers (2024-04-02T20:23:10Z) - Data-driven Prior Learning for Bayesian Optimisation [5.199765487172328]
We show that PLeBO and prior transfer find good inputs in fewer evaluations.
We validate the learned priors and compare to a breadth of transfer learning approaches.
We show that PLeBO and prior transfer find good inputs in fewer evaluations.
arXiv Detail & Related papers (2023-11-24T18:37:52Z) - AdaLomo: Low-memory Optimization with Adaptive Learning Rate [59.64965955386855]
We introduce low-memory optimization with adaptive learning rate (AdaLomo) for large language models.
AdaLomo results on par with AdamW, while significantly reducing memory requirements, thereby lowering the hardware barrier to training large language models.
arXiv Detail & Related papers (2023-10-16T09:04:28Z) - Learning to Optimize Quasi-Newton Methods [22.504971951262004]
This paper introduces a novel machine learning called LODO, which tries to online meta-learn the best preconditioner during optimization.
Unlike other L2O methods, LODO does not require any meta-training on a training task distribution.
We show that our gradient approximates the inverse Hessian in noisy loss landscapes and is capable of representing a wide range of inverse Hessians.
arXiv Detail & Related papers (2022-10-11T03:47:14Z) - Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than
In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
parameter-efficient fine-tuning offers an alternative paradigm where a small set of parameters are trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z) - Implicit Parameter-free Online Learning with Truncated Linear Models [51.71216912089413]
parameter-free algorithms are online learning algorithms that do not require setting learning rates.
We propose new parameter-free algorithms that can take advantage of truncated linear models through a new update that has an "implicit" flavor.
Based on a novel decomposition of the regret, the new update is efficient, requires only one gradient at each step, never overshoots the minimum of the truncated model, and retains the favorable parameter-free properties.
arXiv Detail & Related papers (2022-03-19T13:39:49Z) - AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective.
We demonstrate that the proposed method finds the optimal data distribution faster (up to $50times$), with significantly reduced training data generation (up to $30times$) and better accuracy ($+8.7%$) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z) - Continuous Optimization Benchmarks by Simulation [0.0]
Benchmark experiments are required to test, compare, tune, and understand optimization algorithms.
Data from previous evaluations can be used to train surrogate models which are then used for benchmarking.
We show that the spectral simulation method enables simulation for continuous optimization problems.
arXiv Detail & Related papers (2020-08-14T08:50:57Z) - Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation with trained Gaussian processes trained on few data points.
The approach also leads to significantly smaller and computationally cheaper sub solver for lower bounding.
In total, we reduce time convergence by orders of orders of the proposed method.
arXiv Detail & Related papers (2020-05-21T20:59:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.