Related papers: Bayesian Optimization of Catalysts With In-context Learning

Bayesian Optimization of Catalysts With In-context Learning

URL: http://arxiv.org/abs/2304.05341v1
Date: Tue, 11 Apr 2023 17:00:35 GMT
Title: Bayesian Optimization of Catalysts With In-context Learning
Authors: Mayk Caldas Ramos, Shane S. Michtavy, Marc D. Porosoff, Andrew D. White
Abstract summary: Large language models (LLMs) are able to do accurate classification with zero or only a few examples. We show a prompting system that enables regression with uncertainty for in-context learning with frozen LLMs.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) are able to do accurate classification with zero or only a few examples (in-context learning). We show a prompting system that enables regression with uncertainty for in-context learning with frozen LLM (GPT-3, GPT-3.5, and GPT-4) models, allowing predictions without features or architecture tuning. By incorporating uncertainty, our approach enables Bayesian optimization for catalyst or molecule optimization using natural language, eliminating the need for training or simulation. Here, we performed the optimization using the synthesis procedure of catalysts to predict properties. Working with natural language mitigates difficulty synthesizability since the literal synthesis procedure is the model's input. We showed that in-context learning could improve past a model context window (maximum number of tokens the model can process at once) as data is gathered via example selection, allowing the model to scale better. Although our method does not outperform all baselines, it requires zero training, feature selection, and minimal computing while maintaining satisfactory performance. We also find Gaussian Process Regression on text embeddings is strong at Bayesian optimization. The code is available in our GitHub repository: https://github.com/ur-whitelab/BO-LIFT

Related papers

Sample, Don't Search: Rethinking Test-Time Alignment for Language Models [55.2480439325792]
We introduce QAlign, a new test-time alignment approach. As we scale test-time compute, QAlign converges to sampling from the optimal aligned distribution for each individual prompt. By adopting recent advances in Markov chain Monte Carlo for text generation, our method enables better-aligned outputs without modifying the underlying model or even requiring logit access.
arXiv Detail & Related papers (2025-04-04T00:41:40Z)
Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems. Such problems are encountered in medicine, physics, and machine learning. We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
Zeroth-Order Adaptive Neuron Alignment Based Pruning without Re-Training [3.195234044113248]
We exploit functional information from dense pre-trained models to obtain sparse models that maximize the activations' alignment w.r.t. We propose textscNeuroAl, a emphtop-up algorithm that modifies the block-wise and row-wise sparsity ratios to maximize the emphneuron alignment among activations. We test our method on 4 different LLM families and 3 different sparsity ratios, showing how it consistently outperforms the latest state-of-the-art techniques.
arXiv Detail & Related papers (2024-11-11T15:30:16Z)
Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods [69.36397993451742]
This work introduces Context-aware Prompt Tuning (CPT), a method inspired by ICL, PT, and adversarial attacks. We modify specific context tokens, considering the unique structure of input and output formats. Inspired by adversarial attacks, we adjust the input based on the labels present in the context, focusing on minimizing, rather than maximizing, the loss.
arXiv Detail & Related papers (2024-10-22T17:45:47Z)
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning [27.991291785091736]
This work studies the problem of large language model (LLM) unlearning, aiming to remove unwanted data influences. Despite the increasing demand for unlearning, a technically-grounded optimization framework is lacking. We propose a simple yet effective unlearning optimization framework, called SimNPO, showing that simplicity' in removing the reliance on a reference model benefits unlearning.
arXiv Detail & Related papers (2024-10-09T17:58:12Z)
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling [62.19438812624467]
Large language models (LLMs) have exhibited their problem-solving abilities in mathematical reasoning. We propose OptiBench, a benchmark for End-to-end optimization problem-solving with human-readable inputs and outputs.
arXiv Detail & Related papers (2024-07-13T13:27:57Z)
AdaLomo: Low-memory Optimization with Adaptive Learning Rate [59.64965955386855]
We introduce low-memory optimization with adaptive learning rate (AdaLomo) for large language models. AdaLomo results on par with AdamW, while significantly reducing memory requirements, thereby lowering the hardware barrier to training large language models.
arXiv Detail & Related papers (2023-10-16T09:04:28Z)
Learning to Optimize Quasi-Newton Methods [22.504971951262004]
This paper introduces a novel machine learning called LODO, which tries to online meta-learn the best preconditioner during optimization. Unlike other L2O methods, LODO does not require any meta-training on a training task distribution. We show that our gradient approximates the inverse Hessian in noisy loss landscapes and is capable of representing a wide range of inverse Hessians.
arXiv Detail & Related papers (2022-10-11T03:47:14Z)
AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective. We demonstrate that the proposed method finds the optimal data distribution faster (up to $50times$), with significantly reduced training data generation (up to $30times$) and better accuracy ($+8.7%$) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z)
Continuous Optimization Benchmarks by Simulation [0.0]
Benchmark experiments are required to test, compare, tune, and understand optimization algorithms. Data from previous evaluations can be used to train surrogate models which are then used for benchmarking. We show that the spectral simulation method enables simulation for continuous optimization problems.
arXiv Detail & Related papers (2020-08-14T08:50:57Z)
Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation with trained Gaussian processes trained on few data points. The approach also leads to significantly smaller and computationally cheaper sub solver for lower bounding. In total, we reduce time convergence by orders of orders of the proposed method.
arXiv Detail & Related papers (2020-05-21T20:59:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.