ExPT: Synthetic Pretraining for Few-Shot Experimental Design
- URL: http://arxiv.org/abs/2310.19961v1
- Date: Mon, 30 Oct 2023 19:25:43 GMT
- Title: ExPT: Synthetic Pretraining for Few-Shot Experimental Design
- Authors: Tung Nguyen, Sudhanshu Agrawal, Aditya Grover
- Abstract summary: Experiment Pretrained Transformers (ExPT) is a foundation model for few-shot experimental design.
ExPT employs a novel combination of synthetic pretraining with in-context learning.
We evaluate ExPT on few-shot experimental design in challenging domains.
- Score: 33.5918976228562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Experimental design is a fundamental problem in many science and engineering
fields. In this problem, sample efficiency is crucial due to the time, money,
and safety costs of real-world design evaluations. Existing approaches either
rely on active data collection or access to large, labeled datasets of past
experiments, making them impractical in many real-world scenarios. In this
work, we address the more challenging yet realistic setting of few-shot
experimental design, where only a few labeled data points of input designs and
their corresponding values are available. We approach this problem as a
conditional generation task, where a model conditions on a few labeled examples
and the desired output to generate an optimal input design. To this end, we
introduce Experiment Pretrained Transformers (ExPT), a foundation model for
few-shot experimental design that employs a novel combination of synthetic
pretraining with in-context learning. In ExPT, we only assume knowledge of a
finite collection of unlabelled data points from the input domain and pretrain
a transformer neural network to optimize diverse synthetic functions defined
over this domain. Unsupervised pretraining allows ExPT to adapt to any design
task at test time in an in-context fashion by conditioning on a few labeled
data points from the target task and generating the candidate optima. We
evaluate ExPT on few-shot experimental design in challenging domains and
demonstrate its superior generality and performance compared to existing
methods. The source code is available at https://github.com/tung-nd/ExPT.git.
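To make the recipe concrete, below is a minimal, hypothetical sketch of the synthetic-pretraining loop: random MLPs stand in for the synthetic function family, and a small transformer conditions on a few (x, y) pairs plus a desired value to emit a candidate input. The class and function names are illustrative; the actual ExPT architecture, function sampling, and training objective are specified in the paper and repository.

```python
# A minimal, hypothetical sketch of ExPT-style synthetic pretraining.
# Random MLPs stand in for the synthetic function family; the real model,
# function generation, and objective are defined in the paper/repo.
import torch
import torch.nn as nn

class InContextDesigner(nn.Module):
    """Maps (context pairs, desired y) -> candidate design x."""
    def __init__(self, x_dim, d_model=64, n_layers=2):
        super().__init__()
        self.embed_pair = nn.Linear(x_dim + 1, d_model)  # embed (x, y) tokens
        self.embed_query = nn.Linear(1, d_model)         # embed the desired value
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, x_dim)            # decode a candidate x

    def forward(self, ctx_x, ctx_y, target_y):
        pairs = self.embed_pair(torch.cat([ctx_x, ctx_y], dim=-1))
        query = self.embed_query(target_y).unsqueeze(1)
        h = self.encoder(torch.cat([pairs, query], dim=1))
        return self.head(h[:, -1])                       # read out at the query token

def sample_synthetic_fn(x_dim):
    """One synthetic objective: a small, randomly initialized, frozen MLP."""
    f = nn.Sequential(nn.Linear(x_dim, 32), nn.Tanh(), nn.Linear(32, 1))
    for p in f.parameters():
        p.requires_grad_(False)
    return f

x_dim, n_ctx = 4, 8
model = InContextDesigner(x_dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(200):                          # synthetic pretraining loop
    f = sample_synthetic_fn(x_dim)
    xs = torch.randn(1, n_ctx + 1, x_dim)        # stand-in for unlabeled domain points
    ys = f(xs)
    ctx_x, ctx_y = xs[:, :-1], ys[:, :-1]        # few-shot context
    held_x, held_y = xs[:, -1], ys[:, -1]        # condition on y, reconstruct x
    pred_x = model(ctx_x, ctx_y, held_y)
    loss = ((pred_x - held_x) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# At test time the same forward pass conditions on real labeled pairs and a
# high desired value to propose a candidate optimum.
```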
Related papers
- AExGym: Benchmarks and Environments for Adaptive Experimentation [7.948144726705323]
We present a benchmark for adaptive experimentation based on real-world datasets.
We highlight prominent practical challenges to operationalizing adaptivity: non-stationarity, batched/delayed feedback, multiple outcomes and objectives, and external validity.
arXiv Detail & Related papers (2024-08-08T15:32:12Z)
- Adaptive Experimentation When You Can't Experiment [55.86593195947978]
This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem.
Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
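For readers unfamiliar with encouragement designs, the toy simulation below illustrates the general mechanism (it is not the paper's CPET-LB algorithm): the encouragement is randomized, treatment uptake is confounded, and a Wald-style ratio still recovers the effect.

```python
# Toy randomized-encouragement simulation; illustrative only, not CPET-LB.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
u = rng.normal(size=n)                              # unobserved confounder
z = rng.integers(0, 2, n)                           # randomized encouragement
p_treat = 1 / (1 + np.exp(-(u - 1.0 + 2.0 * z)))    # uptake depends on u and z
t = rng.random(n) < p_treat                         # treatment is confounded by u
y = 2.0 * t + u + rng.normal(size=n)                # true treatment effect is 2.0

naive = y[t].mean() - y[~t].mean()                  # biased by the confounder
itt = y[z == 1].mean() - y[z == 0].mean()           # encouragement effect on y
uptake = t[z == 1].mean() - t[z == 0].mean()        # encouragement effect on t
print(f"naive={naive:.2f}  wald={itt / uptake:.2f}")  # Wald ratio is close to 2.0
```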
arXiv Detail & Related papers (2024-06-15T20:54:48Z)
- Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient [52.2669490431145]
PropEn is inspired by 'matching', which enables implicit guidance without training a discriminator.
We show that training with a matched dataset approximates the gradient of the property of interest while remaining within the data distribution.
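A hedged sketch of the matching idea follows, under illustrative assumptions (the pairing rule, distance threshold, and the linear stand-in for PropEn's learned model are not the paper's exact recipe): each sample is paired with a nearby neighbour of higher property value, and a model trained on these pairs moves designs uphill while staying near the data.

```python
# Illustrative sketch of property-guided matching; not PropEn's exact recipe.
import numpy as np

def build_matched_dataset(X, props, dist_thresh=1.0):
    """Pair each x_i with nearby x_j whose property is higher."""
    src, dst = [], []
    for i in range(len(X)):
        for j in range(len(X)):
            if i != j and props[j] > props[i] \
                    and np.linalg.norm(X[i] - X[j]) < dist_thresh:
                src.append(X[i]); dst.append(X[j])
    return np.array(src), np.array(dst)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
props = -np.sum(X**2, axis=1)                # toy property, maximized at the origin
src, dst = build_matched_dataset(X, props)

# Least-squares map x -> x' as a stand-in for PropEn's learned model.
W, *_ = np.linalg.lstsq(src, dst, rcond=None)
x = np.array([1.5, -1.0])
for _ in range(5):                           # iterative refinement at inference time
    x = x @ W                                # each application should raise the property
print(x, -np.sum(x**2))
```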
arXiv Detail & Related papers (2024-05-28T11:30:19Z)
- An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models [55.01592097059969]
Supervised finetuning on instruction datasets has played a crucial role in achieving the remarkable zero-shot generalization capabilities of large language models.
Active learning is effective in identifying useful subsets of samples to annotate from an unlabeled pool.
We propose using experimental design to circumvent the computational bottlenecks of active learning.
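As one generic example of a design-based selection rule (a stand-in, not necessarily the paper's strategy), the sketch below greedily selects a representative subset of prompt embeddings in a single pass, with no model-retraining loop as in active learning.

```python
# Greedy facility-location subset selection; a generic illustration of
# one-shot, design-style data selection over embeddings.
import numpy as np

def greedy_facility_location(emb, k):
    """Pick k indices maximizing coverage: sum_i max_{s in S} sim(i, s)."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T                              # cosine similarities
    best = np.full(len(emb), -np.inf)              # best similarity to chosen set
    chosen = []
    for _ in range(k):
        gains = np.maximum(sim, best).sum(axis=1)  # coverage if candidate c is added
        gains[chosen] = -np.inf                    # don't pick the same point twice
        c = int(np.argmax(gains))
        chosen.append(c)
        best = np.maximum(best, sim[c])
    return chosen

pool = np.random.default_rng(0).normal(size=(500, 32))  # unlabeled-prompt embeddings
print(greedy_facility_location(pool, k=10))             # indices to annotate once
```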
arXiv Detail & Related papers (2024-01-12T16:56:54Z)
- Task-specific experimental design for treatment effect estimation [59.879567967089145]
Large randomised controlled trials (RCTs) are the standard for causal inference.
Recent work has proposed more sample-efficient alternatives to RCTs, but these are not adaptable to the downstream application for which the causal effect is sought.
We develop a task-specific approach to experimental design and derive sampling strategies customised to particular downstream applications.
arXiv Detail & Related papers (2023-06-08T18:10:37Z)
- Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond [52.656743602538825]
Fine-tuning pre-trained code models incurs a large computational cost.
We conduct an experimental study to explore what happens to layer-wise pre-trained representations and their encoded code knowledge during fine-tuning.
We propose Telly to efficiently fine-tune pre-trained code models via layer freezing.
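The sketch below shows the basic layer-freezing mechanic, assuming a BERT-style encoder from the transformers library whose layers are exposed as model.encoder.layer; Telly's actual layer-selection procedure is more involved.

```python
# Minimal layer-freezing sketch in the spirit of Telly; the model choice and
# the number of frozen layers are illustrative.
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("microsoft/codebert-base")  # illustrative choice
FREEZE_UP_TO = 8                       # freeze embeddings + first 8 encoder layers

for p in model.embeddings.parameters():
    p.requires_grad = False
for layer in model.encoder.layer[:FREEZE_UP_TO]:
    for p in layer.parameters():
        p.requires_grad = False

# Only the upper, unfrozen layers receive gradient updates during fine-tuning.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=2e-5)
```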
arXiv Detail & Related papers (2023-04-11T13:34:13Z)
- Meta Input: How to Leverage Off-the-Shelf Deep Neural Networks [29.975937981538664]
We introduce a novel approach that allows end-users to exploit pretrained DNN models in their own testing environment without modifying the models.
We present a meta input, which is an additional input transforming the distribution of testing data to be aligned with that of training data.
As a result, end-users can exploit well-trained models in their own testing environment which can differ from the training environment.
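A hedged sketch of the idea, assuming a small set of labeled samples from the deployment environment is available: the pretrained model stays frozen, and only an additive input is learned.

```python
# Illustrative meta-input sketch; the toy model, data, and loss are assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 3))  # stand-in DNN
for p in model.parameters():
    p.requires_grad = False                        # off-the-shelf model is untouched

meta_input = torch.zeros(16, requires_grad=True)   # the learned additive input
opt = torch.optim.Adam([meta_input], lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

test_x = torch.randn(32, 16) + 2.0                 # shifted test-time distribution
test_y = torch.randint(0, 3, (32,))                # a few labeled deployment samples

for _ in range(100):
    logits = model(test_x + meta_input)            # only the input is adapted
    loss = loss_fn(logits, test_y)
    opt.zero_grad(); loss.backward(); opt.step()
```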
arXiv Detail & Related papers (2022-10-21T02:11:38Z)
- Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization [18.791701342934605]
The Query Focused Text Summarization (QFTS) task aims at building systems that generate the summary of the text document(s) based on a given query.
A key challenge in addressing this task is the lack of large labeled data for training the summarization model.
We address this challenge by exploring a series of domain adaptation techniques.
arXiv Detail & Related papers (2021-12-22T05:34:56Z)
- Reinforcement Learning based Sequential Batch-sampling for Bayesian Optimal Experimental Design [1.6249267147413522]
Sequential design of experiments (SDOE) is a popular suite of methods that has yielded promising results in recent years.
In this work, we aim to extend the SDOE strategy to query the experiment or computer code at a batch of inputs.
A unique capability of the proposed methodology is that, once trained, it can be applied to multiple tasks, for example the optimization of a function.
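A heavily simplified sketch of this framing follows (illustrative only; the paper's policy, state representation, and reward design differ): a policy proposes a batch of query locations and is trained with REINFORCE to maximize the best value found on random toy objectives.

```python
# Toy REINFORCE sketch of policy-based batch sampling; not the paper's method.
import torch
import torch.nn as nn

BATCH, DIM = 4, 1
policy = nn.Sequential(nn.Linear(BATCH * DIM + BATCH, 32), nn.Tanh(),
                       nn.Linear(32, 2 * BATCH * DIM))   # means and log-stds
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(500):
    c = torch.rand(1) * 4 - 2                    # random toy objective: -(x - c)^2
    xs0 = torch.rand(BATCH, DIM) * 4 - 2         # initial random batch of queries
    ys0 = -(xs0 - c) ** 2
    state = torch.cat([xs0.flatten(), ys0.flatten()])
    out = policy(state)
    mu, log_std = out[:BATCH * DIM], out[BATCH * DIM:]
    dist = torch.distributions.Normal(mu, log_std.exp())
    xs1 = dist.sample()                          # the next batch of experiments
    reward = (-(xs1.view(BATCH, DIM) - c) ** 2).max()  # best value in the batch
    loss = -dist.log_prob(xs1).sum() * reward    # REINFORCE (no baseline, noisy)
    opt.zero_grad(); loss.backward(); opt.step()
```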
arXiv Detail & Related papers (2021-12-21T02:25:23Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To reduce the computational burden of training on this enlarged dataset, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
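A hedged sketch of one common dataset-distillation recipe, distribution matching (the paper's exact strategy may differ): a few synthetic images per class are optimized so their features match the class's real feature statistics.

```python
# Distribution-matching distillation sketch; extractor and data are stand-ins.
import torch
import torch.nn as nn

feat = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())  # frozen extractor
for p in feat.parameters():
    p.requires_grad = False

real = torch.randn(500, 1, 28, 28)                   # real images of one class (toy)
syn = torch.randn(5, 1, 28, 28, requires_grad=True)  # five distilled images
opt = torch.optim.Adam([syn], lr=0.1)

for _ in range(300):
    loss = (feat(real).mean(0) - feat(syn).mean(0)).pow(2).sum()
    opt.zero_grad(); loss.backward(); opt.step()
# `syn` now acts as a compact class-wise surrogate for the real images.
```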
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.