SynBench: Task-Agnostic Benchmarking of Pretrained Representations using
Synthetic Data
- URL: http://arxiv.org/abs/2210.02989v2
- Date: Fri, 7 Oct 2022 04:07:50 GMT
- Title: SynBench: Task-Agnostic Benchmarking of Pretrained Representations using
Synthetic Data
- Authors: Ching-Yun Ko, Pin-Yu Chen, Jeet Mohapatra, Payel Das, Luca Daniel
- Abstract summary: Recent success in fine-tuning large models, which are pretrained on broad data at scale, on downstream tasks has led to a significant paradigm shift in deep learning.
This paper proposes a new task-agnostic framework, SynBench, to measure the quality of pretrained representations using synthetic data.
- Score: 78.21197488065177
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent success in fine-tuning large models, which are pretrained on broad
data at scale, on downstream tasks has led to a significant paradigm shift in deep
learning, from task-centric model design to task-agnostic representation
learning and task-specific fine-tuning. As the representations of pretrained
models are used as a foundation for different downstream tasks, this paper
proposes a new task-agnostic framework, \textit{SynBench}, to measure the
quality of pretrained representations using synthetic data. We establish a
reference using the theoretically derived robustness-accuracy tradeoff of a
class-conditional Gaussian mixture. Given a pretrained model, the representations
of data synthesized from the Gaussian mixture are compared against this
reference to infer the quality. By comparing the area-under-curve ratio
between the raw data and their representations, SynBench offers a quantifiable
score for robustness-accuracy performance benchmarking. Our framework applies
to a wide range of pretrained models taking continuous data inputs and is
independent of the downstream tasks and datasets. Evaluated with several
pretrained vision transformer models, the experimental results show that our
SynBench score agrees well with the actual linear probing performance of the
pretrained model when fine-tuned on downstream tasks. Moreover, our framework
can be used to inform the design of robust linear probing on pretrained
representations to mitigate the robustness-accuracy tradeoff in downstream
tasks.
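To make the scoring procedure concrete, the following is a minimal, illustrative sketch of a SynBench-style computation, not the authors' implementation. Assumptions: two classes drawn from N(+mu, I) and N(-mu, I); the idealized reference robust accuracy at l2 budget eps is taken to be Phi(||mu|| - eps), the closed form for this isotropic mixture; and a random nonlinear map stands in for a real pretrained encoder. All function and variable names are illustrative.

```python
"""Minimal sketch of a SynBench-style robustness-accuracy score (illustrative only)."""
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# --- synthesize class-conditional Gaussian mixture data ---
d, n = 64, 4000
mu = np.full(d, 0.3)                      # class mean; ||mu|| controls task difficulty
y = rng.choice([-1.0, 1.0], size=n)
x = y[:, None] * mu + rng.standard_normal((n, d))

# --- stand-in "pretrained encoder" (replace with a real model's features) ---
W = rng.standard_normal((d, 32)) / np.sqrt(d)
z = np.tanh(x @ W)

def robust_accuracy_curve(feats, labels, eps_grid):
    """Empirical eps-robust accuracy of a mean-difference linear probe."""
    mu_pos = feats[labels > 0].mean(axis=0)
    mu_neg = feats[labels < 0].mean(axis=0)
    w = mu_pos - mu_neg
    b = -0.5 * (mu_pos + mu_neg) @ w
    margins = labels * (feats @ w + b) / np.linalg.norm(w)
    # a point stays correctly classified under any l2 perturbation of size eps iff margin >= eps
    return np.array([(margins >= e).mean() for e in eps_grid])

eps_grid = np.linspace(0.0, 3.0, 61)

# reference: idealized robust accuracy on the raw Gaussian mixture
ref_curve = norm.cdf(np.linalg.norm(mu) - eps_grid)
rep_curve = robust_accuracy_curve(z, y, eps_grid)

# SynBench-style score: ratio of areas under the robustness-accuracy curves
score = np.trapz(rep_curve, eps_grid) / np.trapz(ref_curve, eps_grid)
print(f"area ratio (representation vs. reference): {score:.3f}")
```

In practice, z would be the representations produced by the pretrained model under evaluation, and the epsilon grid and probing protocol would follow the paper.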
Related papers
- ImageNet-RIB Benchmark: Large Pre-Training Datasets Don't Guarantee Robustness after Fine-Tuning [30.422932548359952]
We introduce a new robust fine-tuning benchmark, ImageNet-RIB (Robustness Inheritance Benchmark)
The benchmark consists of related but distinct specialized (downstream) tasks.
We find that the continual learning methods EWC and LwF maintain robustness after fine-tuning.
arXiv Detail & Related papers (2024-10-28T22:33:22Z)
- DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation [57.11544252399801]
We propose DaWin, a training-free dynamic weight interpolation method that leverages the entropy of individual models over each unlabeled test sample (see the sketch after this entry).
Unlike previous works that typically rely on additional training to learn such coefficients, our approach requires no training.
Results demonstrate that DaWin achieves significant performance gains in the considered settings, with minimal computational overhead.
arXiv Detail & Related papers (2024-10-03T16:25:35Z)
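As a rough illustration of the entropy-based weighting idea in the entry above, the sketch below mixes two models' predicted distributions per test sample, giving more weight to the lower-entropy (more confident) model. This is a simplification that operates on outputs; DaWin itself interpolates model weights, and its exact coefficient rule may differ. All names here are illustrative.

```python
"""Illustrative entropy-based per-sample mixing of two models' predictions."""
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy(p, axis=-1, eps=1e-12):
    return -(p * np.log(p + eps)).sum(axis=axis)

def dynamic_combine(logits_zeroshot, logits_finetuned):
    """Per-sample mixing: the lower-entropy (more confident) model gets more weight."""
    p0, p1 = softmax(logits_zeroshot), softmax(logits_finetuned)
    h0, h1 = entropy(p0), entropy(p1)
    lam = h0 / (h0 + h1 + 1e-12)          # weight on the fine-tuned model
    return (1 - lam)[:, None] * p0 + lam[:, None] * p1

# toy usage with random logits for 5 samples and 3 classes
rng = np.random.default_rng(0)
mixed = dynamic_combine(rng.normal(size=(5, 3)), rng.normal(size=(5, 3)))
print(mixed.argmax(axis=1))
```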
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- Feedback-guided Data Synthesis for Imbalanced Classification [10.836265321046561]
We introduce a framework for augmenting static datasets with useful synthetic samples.
We find that the samples must be close to the support of the real data of the task at hand, and be sufficiently diverse.
On ImageNet-LT, we achieve state-of-the-art results, with over 4 percent improvement on underrepresented classes.
arXiv Detail & Related papers (2023-09-29T21:47:57Z)
- Too Fine or Too Coarse? The Goldilocks Composition of Data Complexity for Robust Left-Right Eye-Tracking Classifiers [0.0]
We train machine learning models utilizing a mixed dataset composed of both fine- and coarse-grain data.
For our purposes, finer-grain data refers to data collected using more complex methods, whereas coarser-grain data refers to data collected using simpler methods.
arXiv Detail & Related papers (2022-08-24T23:18:08Z)
- Task2Sim: Towards Effective Pre-training and Transfer from Synthetic Data [74.66568380558172]
We study the transferability of pre-trained models based on synthetic data generated by graphics simulators to downstream tasks.
We introduce Task2Sim, a unified model mapping downstream task representations to optimal simulation parameters.
It learns this mapping by training to find the best simulation parameters for a set of "seen" tasks.
Once trained, it can then be used to predict best simulation parameters for novel "unseen" tasks in one shot.
arXiv Detail & Related papers (2021-11-30T19:25:27Z)
- Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe that the span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models (a standard item-response formulation is sketched after this entry).
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
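For context on the entry above, one common item-response formulation (not necessarily the exact variant used in that paper) is the two-parameter logistic model, which scores each (model, example) pair as

\[
P(\text{model } i \text{ answers example } j \text{ correctly}) = \frac{1}{1 + e^{-a_j(\theta_i - b_j)}},
\]

where \theta_i is the latent ability of model i, b_j the difficulty of example j, and a_j its discrimination; examples with high discrimination are the ones that best separate strong models from weak ones.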
- Deep Ensembles for Low-Data Transfer Learning [21.578470914935938]
We study different ways of creating ensembles from pre-trained models.
We show that the nature of pre-training itself is an effective source of diversity.
We propose a practical algorithm that efficiently identifies a subset of pre-trained models for any downstream dataset.
arXiv Detail & Related papers (2020-10-14T07:59:00Z)
- Do Adversarially Robust ImageNet Models Transfer Better? [102.09335596483695]
Adversarially robust models often perform better than their standard-trained counterparts when used for transfer learning.
Our results are consistent with (and in fact, add to) recent hypotheses stating that robustness leads to improved feature representations.
arXiv Detail & Related papers (2020-07-16T17:42:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.