AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100
Labels
- URL: http://arxiv.org/abs/2208.14362v2
- Date: Sat, 25 Nov 2023 01:04:24 GMT
- Title: AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100
Labels
- Authors: Nicholas Roberts, Xintong Li, Tzu-Heng Huang, Dyah Adila, Spencer
Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, Aws Albarghouthi, Frederic
Sala
- Abstract summary: We introduce AutoWS-Bench-101: a framework for evaluating automated WS techniques in challenging WS settings.
We ask whether a practitioner should use an AutoWS method to generate additional labels or use some simpler baselines.
We conclude with a thorough ablation study of AutoWS methods.
- Score: 23.849748213613452
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weak supervision (WS) is a powerful method to build labeled datasets for
training supervised models in the face of little-to-no labeled data. It
replaces hand-labeling data with aggregating multiple noisy-but-cheap label
estimates expressed by labeling functions (LFs). While it has been used
successfully in many domains, weak supervision's application scope is limited
by the difficulty of constructing labeling functions for domains with complex
or high-dimensional features. To address this, a handful of methods have
proposed automating the LF design process using a small set of ground truth
labels. In this work, we introduce AutoWS-Bench-101: a framework for evaluating
automated WS (AutoWS) techniques in challenging WS settings -- a set of diverse
application domains on which it has been previously difficult or impossible to
apply traditional WS techniques. While AutoWS is a promising direction toward
expanding the application scope of WS, the emergence of powerful methods such
as zero-shot foundation models reveals the need to understand how AutoWS
techniques compare or cooperate with modern zero-shot or few-shot learners.
This informs the central question of AutoWS-Bench-101: given an initial set of
100 labels for each task, we ask whether a practitioner should use an AutoWS
method to generate additional labels or use some simpler baseline, such as
zero-shot predictions from a foundation model or supervised learning. We
observe that in many settings, it is necessary for AutoWS methods to
incorporate signal from foundation models if they are to outperform simple
few-shot baselines, and AutoWS-Bench-101 promotes future research in this
direction. We conclude with a thorough ablation study of AutoWS methods.
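To make the central comparison concrete, the sketch below contrasts the two options the abstract describes on a 100-label budget: (a) supervised few-shot learning on the labels alone, versus (b) an AutoWS-style pipeline that fits many weak labeling functions (LFs) to those labels, aggregates their noisy votes on an unlabeled pool, and trains on the resulting pseudo-labels. Everything here is an illustrative assumption rather than the benchmark's actual pipeline: the LFs are shallow decision trees on random feature subsets, aggregation is a plain majority vote, and scikit-learn's digits dataset stands in for the benchmark tasks (where, per the abstract, competitive AutoWS methods would often build LFs on foundation-model features rather than raw pixels).

```python
# Minimal, hypothetical sketch of the 100-label comparison;
# not the AutoWS-Bench-101 pipeline itself.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

ABSTAIN = -1
X, y = load_digits(return_X_y=True)

# 100 gold labels; the rest becomes an unlabeled pool and a test set.
X_lab, X_rest, y_lab, y_rest = train_test_split(
    X, y, train_size=100, stratify=y, random_state=0)
X_unlab, X_test, _, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

# (a) Few-shot baseline: supervised learning on the 100 labels alone.
baseline = LogisticRegression(max_iter=2000).fit(X_lab, y_lab)

# (b) AutoWS-style: auto-generate weak LFs from the 100 labels. Each LF
# is a shallow tree on a random feature subset that abstains when
# unconfident (a stand-in for real LF synthesis).
rng = np.random.default_rng(0)

def make_lf(X_tr, y_tr, n_feats=16, conf=0.7):
    feats = rng.choice(X_tr.shape[1], size=n_feats, replace=False)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X_tr[:, feats], y_tr)
    def lf(X_new):
        probs = tree.predict_proba(X_new[:, feats])
        preds = tree.classes_[probs.argmax(axis=1)]
        return np.where(probs.max(axis=1) >= conf, preds, ABSTAIN)
    return lf

lfs = [make_lf(X_lab, y_lab) for _ in range(30)]
votes = np.stack([lf(X_unlab) for lf in lfs], axis=1)  # (n, 30)

def majority_vote(row):
    row = row[row != ABSTAIN]  # drop abstentions before voting
    return np.bincount(row).argmax() if row.size else ABSTAIN

pseudo = np.apply_along_axis(majority_vote, 1, votes)
keep = pseudo != ABSTAIN

# Train on the 100 gold labels plus the pseudo-labeled pool, then compare.
X_aug = np.vstack([X_lab, X_unlab[keep]])
y_aug = np.concatenate([y_lab, pseudo[keep]])
autows = LogisticRegression(max_iter=2000).fit(X_aug, y_aug)

print("few-shot baseline acc:", round(baseline.score(X_test, y_test), 3))
print("AutoWS-style acc     :", round(autows.score(X_test, y_test), 3))
```

The regimes the benchmark highlights are exactly those where option (b) only wins once the LFs draw on foundation-model signal, which this toy setup does not model.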
Related papers
- Stronger Than You Think: Benchmarking Weak Supervision on Realistic Tasks [19.49705185032905]
Weak supervision (WS) is a popular approach for label-efficient learning, leveraging diverse sources of noisy but inexpensive weak labels to automatically annotate training data.
Despite its wide usage, WS and its practical value are challenging to benchmark due to the many knobs in its setup.
We introduce a new benchmark, BOXWRENCH, designed to more accurately reflect real-world usages of WS.
arXiv Detail & Related papers (2025-01-13T22:29:31Z)
- Keypoint Abstraction using Large Models for Object-Relative Imitation Learning [78.92043196054071]
Generalization to novel object configurations and instances across diverse tasks and environments is a critical challenge in robotics.
Keypoint-based representations have proven effective as a succinct way of capturing essential object features.
We propose KALM, a framework that leverages large pre-trained vision-language models to automatically generate task-relevant and cross-instance consistent keypoints.
arXiv Detail & Related papers (2024-10-30T17:37:31Z)
- AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning [54.47116888545878]
AutoAct is an automatic agent learning framework for QA.
It relies neither on large-scale annotated data nor on synthetic planning trajectories from closed-source models.
arXiv Detail & Related papers (2024-01-10T16:57:24Z)
- Universal Self-Adaptive Prompting [60.67460565566514]
Universal Self-Adaptive Prompting (USP) is an automatic prompt design approach specifically tailored for zero-shot learning.
USP is highly versatile: to achieve universal prompting, it categorizes a given NLP task into one of three possible task types.
We evaluate USP with PaLM and PaLM 2 models and demonstrate performances that are considerably stronger than standard zero-shot baselines.
arXiv Detail & Related papers (2023-05-24T09:09:48Z)
- Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering [52.09178018466104]
We introduce Context-Aware Automated Feature Engineering (CAAFE) to generate semantically meaningful features for datasets.
Despite being methodologically simple, CAAFE improves performance on 11 out of 14 datasets.
We highlight the significance of context-aware solutions that can extend the scope of AutoML systems to semantic AutoML.
arXiv Detail & Related papers (2023-05-05T09:58:40Z)
- AutoTransfer: AutoML with Knowledge Transfer -- An Application to Graph Neural Networks [75.11008617118908]
AutoML techniques consider each task independently from scratch, leading to high computational cost.
Here we propose AutoTransfer, an AutoML solution that improves search efficiency by transferring the prior architectural design knowledge to the novel task of interest.
arXiv Detail & Related papers (2023-03-14T07:23:16Z)
- AutoWS: Automated Weak Supervision Framework for Text Classification [1.748907524043535]
We propose a novel framework for increasing the efficiency of the weak supervision process while decreasing its dependency on domain experts.
Our method requires a small set of labeled examples per label class and automatically creates a set of labeling functions to assign noisy labels to numerous unlabeled data.
arXiv Detail & Related papers (2023-02-07T07:12:05Z)
- Automatic Synthesis of Diverse Weak Supervision Sources for Behavior Analysis [37.077883083886114]
AutoSWAP is a framework for automatically synthesizing data-efficient task-level labeling functions.
We show that AutoSWAP is an effective way to automatically generate labeling functions that can significantly reduce expert effort for behavior analysis.
arXiv Detail & Related papers (2021-11-30T07:51:12Z)
- Adaptive Self-training for Few-shot Neural Sequence Labeling [55.43109437200101]
We develop techniques to address the label scarcity challenge for neural sequence labeling models.
Self-training serves as an effective mechanism to learn from large amounts of unlabeled data.
Meta-learning enables adaptive sample re-weighting to mitigate error propagation from noisy pseudo-labels.
arXiv Detail & Related papers (2020-10-07T22:29:05Z)
- Named Entity Recognition without Labelled Data: A Weak Supervision Approach [23.05371427663683]
This paper presents a simple but powerful approach to learn NER models in the absence of labelled data through weak supervision.
The approach relies on a broad spectrum of labelling functions to automatically annotate texts from the target domain.
A sequence labelling model can finally be trained on the basis of this unified annotation (a minimal token-voting sketch of this pipeline follows this list).
arXiv Detail & Related papers (2020-04-30T12:29:55Z)
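
As a companion to the last entry above, here is a minimal, self-contained sketch of that weak-supervision recipe for NER: a handful of labelling functions vote on each token, the votes are unified (by plain majority vote here; the paper's own aggregation model is more principled), and the unified tags could then train any sequence labeller. All function names, tag sets, and heuristics below are illustrative assumptions, not the paper's actual labelling functions.

```python
# Hypothetical token-level labelling functions (LFs) for NER-style weak
# supervision: each maps a token to a tag or abstains (returns None).
from collections import Counter

ABSTAIN = None

def lf_capitalized(tok):                        # crude proper-noun heuristic
    return "ENT" if tok[:1].isupper() else ABSTAIN

def lf_gazetteer(tok, names={"Paris", "Alice"}):  # tiny toy name list
    return "ENT" if tok in names else "O"

def lf_stopword(tok, stop={"the", "a", "in", "met"}):
    return "O" if tok.lower() in stop else ABSTAIN

LFS = [lf_capitalized, lf_gazetteer, lf_stopword]

def unify(tokens):
    """Aggregate per-token LF votes into a single unified annotation.

    Majority vote over non-abstaining LFs; ties break toward the first
    vote cast, and tokens with no votes default to "O".
    """
    tags = []
    for tok in tokens:
        votes = [v for lf in LFS if (v := lf(tok)) is not ABSTAIN]
        tags.append(Counter(votes).most_common(1)[0][0] if votes else "O")
    return tags

tokens = "Alice met Bob in Paris".split()
print(list(zip(tokens, unify(tokens))))
# [('Alice', 'ENT'), ('met', 'O'), ('Bob', 'ENT'), ('in', 'O'), ('Paris', 'ENT')]
```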