Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods
- URL: http://arxiv.org/abs/2002.11955v2
- Date: Wed, 15 Jul 2020 12:45:52 GMT
- Title: Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods
- Authors: Daniel Y. Fu, Mayee F. Chen, Frederic Sala, Sarah M. Hooper, Kayvon Fatahalian, Christopher Ré
- Abstract summary: Weak supervision is a popular method for building machine learning models without relying on ground truth annotations.
Existing approaches use latent variable estimation to model the noisy sources.
We show that for a class of latent variable models highly applicable to weak supervision, we can find a closed-form solution to model parameters.
We use this insight to build FlyingSquid, a weak supervision framework that runs orders of magnitude faster than previous weak supervision approaches.
- Score: 24.190587751595455
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weak supervision is a popular method for building machine learning models
without relying on ground truth annotations. Instead, it generates
probabilistic training labels by estimating the accuracies of multiple noisy
labeling sources (e.g., heuristics, crowd workers). Existing approaches use
latent variable estimation to model the noisy sources, but these methods can be
computationally expensive, scaling superlinearly in the data. In this work, we
show that, for a class of latent variable models highly applicable to weak
supervision, we can find a closed-form solution to model parameters, obviating
the need for iterative solutions like stochastic gradient descent (SGD). We use
this insight to build FlyingSquid, a weak supervision framework that runs
orders of magnitude faster than previous weak supervision approaches and
requires fewer assumptions. In particular, we prove bounds on generalization
error without assuming that the latent variable model can exactly parameterize
the underlying data distribution. Empirically, we validate FlyingSquid on
benchmark weak supervision datasets and find that it achieves the same or
higher quality compared to previous approaches without the need to tune an SGD
procedure, recovers model parameters 170 times faster on average, and enables
new video analysis and online learning applications.
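To make the closed-form claim concrete, here is a minimal sketch of the triplet idea the title refers to. It is not FlyingSquid's API. It assumes binary labels in {-1, +1}, exactly three labeling sources that are conditionally independent given the true label Y, and sources that are each better than random. Under those assumptions E[λ_i λ_j] = E[λ_i Y] · E[λ_j Y], so each source's accuracy parameter a_i = E[λ_i Y] follows from three observable pairwise moments with no iterative optimization. The function names `triplet_accuracies` and `simulate_votes` are hypothetical and exist only for this illustration.

```python
import math
import random


def triplet_accuracies(votes):
    """Estimate a_i = E[lambda_i * Y] for three sources from unlabeled votes.

    votes: list of (l1, l2, l3) tuples, each vote in {-1, +1}.
    Assumes the sources are conditionally independent given Y and
    better than random, so each a_i is positive.
    """
    n = len(votes)

    def second_moment(i, j):
        # Empirical E[lambda_i * lambda_j]; observable without any labels.
        return sum(v[i] * v[j] for v in votes) / n

    m12 = second_moment(0, 1)
    m13 = second_moment(0, 2)
    m23 = second_moment(1, 2)
    # Since E[l_i * l_j] = a_i * a_j, we get a_1^2 = m12 * m13 / m23,
    # and the other two accuracies by symmetry.
    return (
        math.sqrt(abs(m12 * m13 / m23)),
        math.sqrt(abs(m12 * m23 / m13)),
        math.sqrt(abs(m13 * m23 / m12)),
    )


def simulate_votes(accuracies, n, seed=0):
    """Draw n synthetic vote triples; accuracies are P(lambda_i == Y)."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        # Each source independently reports y with probability p, else -y.
        votes.append(tuple(y if rng.random() < p else -y for p in accuracies))
    return votes


votes = simulate_votes((0.9, 0.8, 0.7), n=200_000)
estimates = triplet_accuracies(votes)
# E[lambda_i * Y] = 2 * P(lambda_i == Y) - 1, so the targets are 0.8, 0.6, 0.4.
```

The synthetic data is generated only to check the estimator against known values. The sketch does not cover dependencies between sources or abstaining sources.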
Related papers
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
- Towards Learning Stochastic Population Models by Gradient Descent [0.0]
We show that simultaneous estimation of parameters and structure poses major challenges for optimization procedures.
We demonstrate accurate estimation of models but find that enforcing the inference of parsimonious, interpretable models drastically increases the difficulty.
arXiv Detail & Related papers (2024-04-10T14:38:58Z)
- Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences [6.067007470552307]
We propose a methodology for finding sequences of machine learning models that are stable across retraining iterations.
We develop a mixed-integer optimization formulation that is guaranteed to recover optimal models.
Our method shows stronger stability than greedily trained models with a small, controllable sacrifice in predictive power.
arXiv Detail & Related papers (2024-03-28T22:45:38Z)
- Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z)
- Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV).
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled ones.
We show that NPC-LV outperforms supervised methods on all three image classification datasets in the low-data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Generative Modeling Helps Weak Supervision (and Vice Versa) [87.62271390571837]
We propose a model fusing weak supervision and generative adversarial networks.
It captures discrete variables in the data alongside the weak supervision derived label estimate.
It is the first approach to enable data augmentation through weakly supervised synthetic images and pseudolabels.
arXiv Detail & Related papers (2022-03-22T20:24:21Z)
- End-to-End Weak Supervision [15.125993628007972]
We propose an end-to-end approach for directly learning the downstream model.
We show improved performance over prior work in terms of end model performance on downstream test sets.
arXiv Detail & Related papers (2021-07-05T19:10:11Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.