Universalizing Weak Supervision
- URL: http://arxiv.org/abs/2112.03865v3
- Date: Wed, 29 Nov 2023 18:11:49 GMT
- Title: Universalizing Weak Supervision
- Authors: Changho Shin, Winfred Li, Harit Vishwakarma, Nicholas Roberts,
Frederic Sala
- Abstract summary: We propose a universal technique that enables weak supervision over any label type.
We apply this technique to important problems previously not tackled by WS frameworks including learning to rank, regression, and learning in hyperbolic space.
- Score: 18.832796698152492
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weak supervision (WS) frameworks are a popular way to bypass hand-labeling
large datasets for training data-hungry models. These approaches synthesize
multiple noisy but cheaply-acquired estimates of labels into a set of
high-quality pseudolabels for downstream training. However, the synthesis
technique is specific to a particular kind of label, such as binary labels or
sequences, and each new label type requires manually designing a new synthesis
algorithm. Instead, we propose a universal technique that enables weak
supervision over any label type while still offering desirable properties,
including practical flexibility, computational efficiency, and theoretical
guarantees. We apply this technique to important problems previously not
tackled by WS frameworks including learning to rank, regression, and learning
in hyperbolic space. Theoretically, our synthesis approach produces
consistent estimators for learning some challenging but important
generalizations of the exponential family model. Experimentally, we validate
our framework and show improvement over baselines in diverse settings including
real-world learning-to-rank and regression problems along with learning on
hyperbolic manifolds.
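As a rough sketch of the generic WS recipe this work builds on (illustrative only; the paper's actual synthesis handles arbitrary metric-space labels and comes with consistency guarantees), the snippet below aggregates noisy real-valued votes from several labeling functions into pseudolabels, weighting each source by a crude agreement-based accuracy proxy. All names here are invented for illustration.

```python
import numpy as np

def estimate_weights(votes):
    """Heuristically score each labeling function (LF) by agreement
    with the average of the others -- a crude stand-in for the
    principled accuracy estimation a WS framework would use."""
    n_lfs = votes.shape[1]
    weights = np.empty(n_lfs)
    for j in range(n_lfs):
        others = np.delete(votes, j, axis=1).mean(axis=1)
        # Inverse mean squared disagreement as an accuracy proxy.
        weights[j] = 1.0 / (np.mean((votes[:, j] - others) ** 2) + 1e-8)
    return weights / weights.sum()

def synthesize_pseudolabels(votes):
    """Combine LF outputs into one pseudolabel per example via an
    accuracy-weighted average (the regression analogue of weighted
    majority vote)."""
    return votes @ estimate_weights(votes)

# Toy demo: three noisy LFs estimate a continuous label.
rng = np.random.default_rng(0)
y_true = rng.normal(size=100)
votes = np.stack([y_true + rng.normal(scale=s, size=100)
                  for s in (0.1, 0.5, 1.0)], axis=1)
pseudo = synthesize_pseudolabels(votes)
print("MSE of weighted aggregate:", np.mean((pseudo - y_true) ** 2))
```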
Related papers
- Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical [66.57396042747706]
Complementary-label learning is a weakly supervised learning problem.
We propose a consistent approach that does not rely on the uniform distribution assumption.
We find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems.
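As a minimal sketch of that reduction (illustrative, not the authors' estimator): a complementary label marks an instance as a known negative for one class and leaves it unlabeled for every other class, giving one negative-unlabeled binary problem per class.

```python
import numpy as np

def to_negative_unlabeled(comp_labels, num_classes):
    """Map complementary labels (the class an instance is NOT in) to
    K binary negative-unlabeled (NU) problems. Returns an (n, K)
    array where -1 = known negative for class k, 0 = unlabeled.
    Purely illustrative."""
    n = len(comp_labels)
    nu = np.zeros((n, num_classes), dtype=int)
    nu[np.arange(n), comp_labels] = -1  # complementary label = known negative
    return nu

# Example: 4 instances, 3 classes; instance 0 is known NOT to be class 2.
comp = np.array([2, 0, 1, 2])
print(to_negative_unlabeled(comp, num_classes=3))
```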
arXiv Detail & Related papers (2023-11-27T02:59:17Z)
- Adaptive End-to-End Metric Learning for Zero-Shot Cross-Domain Slot Filling [2.6056468338837457]
Slot filling faces a critical challenge when handling novel domains whose samples are never seen during training.
Most prior works address this problem with a two-pass pipeline based on metric learning.
We propose a new adaptive end-to-end metric learning scheme for challenging zero-shot slot filling.
arXiv Detail & Related papers (2023-10-23T19:01:16Z)
- Robust Representation Learning for Unreliable Partial Label Learning [86.909511808373]
Partial Label Learning (PLL) is a type of weakly supervised learning where each training instance is assigned a set of candidate labels, but only one label is the ground-truth.
When the candidate set itself may be unreliable, the problem becomes Unreliable Partial Label Learning (UPLL), which introduces additional complexity due to the inherent unreliability and ambiguity of partial labels.
We propose the Unreliability-Robust Representation Learning framework (URRL), which leverages unreliability-robust contrastive learning to make the model robust to unreliable partial labels.
arXiv Detail & Related papers (2023-08-31T13:37:28Z)
- A Benchmark Generative Probabilistic Model for Weak Supervised Learning [2.0257616108612373]
Weak Supervised Learning approaches have been developed to alleviate the annotation burden.
We show that probabilistic latent variable models (PLVMs) achieve state-of-the-art performance across four datasets.
arXiv Detail & Related papers (2023-03-31T07:06:24Z)
- AutoWS: Automated Weak Supervision Framework for Text Classification [1.748907524043535]
We propose a novel framework for increasing the efficiency of the weak supervision process while decreasing dependency on domain experts.
Our method requires only a small set of labeled examples per class and automatically creates a set of labeling functions that assign noisy labels to large amounts of unlabeled data.
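One plausible way to auto-create such labeling functions, sketched below under the assumption of nearest-centroid voters (hypothetical; not necessarily AutoWS's actual mechanism):

```python
import numpy as np

def make_labeling_functions(X_seed, y_seed, num_classes, abstain_dist=2.0):
    """Build one distance-based labeling function per class from a
    handful of labeled seed examples. Each LF votes its class when an
    input is close to that class centroid and abstains (-1) otherwise.
    Purely illustrative; thresholds are arbitrary."""
    centroids = [X_seed[y_seed == k].mean(axis=0) for k in range(num_classes)]

    def make_lf(k):
        def lf(x):
            return k if np.linalg.norm(x - centroids[k]) < abstain_dist else -1
        return lf

    return [make_lf(k) for k in range(num_classes)]

# Tiny demo with 2-D features and 2 classes.
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
y = np.array([0, 0, 1, 1])
lfs = make_labeling_functions(X, y, num_classes=2)
print([lf(np.array([0.1, 0.0])) for lf in lfs])  # [0, -1]: LF 0 votes, LF 1 abstains
```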
arXiv Detail & Related papers (2023-02-07T07:12:05Z)
- Learning with Neighbor Consistency for Noisy Labels [69.83857578836769]
We present a method for learning from noisy labels that leverages similarities between training examples in feature space.
We evaluate our method on datasets exhibiting both synthetic (CIFAR-10, CIFAR-100) and realistic (mini-WebVision, Clothing1M, mini-ImageNet-Red) noise.
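Schematically (not the paper's exact objective), neighbor consistency can be scored as the divergence between each example's predicted class distribution and the average prediction of its nearest feature-space neighbors:

```python
import numpy as np

def neighbor_consistency_penalty(features, probs, k=2, eps=1e-8):
    """Average KL divergence between each example's predicted class
    distribution and the mean prediction of its k nearest feature-space
    neighbors. High values flag predictions that disagree with similar
    examples -- suspicious under label noise. Illustrative sketch."""
    dists = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    np.fill_diagonal(dists, np.inf)            # exclude self-matches
    nn_idx = np.argsort(dists, axis=1)[:, :k]  # k nearest neighbors
    neighbor_mean = probs[nn_idx].mean(axis=1)
    kl = np.sum(probs * (np.log(probs + eps) - np.log(neighbor_mean + eps)), axis=1)
    return kl.mean()

# Toy check: two tight clusters with cluster-consistent predictions.
feats = np.array([[0.0], [0.1], [5.0], [5.1]])
consistent = np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])
print(neighbor_consistency_penalty(feats, consistent, k=1))  # near 0
```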
arXiv Detail & Related papers (2022-02-04T15:46:27Z)
- Synbols: Probing Learning Algorithms with Synthetic Datasets [112.45883250213272]
Synbols is a tool for rapidly generating new datasets with a rich composition of latent features rendered in low-resolution images.
Our tool's high-level interface provides a language for rapidly generating new distributions on the latent features.
To showcase the versatility of Synbols, we use it to dissect the limitations and flaws in standard learning algorithms in various learning setups.
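To make the idea concrete, here is a generic illustration of rendering characters with randomized latent attributes into low-resolution images (this is not Synbols' actual API; the interface and attribute names are invented for the sketch):

```python
import random
from PIL import Image, ImageDraw

def render_symbol(char, size=32, rng=None):
    """Render one character into a low-resolution grayscale image with
    randomly sampled latent attributes. The returned `latents` dict is
    the ground truth behind the sample, which is what makes such
    generators useful for probing learned representations.
    (Invented interface; not Synbols' real API.)"""
    rng = rng or random.Random(0)
    latents = {
        "bg": rng.randint(0, 80),        # background gray level
        "x": rng.randint(2, size // 2),  # horizontal position
        "y": rng.randint(2, size // 2),  # vertical position
        "fg": rng.randint(150, 255),     # symbol brightness
    }
    img = Image.new("L", (size, size), color=latents["bg"])
    ImageDraw.Draw(img).text((latents["x"], latents["y"]), char, fill=latents["fg"])
    return img, latents

img, latents = render_symbol("a")
print(latents)  # the latent factors behind this sample
```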
arXiv Detail & Related papers (2020-09-14T13:03:27Z)
- Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning.
We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z)
- Pseudo-Representation Labeling Semi-Supervised Learning [0.0]
In recent years, semi-supervised learning has shown tremendous success in leveraging unlabeled data to improve the performance of deep learning models.
This work proposes pseudo-representation labeling, a simple and flexible framework that uses pseudo-labeling techniques to iteratively label small amounts of unlabeled data and add them to the training set.
Compared with existing approaches, pseudo-representation labeling is more intuitive and can effectively solve practical real-world problems.
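A bare-bones self-training loop in this spirit (a sketch assuming a scikit-learn-style classifier with fit/predict_proba; the threshold and round count are illustrative):

```python
import numpy as np

def self_train(model, X_lab, y_lab, X_unlab, rounds=3, threshold=0.95):
    """Iteratively fit on labeled data, pseudo-label the most confident
    unlabeled examples, and fold them into the training set. `model` is
    any classifier exposing fit/predict_proba. Illustrative sketch."""
    X_train, y_train = X_lab.copy(), y_lab.copy()
    remaining = X_unlab.copy()
    for _ in range(rounds):
        model.fit(X_train, y_train)
        if len(remaining) == 0:
            break
        probs = model.predict_proba(remaining)
        confident = probs.max(axis=1) >= threshold
        if not confident.any():
            break  # nothing confident enough to promote this round
        # Promote confident pseudo-labels to training data.
        X_train = np.vstack([X_train, remaining[confident]])
        y_train = np.concatenate([y_train, probs[confident].argmax(axis=1)])
        remaining = remaining[~confident]
    return model
```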
arXiv Detail & Related papers (2020-05-31T03:55:41Z)
- Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label.
Most existing methods are elaborately designed as constrained optimizations that must be solved in specific ways, making their computational complexity a bottleneck for scaling up to big data.
This paper proposes a novel framework that is flexible in the choice of both model and optimization algorithm.
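The progressive-identification idea can be sketched as an EM-style reweighting (illustrative, not the paper's exact method): restrict the model's predictions to each instance's candidate set and renormalize, so weight concentrates on the likely true label as training proceeds.

```python
import numpy as np

def update_candidate_weights(probs, candidate_mask):
    """One progressive-identification step: zero out non-candidate
    classes in the model's predicted distribution and renormalize.
    As the model improves, weight concentrates on the (hopefully true)
    label within each candidate set. Illustrative sketch.

    probs:          (n, K) current model predictions
    candidate_mask: (n, K) 1 if class k is a candidate, else 0
    """
    masked = probs * candidate_mask
    return masked / masked.sum(axis=1, keepdims=True)

# Example: instance 0 has candidates {0, 2}; the model leans toward 2.
probs = np.array([[0.2, 0.5, 0.3]])
mask = np.array([[1, 0, 1]])
print(update_candidate_weights(probs, mask))  # [[0.4, 0.0, 0.6]]
```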
arXiv Detail & Related papers (2020-02-19T08:35:15Z)