Leveraging Instance Features for Label Aggregation in Programmatic Weak
Supervision
- URL: http://arxiv.org/abs/2210.02724v2
- Date: Sun, 9 Oct 2022 08:27:47 GMT
- Title: Leveraging Instance Features for Label Aggregation in Programmatic Weak
Supervision
- Authors: Jieyu Zhang, Linxin Song, Alexander Ratner
- Abstract summary: Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently.
The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources abstracted as labeling functions (LFs).
Existing statistical label models typically rely only on the outputs of LFs, ignoring the instance features when modeling the underlying generative process.
- Score: 75.1860418333995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to
synthesize training labels efficiently. The core component of PWS is the label
model, which infers true labels by aggregating the outputs of multiple noisy
supervision sources abstracted as labeling functions (LFs). Existing
statistical label models typically rely only on the outputs of LFs, ignoring the
instance features when modeling the underlying generative process. In this
paper, we attempt to incorporate the instance features into a statistical label
model via the proposed FABLE. In particular, it is built on a mixture of
Bayesian label models, each corresponding to a global pattern of correlation,
and the coefficients of the mixture components are predicted by a Gaussian
Process classifier based on instance features. We adopt an auxiliary
variable-based variational inference algorithm to tackle the non-conjugate
issue between the Gaussian Process and Bayesian label models. In an extensive
empirical comparison on eleven benchmark datasets, FABLE achieves the highest
average performance against nine baselines.
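The idea of letting instance features choose among several label models can be illustrated with a minimal sketch. This is not the FABLE implementation: the abstract describes Bayesian label models, a Gaussian Process classifier, and auxiliary-variable variational inference, whereas the sketch below assumes fixed per-component LF accuracies and swaps in a softmax over a linear score as a stand-in for the Gaussian Process. All names (`ABSTAIN`, `majority_vote`, `mixture_posterior`, `W`, `acc`) and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

ABSTAIN = -1  # assumed encoding for "this LF did not vote"


def softmax(z):
    """Row-wise softmax, shifted for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def majority_vote(L, n_classes):
    """Feature-agnostic baseline: pick the class with the most LF votes."""
    counts = np.stack([(L == c).sum(axis=1) for c in range(n_classes)], axis=1)
    return counts.argmax(axis=1)


def component_likelihood(L, acc_k, n_classes):
    """p(votes | y) for one component, assuming conditionally independent LFs.

    An LF votes for the true class with probability acc_k[j] and spreads the
    remaining mass uniformly over the other classes. Abstentions contribute a
    factor of 1, i.e. abstaining is assumed to be independent of the label.
    """
    lik = np.ones((L.shape[0], n_classes))
    voted = L != ABSTAIN
    for y in range(n_classes):
        p = np.where(L == y, acc_k, (1.0 - acc_k) / (n_classes - 1))
        lik[:, y] = np.where(voted, p, 1.0).prod(axis=1)
    return lik


def mixture_posterior(L, pi, acc, prior):
    """Label posterior under a mixture of label models.

    L     : (n, m) LF outputs in {ABSTAIN, 0, ..., C-1}
    pi    : (n, K) per-instance mixture weights, rows sum to 1
    acc   : (K, m) accuracy of each LF under each component
    prior : (C,)   class prior
    """
    post = np.zeros((L.shape[0], prior.shape[0]))
    for k in range(acc.shape[0]):
        post += pi[:, [k]] * component_likelihood(L, acc[k], prior.shape[0]) * prior
    return post / post.sum(axis=1, keepdims=True)


# Two instances with identical LF votes but different features.
L = np.array([[1, 0, 0],
              [1, 0, 0]])
X = np.array([[5.0], [-5.0]])           # one hypothetical feature per instance
W = np.array([[1.0, -1.0]])             # hypothetical gate weights (GP stand-in)
acc = np.array([[0.90, 0.55, 0.55],     # component 0 trusts LF 0
                [0.55, 0.90, 0.90]])    # component 1 trusts LFs 1 and 2
prior = np.array([0.5, 0.5])

pi = softmax(X @ W)                     # feature-dependent mixture weights
post = mixture_posterior(L, pi, acc, prior)
mv = majority_vote(L, 2)
print("majority vote:", mv)
print("mixture argmax:", post.argmax(axis=1))
```

Because both instances receive the same votes, majority vote necessarily gives them the same label. In this toy setup the feature-dependent weights route the first instance to the component that trusts LF 0 and the second to the component that trusts LFs 1 and 2, so the two posteriors differ.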
Related papers
- Adaptive Collaborative Correlation Learning-based Semi-Supervised Multi-Label Feature Selection [25.195711274756334]
We propose an Adaptive Collaborative Correlation lEarning-based Semi-Supervised Multi-label Feature Selection (Access-MFS) method to address these issues.
Specifically, a generalized regression model equipped with an extended uncorrelated constraint is introduced to select discriminative yet irrelevant features.
The instance correlation and label correlation are integrated into the proposed regression model to adaptively learn both the sample similarity graph and the label similarity graph.
arXiv Detail & Related papers (2024-06-18T01:47:38Z)
- Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification [49.09505771145326]
We propose a Hierarchical Dynamic Labeling (HDL) algorithm that does not depend on model predictions and utilizes image embeddings to generate sample labels.
Our approach has the potential to change the paradigm of pseudo-label generation in semi-supervised learning.
arXiv Detail & Related papers (2024-04-26T06:00:27Z)
- Fusing Conditional Submodular GAN and Programmatic Weak Supervision [5.300742881753571]
Programmatic Weak Supervision (PWS) and generative models serve as crucial tools to maximize the utility of existing datasets without resorting to data gathering and manual annotation processes.
PWS uses various weak supervision techniques to estimate the underlying class labels of data, while generative models primarily concentrate on sampling from the underlying distribution of the given dataset.
Recently, WSGAN proposed a mechanism to fuse these two models.
arXiv Detail & Related papers (2023-12-16T07:49:13Z)
- Deep Partial Multi-Label Learning with Graph Disambiguation [27.908565535292723]
We propose a novel deep Partial multi-Label model with grAph-disambIguatioN (PLAIN).
Specifically, we introduce the instance-level and label-level similarities to recover label confidences.
At each training epoch, labels are propagated on the instance and label graphs to produce relatively accurate pseudo-labels.
arXiv Detail & Related papers (2023-05-10T04:02:08Z)
- Ground Truth Inference for Weakly Supervised Entity Matching [76.6732856489872]
We propose a simple but powerful labeling model for weak supervision tasks.
We then tailor the labeling model specifically to the task of entity matching.
We show that our labeling model results in a 9% higher F1 score on average than the best existing method.
arXiv Detail & Related papers (2022-11-13T17:57:07Z)
- Partial sequence labeling with structured Gaussian Processes [8.239028141030621]
We propose structured Gaussian Processes for partial sequence labeling.
It encodes uncertainty in the prediction and does not need extra effort for model selection and hyperparameter learning.
It is evaluated on several sequence labeling tasks and the experimental results show the effectiveness of the proposed model.
arXiv Detail & Related papers (2022-09-20T00:56:49Z)
- Active Learning by Feature Mixing [52.16150629234465]
We propose a novel method for batch active learning called ALFA-Mix.
We identify unlabelled instances with sufficiently-distinct features by seeking inconsistencies in predictions.
We show that inconsistencies in these predictions help discover features that the model is unable to recognise in the unlabelled instances.
arXiv Detail & Related papers (2022-03-14T12:20:54Z)
- AggMatch: Aggregating Pseudo Labels for Semi-Supervised Learning [25.27527138880104]
Semi-supervised learning has proven to be an effective paradigm for leveraging a huge amount of unlabeled data.
We introduce AggMatch, which refines initial pseudo labels by using different confident instances.
We conduct experiments to demonstrate the effectiveness of AggMatch over the latest methods on standard benchmarks.
arXiv Detail & Related papers (2022-01-25T16:41:54Z)
- Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition [55.362258027878966]
We present momentum pseudo-labeling (MPL) as a simple yet effective strategy for semi-supervised speech recognition.
MPL consists of a pair of online and offline models that interact and learn from each other, inspired by the mean teacher method.
The experimental results demonstrate that MPL effectively improves over the base model and is scalable to different semi-supervised scenarios.
arXiv Detail & Related papers (2021-06-16T16:24:55Z)
- Instance-Aware Graph Convolutional Network for Multi-Label Classification [55.131166957803345]
Graph convolutional neural network (GCN) has effectively boosted the multi-label image recognition task.
We propose an instance-aware graph convolutional neural network (IA-GCN) framework for multi-label classification.
arXiv Detail & Related papers (2020-08-19T12:49:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.