The Weak Supervision Landscape
- URL: http://arxiv.org/abs/2203.16282v1
- Date: Wed, 30 Mar 2022 13:19:43 GMT
- Title: The Weak Supervision Landscape
- Authors: Rafael Poyiadzi, Daniel Bacaicoa-Barber, Jesus Cid-Sueiro, Miquel
Perello-Nieto, Peter Flach, Raul Santos-Rodriguez
- Abstract summary: We propose a framework for categorising weak supervision settings.
We identify the key elements that characterise weak supervision and devise a series of dimensions that categorise most of the existing approaches.
We show how common settings in the literature fit within the framework and discuss its possible uses in practice.
- Score: 5.186945902380689
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many ways of annotating a dataset for machine learning classification tasks
that go beyond the usual class labels exist in practice. These are of interest
as they can simplify or facilitate the collection of annotations, while not
greatly affecting the resulting machine learning model. Many of these fall
under the umbrella term of weak labels or annotations. However, it is not
always clear how different alternatives are related. In this paper we propose a
framework for categorising weak supervision settings with the aim of: (1)
helping the dataset owner or annotator navigate through the available options
within weak supervision when prescribing an annotation process, and (2)
describing existing annotations for a dataset to machine learning practitioners
so that we allow them to understand the implications for the learning process.
To this end, we identify the key elements that characterise weak supervision
and devise a series of dimensions that categorise most of the existing
approaches. We show how common settings in the literature fit within the
framework and discuss its possible uses in practice.
Related papers
- Self-Supervised Visual Representation Learning with Semantic Grouping [50.14703605659837]
We tackle the problem of learning visual representations from unlabeled scene-centric data.
We propose contrastive learning from data-driven semantic slots, namely SlotCon, for joint semantic grouping and representation learning.
arXiv Detail & Related papers (2022-05-30T17:50:59Z)
- Assisted Text Annotation Using Active Learning to Achieve High Quality with Little Effort [9.379650501033465]
We propose a tool that enables researchers to create large, high-quality, annotated datasets with only a few manual annotations.
We combine an active learning (AL) approach with a pre-trained language model to semi-automatically identify annotation categories.
Our preliminary results show that employing AL strongly reduces the number of annotations for correct classification of even complex and subtle frames.
arXiv Detail & Related papers (2021-12-15T13:14:58Z)
- Learning to Detect Instance-level Salient Objects Using Complementary Image Labels [55.049347205603304]
We present the first weakly-supervised approach to the salient instance detection problem.
We propose a novel weakly-supervised network with three branches: a Saliency Detection Branch leveraging class consistency information to locate candidate objects; a Boundary Detection Branch exploiting class discrepancy information to delineate object boundaries; and a Centroid Detection Branch using subitizing information to detect salient instance centroids.
arXiv Detail & Related papers (2021-11-19T10:15:22Z)
- OPAD: An Optimized Policy-based Active Learning Framework for Document Content Analysis [6.159771892460152]
We propose OPAD, a novel framework using reinforcement policy for active learning in content detection tasks for documents.
The framework learns the acquisition function to decide the samples to be selected while optimizing performance metrics.
We show superior performance of the proposed OPAD framework for active learning for various tasks related to document understanding.
arXiv Detail & Related papers (2021-10-01T07:40:56Z)
- Annotation Curricula to Implicitly Train Non-Expert Annotators [56.67768938052715]
Voluntary studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain.
This can be overwhelming at the beginning and mentally taxing, and it can introduce errors into the resulting annotations.
We propose annotation curricula, a novel approach to implicitly train annotators.
arXiv Detail & Related papers (2021-06-04T09:48:28Z)
- Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
- OLALA: Object-Level Active Learning for Efficient Document Layout Annotation [24.453873808984415]
We propose an Object-Level Active Learning framework for efficient document layout annotation.
In this framework, only regions with the most ambiguous object predictions within an image are selected for annotators to label.
For unselected predictions, a semi-automatic correction algorithm is proposed to identify certain errors based on prior knowledge of layout structures.
arXiv Detail & Related papers (2020-10-05T03:48:07Z)
- UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)
- Structured Prediction with Partial Labelling through the Infimum Loss [85.4940853372503]
The goal of weak supervision is to enable models to learn using only forms of labelling which are cheaper to collect.
Partial labelling is a type of incomplete annotation where, for each datapoint, supervision is cast as a set of labels containing the real one.
This paper provides a unified framework based on structured prediction and on the concept of infimum loss to deal with partial labelling.
arXiv Detail & Related papers (2020-03-02T13:59:41Z)
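
The partial-labelling idea in the last entry above can be illustrated with a small sketch: when each datapoint comes with a set of candidate labels that contains the true one, an infimum-style loss scores a prediction by the smallest base loss over that set. The code below is only an illustration under assumptions; the function names `cross_entropy` and `infimum_loss`, the choice of cross-entropy as the base loss, and the example probabilities are hypothetical and not taken from the paper.

```python
import math


def cross_entropy(probs, label):
    """Base loss (an assumed choice): negative log-probability of one label."""
    return -math.log(probs[label])


def infimum_loss(probs, candidate_set):
    """Smallest base loss over the candidate labels of a partially labelled point.

    probs: predicted class probabilities, indexed by class id.
    candidate_set: non-empty set of class ids assumed to contain the true label.
    """
    if not candidate_set:
        raise ValueError("candidate_set must contain at least one label")
    return min(cross_entropy(probs, y) for y in candidate_set)


# Hypothetical prediction over three classes.
probs = [0.7, 0.2, 0.1]

# A singleton candidate set recovers ordinary supervised learning.
fully_labelled = infimum_loss(probs, {0})

# A weaker annotation: the annotator only says the label is 1 or 2.
partially_labelled = infimum_loss(probs, {1, 2})
```

With a singleton candidate set the loss reduces to the ordinary supervised loss, which is one way to see partial labelling as a relaxation of full supervision.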