LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from
Explanation
- URL: http://arxiv.org/abs/2004.07499v1
- Date: Thu, 16 Apr 2020 07:38:07 GMT
- Title: LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from
Explanation
- Authors: Dong-Ho Lee, Rahul Khanna, Bill Yuchen Lin, Jamin Chen, Seyeon Lee,
Qinyuan Ye, Elizabeth Boschee, Leonardo Neves, Xiang Ren
- Abstract summary: LEAN-LIFE is a web-based, Label-Efficient AnnotatioN framework for sequence labeling and classification tasks.
Our framework is the first to utilize this enhanced supervision technique and does so for three important tasks.
- Score: 40.72453599376169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Successfully training a deep neural network demands a huge corpus of labeled
data. However, each label only provides limited information to learn from and
collecting the requisite number of labels involves massive human effort. In
this work, we introduce LEAN-LIFE, a web-based, Label-Efficient AnnotatioN
framework for sequence labeling and classification tasks, with an easy-to-use
UI that not only allows an annotator to provide the needed labels for a task,
but also enables LearnIng From Explanations for each labeling decision. Such
explanations enable us to generate useful additional labeled data from
unlabeled instances, bolstering the pool of available training data. On three
popular NLP tasks (named entity recognition, relation extraction, sentiment
analysis), we find that using this enhanced supervision allows our models to
surpass competitive baseline F1 scores by 5-10 percentage points, while
using 2x fewer labeled instances. Our framework is the first to
utilize this enhanced supervision technique and does so for three important
tasks -- thus providing improved annotation recommendations to users and an
ability to build datasets of (data, label, explanation) triples instead of the
regular (data, label) pairs.
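The (data, label, explanation) triples lend themselves to weak supervision: an explanation attached to one labeling decision can be compiled into a matching rule that labels otherwise-unlabeled text. Below is a deliberately minimal, hypothetical Python sketch of that idea; the dataclass and rule "compiler" are illustrative assumptions, not LEAN-LIFE's actual API.
```python
# Hypothetical sketch of explanation-driven weak labeling (not the actual
# LEAN-LIFE API): an annotator's explanation is compiled into a matching
# rule, which is then applied to unlabeled text to produce extra data.
from dataclasses import dataclass

@dataclass
class AnnotatedExample:
    text: str
    label: str
    explanation: str  # natural-language justification for the label

def compile_rule(example: AnnotatedExample):
    """Toy 'compiler': turn a quoted phrase in the explanation into a matcher."""
    start = example.explanation.find('"')
    end = example.explanation.find('"', start + 1)
    phrase = example.explanation[start + 1:end]
    return lambda text: example.label if phrase in text else None

triple = AnnotatedExample(
    text="She flew to Paris last week.",
    label="LOCATION",
    explanation='the phrase "flew to" appears before the location',
)
rule = compile_rule(triple)

unlabeled = ["He flew to Tokyo on Monday.", "The cake was delicious."]
pseudo_labeled = [(t, rule(t)) for t in unlabeled if rule(t) is not None]
print(pseudo_labeled)  # [('He flew to Tokyo on Monday.', 'LOCATION')]
```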
Related papers
- Substituting Data Annotation with Balanced Updates and Collective Loss
in Multi-label Text Classification [19.592985329023733]
Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text.
We study the MLTC problem in annotation-free and scarce-annotation settings, in which the amount of available supervision is linear in the number of labels.
Our method follows three steps: (1) mapping the input text into a set of preliminary label likelihoods via natural language inference with a pre-trained language model, (2) computing a signed label dependency graph from the label descriptions, and (3) updating the preliminary label likelihoods by message passing along the label dependency graph, as sketched below.
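As a rough illustration of the three steps, here is a toy numpy sketch; the NLI scores are stubbed with made-up numbers, and the additive update with clipping is an assumption for illustration, not the paper's exact formulation.
```python
# Minimal sketch of the three-step pipeline described above; numbers and the
# update rule are assumptions, not the paper's implementation.
import numpy as np

label_names = ["sports", "politics", "economy"]

# Step 1: preliminary label likelihoods, normally produced by an NLI model
# scoring "this text is about <label>" against the input text (stubbed here).
p = np.array([0.8, 0.1, 0.4])

# Step 2: signed label dependency graph derived from label descriptions;
# positive weights mean labels support each other, negative weights conflict.
A = np.array([
    [ 0.0, -0.6,  0.1],
    [-0.6,  0.0,  0.5],
    [ 0.1,  0.5,  0.0],
])

# Step 3: message passing -- each label's likelihood is nudged by its
# neighbors' likelihoods, then clipped back into [0, 1].
alpha = 0.3  # step size (assumed hyperparameter)
for _ in range(5):
    p = np.clip(p + alpha * A @ p, 0.0, 1.0)

print(dict(zip(label_names, p.round(2))))
```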
arXiv Detail & Related papers (2023-09-24T04:12:52Z)
- Description-Enhanced Label Embedding Contrastive Learning for Text Classification [65.01077813330559]
We incorporate Self-Supervised Learning (SSL) into the model learning process and design a novel self-supervised Relation of Relation (R2) classification task.
We propose a Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as joint optimization targets.
We exploit external knowledge from WordNet to obtain multi-aspect descriptions for label semantic learning.
arXiv Detail & Related papers (2023-06-15T02:19:34Z)
- You Only Need One Thing One Click: Self-Training for Weakly Supervised 3D Scene Understanding [107.06117227661204]
We propose "One Thing One Click," meaning that the annotator only needs to label one point per object.
We iteratively conduct the training and label propagation, facilitated by a graph propagation module.
Our model is also compatible with 3D instance segmentation when equipped with a point-clustering strategy.
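To make the one-click idea concrete, here is a hedged numpy sketch of graph-based label propagation from single-click seeds; the toy affinity graph and majority-vote update are illustrative assumptions, not the paper's graph propagation module.
```python
# Hedged sketch of label propagation from one-click seeds (not the paper's
# exact module): seed points keep their labels, and unlabeled points
# repeatedly take the majority label of their already-labeled neighbors.
import numpy as np

# Toy symmetric affinity graph over 6 points; an entry > 0 links two nodes.
W = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

labels = np.array([0, -1, -1, 1, -1, -1])  # one click per object, -1 = unknown
seeds = labels >= 0

for _ in range(10):  # iterate until propagation stabilizes
    for i in np.where(~seeds)[0]:
        neighbor_labels = labels[(W[i] > 0) & (labels >= 0)]
        if neighbor_labels.size:
            labels[i] = np.bincount(neighbor_labels).argmax()

print(labels)  # expected: [0 0 0 1 1 1] -- both clicks cover their object
```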
arXiv Detail & Related papers (2023-03-26T13:57:00Z)
- PointMatch: A Consistency Training Framework for Weakly Supervised Semantic Segmentation of 3D Point Clouds [117.77841399002666]
We propose a novel framework, PointMatch, that leverages both data and labels by applying consistency regularization to fully exploit the information in the data itself.
PointMatch achieves state-of-the-art performance under various weakly-supervised schemes on both the ScanNet-v2 and S3DIS datasets.
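As a rough illustration of consistency regularization on unlabeled point clouds, here is a small PyTorch sketch; the toy linear classifier, the augmentations, and the KL-based consistency loss are assumptions in the spirit of the idea, not the PointMatch implementation.
```python
# Rough sketch of consistency regularization (not the authors' code):
# predictions on two augmentations of the same point cloud are pushed to
# agree, so even unlabeled points provide a training signal.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(3, 4)  # toy per-point classifier: xyz -> 4 classes

points = torch.randn(128, 3)                         # one unlabeled cloud
view_a = points + 0.01 * torch.randn_like(points)    # jitter augmentation
view_b = points @ torch.tensor([[0., -1, 0],         # 90-degree z-rotation
                                [1.,  0, 0],
                                [0.,  0, 1]])

logits_a = model(view_a)
with torch.no_grad():            # the second view serves as a fixed target
    target = F.softmax(model(view_b), dim=-1)

# Consistency loss: KL divergence between the two views' predictions.
loss = F.kl_div(F.log_softmax(logits_a, dim=-1), target, reduction="batchmean")
loss.backward()                  # gradients flow without any ground-truth labels
```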
arXiv Detail & Related papers (2022-02-22T07:26:31Z)
- Meta-Learning for Multi-Label Few-Shot Classification [38.222736913855115]
This work targets the problem of multi-label meta-learning, where a model learns to predict multiple labels within a query.
We introduce a neural module that estimates the label count of a given sample by exploiting relational inference.
Overall, our thorough experiments suggest that the proposed label-propagation algorithm, in conjunction with the neural label count module (NLC), should be considered the method of choice.
arXiv Detail & Related papers (2021-10-26T08:47:48Z)
- Learning with Different Amounts of Annotation: From Zero to Many Labels [19.869498599986006]
Training NLP systems typically assumes access to annotated data with a single human label per example.
We explore new annotation distribution schemes, assigning multiple labels per example for a small subset of training examples.
Introducing such multi-label examples, at the cost of annotating fewer examples overall, brings clear gains on natural language inference and entity typing tasks.
arXiv Detail & Related papers (2021-09-09T16:48:41Z)
- TagRuler: Interactive Tool for Span-Level Data Programming by Demonstration [1.4050836886292872]
Previously, data programming was accessible only to users who knew how to program.
We build a novel tool, TagRuler, that makes it easy for annotators to build span-level labeling functions without programming.
arXiv Detail & Related papers (2021-06-24T04:49:42Z)
- One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation [78.36781565047656]
We propose "One Thing One Click," meaning that the annotator only needs to label one point per object.
We iteratively conduct the training and label propagation, facilitated by a graph propagation module.
Our results are also comparable to those of the fully supervised counterparts.
arXiv Detail & Related papers (2021-04-06T02:27:25Z)
- Adaptive Self-training for Few-shot Neural Sequence Labeling [55.43109437200101]
We develop techniques to address the label scarcity challenge for neural sequence labeling models.
Self-training serves as an effective mechanism to learn from large amounts of unlabeled data.
Meta-learning helps with adaptive sample re-weighting to mitigate error propagation from noisy pseudo-labels.
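A schematic Python sketch of the re-weighting step in such a self-training loop appears below; the confidence threshold and confidence-as-weight heuristic are stand-ins for the paper's meta-learned re-weighting, chosen only to illustrate the mechanism.
```python
# Schematic sketch of self-training with sample re-weighting (an assumed
# simplification of the paper's meta-learning step): confident pseudo-labels
# are kept and weighted by teacher confidence to damp noisy ones.
import numpy as np

rng = np.random.default_rng(0)
teacher_probs = rng.dirichlet(np.ones(5), size=10)  # teacher over 5 tags

pseudo_labels = teacher_probs.argmax(axis=1)  # hard pseudo-labels
confidence = teacher_probs.max(axis=1)

keep = confidence > 0.5           # confidence threshold (assumed)
weights = confidence[keep]        # re-weighting stands in for meta-learning

# A student model would then minimize a weighted loss over kept examples:
#   loss = sum(w_i * ce(student(x_i), pseudo_label_i)) / sum(w_i)
print(f"kept {keep.sum()} of {len(keep)} pseudo-labeled tokens")
```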
arXiv Detail & Related papers (2020-10-07T22:29:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.