More Embeddings, Better Sequence Labelers?
- URL: http://arxiv.org/abs/2009.08330v3
- Date: Wed, 2 Jun 2021 03:09:58 GMT
- Title: More Embeddings, Better Sequence Labelers?
- Authors: Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei
Huang, Kewei Tu
- Abstract summary: Recent work proposes a family of contextual embeddings that significantly improves the accuracy of sequence labelers over non-contextual embeddings.
We conduct extensive experiments on 3 tasks over 18 datasets and 8 languages to study the accuracy of sequence labeling with various embedding concatenations.
- Score: 75.44925576268052
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent work proposes a family of contextual embeddings that significantly
improves the accuracy of sequence labelers over non-contextual embeddings.
However, there is no definite conclusion on whether we can build better
sequence labelers by combining different kinds of embeddings in various
settings. In this paper, we conduct extensive experiments on 3 tasks over 18
datasets and 8 languages to study the accuracy of sequence labeling with
various embedding concatenations and make three observations: (1) concatenating
more embedding variants leads to better accuracy in rich-resource and
cross-domain settings and some conditions of low-resource settings; (2)
concatenating additional contextual sub-word embeddings with contextual
character embeddings hurts the accuracy in extremely low-resource settings; (3)
based on the conclusion of (1), concatenating additional similar contextual
embeddings cannot lead to further improvements. We hope these conclusions can
help people build stronger sequence labelers in various settings.
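Below is a minimal sketch of the kind of embedding concatenation studied in the paper, written with the Flair library's StackedEmbeddings. The dataset, embedding choices, and hyperparameters here are illustrative assumptions, not the paper's exact experimental configuration.

```python
# Minimal sketch (not the paper's exact setup): concatenate non-contextual,
# contextual character, and contextual sub-word embeddings for NER with Flair.
from flair.datasets import WNUT_17
from flair.embeddings import (
    WordEmbeddings,             # non-contextual word embeddings (e.g. GloVe)
    FlairEmbeddings,            # contextual character-level embeddings
    TransformerWordEmbeddings,  # contextual sub-word embeddings (e.g. BERT)
    StackedEmbeddings,
)
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = WNUT_17()  # an NER corpus that Flair downloads automatically
label_dict = corpus.make_label_dictionary(label_type="ner")  # Flair 0.10+ API

# The concatenation of embedding variants is just a stack of embedding layers.
embeddings = StackedEmbeddings([
    WordEmbeddings("glove"),
    FlairEmbeddings("news-forward"),
    FlairEmbeddings("news-backward"),
    TransformerWordEmbeddings("bert-base-cased"),
])

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=True,
)

ModelTrainer(tagger, corpus).train("taggers/concat-demo", max_epochs=10)
```

Dropping or adding entries in the StackedEmbeddings list is all that changes between the concatenation settings compared in the paper.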
Related papers
- Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic
Representations [102.05351905494277]
Sub-sentence encoder is a contrastively-learned contextual embedding model for fine-grained semantic representation of text.
We show that sub-sentence encoders keep the same level of inference cost and space complexity compared to sentence encoders.
arXiv Detail & Related papers (2023-11-07T20:38:30Z) - Imbalanced Multi-label Classification for Business-related Text with
Moderately Large Label Spaces [0.30458514384586394]
We evaluated four different methods for multi-label text classification using a specific imbalanced business dataset.
Fine-tuned BERT outperforms the other three methods by a significant margin, achieving high accuracy.
These findings highlight the effectiveness of fine-tuned BERT for multi-label text classification tasks and suggest that it may be a useful tool for businesses.
arXiv Detail & Related papers (2023-06-12T11:51:50Z) - Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations [91.67511167969934]
Imprecise Label Learning (ILL) is a framework that unifies learning under various imprecise label configurations.
We demonstrate that ILL can seamlessly adapt to partial label learning, semi-supervised learning, noisy label learning, and, more importantly, a mixture of these settings.
arXiv Detail & Related papers (2023-05-22T04:50:28Z) - Label Dependencies-aware Set Prediction Networks for Multi-label Text Classification [0.0]
We leverage Graph Convolutional Networks and construct an adjacency matrix based on the statistical relations between labels.
We enhance recall ability by applying the Bhattacharyya distance to the output distributions of the set prediction networks.
arXiv Detail & Related papers (2023-04-14T09:31:17Z) - Unsupervised Ranking and Aggregation of Label Descriptions for Zero-Shot
Classifiers [8.434227773463022]
In a true zero-shot setup, designing good label descriptions is challenging because no development set is available.
We look at how probabilistic models of repeated rating analysis can be used for selecting the best label descriptions in an unsupervised fashion.
arXiv Detail & Related papers (2022-04-20T14:23:09Z) - Scalable Approach for Normalizing E-commerce Text Attributes (SANTA) [0.25782420501870296]
We present SANTA, a framework to automatically normalize E-commerce attribute values.
We first perform an extensive study of nine syntactic matching algorithms.
We argue that string similarity alone is not sufficient for attribute normalization.
arXiv Detail & Related papers (2021-06-12T08:45:56Z) - Unsupervised Label Refinement Improves Dataless Text Classification [48.031421660674745]
Dataless text classification is capable of classifying documents into previously unseen labels by assigning a score to any document paired with a label description.
While promising, it crucially relies on accurate descriptions of the label set for each downstream task.
This reliance causes dataless classifiers to be highly sensitive to the choice of label descriptions and hinders the broader application of dataless classification in practice.
arXiv Detail & Related papers (2020-12-08T03:37:50Z) - Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks.
We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model (a rough sketch of such a controller appears after this list).
arXiv Detail & Related papers (2020-10-10T14:03:20Z) - A Comparative Study on Structural and Semantic Properties of Sentence
Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z) - Spying on your neighbors: Fine-grained probing of contextual embeddings
for information about surrounding words [12.394077144994617]
We introduce a suite of probing tasks that enable fine-grained testing of contextual embeddings for encoding of information about surrounding words.
We examine the popular BERT, ELMo and GPT contextual encoders and find that each of our tested information types is indeed encoded as contextual information across tokens.
We discuss the implications of these results for how different types of models break down and prioritize word-level context information when constructing token embeddings.
arXiv Detail & Related papers (2020-05-04T19:34:46Z)
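The Automated Concatenation of Embeddings (ACE) entry above describes a reinforcement-learning controller that selects which embeddings to concatenate. The following is a rough, self-contained sketch of that idea under simplifying assumptions; train_and_evaluate is a hypothetical stand-in (here a toy reward) for training a sequence labeler with the sampled embedding subset and returning its dev accuracy, and none of this is ACE's actual implementation.

```python
# Illustrative REINFORCE-style controller over candidate embeddings (assumption,
# not ACE's real code): sample a binary mask of embeddings, reward it by the
# task model's accuracy, and update the controller with a policy gradient.
import torch

candidates = ["glove", "flair-forward", "flair-backward", "bert", "elmo"]
logits = torch.zeros(len(candidates), requires_grad=True)  # one Bernoulli logit per embedding
optimizer = torch.optim.Adam([logits], lr=0.1)
baseline = 0.0  # moving-average baseline to reduce gradient variance

def train_and_evaluate(mask):
    # Hypothetical stand-in for training a sequence labeler on the selected
    # embedding concatenation and returning dev accuracy; here a toy reward
    # that favours a fixed "good" subset so the sketch runs end to end.
    good = torch.tensor([1.0, 1.0, 1.0, 1.0, 0.0])
    return float((mask == good).float().mean())

for step in range(30):
    probs = torch.sigmoid(logits)
    mask = torch.bernoulli(probs)          # sample a subset of embeddings
    reward = train_and_evaluate(mask)      # accuracy of the task model
    baseline = 0.9 * baseline + 0.1 * reward
    # Log-probability of the sampled mask, weighted by the advantage (REINFORCE)
    log_prob = (mask * probs.clamp_min(1e-8).log()
                + (1 - mask) * (1 - probs).clamp_min(1e-8).log()).sum()
    loss = -(reward - baseline) * log_prob
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```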