Unified BERT for Few-shot Natural Language Understanding
- URL: http://arxiv.org/abs/2206.12094v1
- Date: Fri, 24 Jun 2022 06:10:53 GMT
- Title: Unified BERT for Few-shot Natural Language Understanding
- Authors: JunYu Lu, Ping Yang, JiaXing Zhang, RuYi Gan, Jing Yang
- Abstract summary: We propose UBERT, a unified bidirectional language understanding model based on the BERT framework.
UBERT encodes prior knowledge from various aspects, uniformly constructing learning representations across multiple NLU tasks.
Experiments show that UBERT achieves state-of-the-art performance on 7 NLU tasks and 14 datasets in few-shot and zero-shot settings.
- Score: 7.352338840651369
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Even though pre-trained language models share a semantic encoder, natural
language understanding suffers from a diversity of output schemas. In this
paper, we propose UBERT, a unified bidirectional language understanding model
based on the BERT framework, which can universally model the training
objectives of different NLU tasks through a biaffine network. Specifically,
UBERT encodes prior knowledge from various aspects and uniformly constructs
learning representations across multiple NLU tasks, which helps it capture
common semantic understanding. By using the biaffine network to score pairs of
start and end positions in the original text, various classification and
extraction structures can be converted into a universal span-decoding
approach. Experiments show that UBERT achieves state-of-the-art performance on
7 NLU tasks and 14 datasets in few-shot and zero-shot settings, and unifies a
broad range of information extraction and linguistic reasoning tasks.
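To make the span-decoding idea concrete, the sketch below scores every (start, end) token pair with a biaffine layer on top of BERT-style token representations. It is a minimal illustration under stated assumptions: the class name BiaffineSpanScorer, the projection size, and the exact scoring form are choices made here, not the released UBERT implementation.

```python
import torch
import torch.nn as nn

class BiaffineSpanScorer(nn.Module):
    """Minimal sketch: biaffine scoring of all (start, end) token pairs.

    Hyperparameters and the scoring form are illustrative assumptions,
    not the paper's released configuration.
    """

    def __init__(self, hidden_size: int = 768, span_dim: int = 256):
        super().__init__()
        # Separate projections for candidate start and end tokens.
        self.start_proj = nn.Sequential(nn.Linear(hidden_size, span_dim), nn.GELU())
        self.end_proj = nn.Sequential(nn.Linear(hidden_size, span_dim), nn.GELU())
        # Bilinear weight; the appended constant feature folds in linear and bias terms.
        self.bilinear = nn.Parameter(torch.empty(span_dim + 1, span_dim + 1))
        nn.init.xavier_uniform_(self.bilinear)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden_size), e.g. BERT's last hidden states.
        ones = token_states.new_ones(token_states.shape[:2] + (1,))
        h_s = torch.cat([self.start_proj(token_states), ones], dim=-1)  # (B, L, d+1)
        h_e = torch.cat([self.end_proj(token_states), ones], dim=-1)    # (B, L, d+1)
        # score[b, i, j] = h_s[b, i]^T  U  h_e[b, j]
        return torch.einsum("bid,de,bje->bij", h_s, self.bilinear, h_e)  # (B, L, L)
```

Under this framing, each task reads its answers off the same (start, end) score matrix, which is one plausible way classification and extraction outputs collapse into a single span-decoding interface; this is an interpretation of the abstract, not a description of the official code.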
Related papers
- Incorporating Lexical and Syntactic Knowledge for Unsupervised Cross-Lingual Transfer [4.944761231728674]
We present a novel framework called "Lexicon-Syntax Enhanced Multilingual BERT".
We use Multilingual BERT as the base model and employ two techniques to enhance its learning capabilities.
Our experimental results demonstrate this framework can consistently outperform all baselines of zero-shot cross-lingual transfer.
arXiv Detail & Related papers (2024-04-25T14:10:52Z)
- UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions [64.50935101415776]
We build a single model that jointly performs various spoken language understanding (SLU) tasks.
We demonstrate the efficacy of our single multi-task learning model "UniverSLU" for 12 speech classification and sequence generation task types spanning 17 datasets and 9 languages.
arXiv Detail & Related papers (2023-10-04T17:10:23Z)
- Entity Aware Syntax Tree Based Data Augmentation for Natural Language Understanding [5.02493891738617]
We propose a novel NLP data augmentation technique that applies a tree structure, the Entity Aware Syntax Tree (EAST), to represent sentences combined with attention on the entity.
Our EADA technique automatically constructs an EAST from a small amount of annotated data, and then generates a large number of training instances for intent detection and slot filling.
Experimental results on four datasets showed that the proposed technique significantly outperforms the existing data augmentation methods in terms of both accuracy and generalization ability.
arXiv Detail & Related papers (2022-09-06T07:34:10Z)
- Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training [120.91411454661741]
We present a pre-trainable Universal-DEcoder Network (Uni-EDEN) to facilitate both vision-language perception and generation.
Uni-EDEN is a two-stream Transformer-based structure consisting of three modules, including object and sentence encoders that separately learn the representations of each modality.
arXiv Detail & Related papers (2022-01-11T16:15:07Z)
- Pre-training Language Model Incorporating Domain-specific Heterogeneous Knowledge into A Unified Representation [49.89831914386982]
We propose a unified pre-trained language model (PLM) for all forms of text, including unstructured text, semi-structured text, and well-structured text.
Our approach outperforms plain-text pre-training while using only 1/4 of the data.
arXiv Detail & Related papers (2021-09-02T16:05:24Z)
- On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations [11.558645364193486]
In this paper, we analyze the evolution of the embedded syntax trees along the fine-tuning process of BERT for six different tasks.
Experimental results show that the encoded information is forgotten (PoS tagging), reinforced (dependency and constituency parsing) or preserved (semantics-related tasks) in different ways along the fine-tuning process depending on the task.
arXiv Detail & Related papers (2021-01-27T15:41:09Z)
- ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA in pre-training phase to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
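As background on the general recipe named here, the snippet below sketches a generic InfoNCE-style contrastive loss over entity representations. It only illustrates the contrastive-learning idea; the pairing scheme, temperature, and the function entity_contrastive_loss are assumptions, not ERICA's actual objective.

```python
import torch
import torch.nn.functional as F

def entity_contrastive_loss(anchor: torch.Tensor,
                            positive: torch.Tensor,
                            negatives: torch.Tensor,
                            temperature: float = 0.07) -> torch.Tensor:
    """Generic InfoNCE-style loss over entity embeddings (illustrative only).

    anchor:    (B, H)    entity embeddings from one view of the document
    positive:  (B, H)    the same entities encoded from another view
    negatives: (B, K, H) other entities serving as negatives
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    pos_logits = (anchor * positive).sum(-1, keepdim=True) / temperature      # (B, 1)
    neg_logits = torch.einsum("bh,bkh->bk", anchor, negatives) / temperature  # (B, K)
    logits = torch.cat([pos_logits, neg_logits], dim=1)                       # (B, 1+K)
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)  # the positive always sits at index 0
```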
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
- CoLAKE: Contextualized Language and Knowledge Embedding [81.90416952762803]
We propose the Contextualized Language and Knowledge Embedding (CoLAKE).
CoLAKE jointly learns contextualized representations for both language and knowledge with an extended training objective.
We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks.
arXiv Detail & Related papers (2020-10-01T11:39:32Z)
- BURT: BERT-inspired Universal Representation from Twin Structure [89.82415322763475]
BURT (BERT inspired Universal Representation from Twin Structure) is capable of generating universal, fixed-size representations for input sequences of any granularity.
Our proposed BURT adopts a Siamese network, learning sentence-level representations from a natural language inference dataset and word/phrase-level representations from a paraphrasing dataset.
We evaluate BURT across different granularities of text similarity tasks, including STS tasks, SemEval2013 Task 5(a) and some commonly used word similarity tasks.
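To illustrate the twin (Siamese) structure described above, here is a minimal sketch of a shared BERT encoder that mean-pools token states into fixed-size vectors and compares two inputs with cosine similarity. The pooling choice, the bert-base-uncased checkpoint, and the TwinEncoder class are illustrative assumptions, not BURT's exact training setup; it relies on the Hugging Face transformers library.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

class TwinEncoder(torch.nn.Module):
    """Sketch of a Siamese (twin) encoder producing fixed-size sequence embeddings."""

    def __init__(self, model_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)  # shared weights for both branches

    def embed(self, encoded) -> torch.Tensor:
        hidden = self.encoder(**encoded).last_hidden_state            # (B, L, H)
        mask = encoded["attention_mask"].unsqueeze(-1).float()        # (B, L, 1)
        return (hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-9)   # masked mean pooling

    def forward(self, left, right) -> torch.Tensor:
        return F.cosine_similarity(self.embed(left), self.embed(right))

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = TwinEncoder()
a = tokenizer(["a unified model for NLU"], return_tensors="pt", padding=True)
b = tokenizer(["one model for many language understanding tasks"], return_tensors="pt", padding=True)
print(model(a, b))  # similarity score in [-1, 1]
```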
arXiv Detail & Related papers (2020-04-29T04:01:52Z)
- What the [MASK]? Making Sense of Language-Specific BERT Models [39.54532211263058]
This paper presents the current state of the art in language-specific BERT models.
Our aim is to provide an overview of the commonalities and differences between language-specific BERT models and mBERT.
arXiv Detail & Related papers (2020-03-05T20:42:51Z)