Enhancing Language Representation with Constructional Information for
Natural Language Understanding
- URL: http://arxiv.org/abs/2306.02819v1
- Date: Mon, 5 Jun 2023 12:15:12 GMT
- Title: Enhancing Language Representation with Constructional Information for
Natural Language Understanding
- Authors: Lvxiaowei Xu, Jianwang Wu, Jiawei Peng, Zhilin Gong, Ming Cai,
Tianxiang Wang
- Abstract summary: We introduce construction grammar (CxG), which highlights the pairings of form and meaning.
We adopt usage-based construction grammar as the basis of our work.
A HyCxG framework is proposed to enhance language representation through a three-stage solution.
- Score: 5.945710973349298
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural language understanding (NLU) is an essential branch of natural
language processing, which relies on representations generated by pre-trained
language models (PLMs). However, PLMs primarily focus on acquiring
lexico-semantic information and may be unable to adequately handle the
meaning of constructions. To address this issue, we introduce construction
grammar (CxG), which highlights the pairings of form and meaning, to enrich
language representation. We adopt usage-based construction grammar as the basis
of our work, which is highly compatible with statistical models such as PLMs.
Then a HyCxG framework is proposed to enhance language representation through a
three-stage solution. First, all constructions are extracted from sentences via
a slot-constraints approach. Second, as constructions can overlap with each
other, bringing redundancy and imbalance, we formulate the conditional max
coverage problem to select discriminative constructions. Finally, we propose a
relational hypergraph attention network to acquire representation from
constructional information by capturing high-order word interactions among
constructions. Extensive experiments demonstrate the superiority of the
proposed model on a variety of NLU tasks.
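To make the selection stage concrete, below is a minimal sketch, in plain Python, of choosing a non-redundant subset of overlapping constructions by greedy token coverage. It is only an illustration of the coverage idea under an assumed selection budget; the `select_constructions` helper, the set-of-token-positions encoding, and the `budget` parameter are hypothetical, and the paper's actual conditional max coverage formulation and slot-constraints extraction are not reproduced here.

```python
# Hypothetical sketch: each construction is represented as the set of token
# positions it covers; we greedily pick a small subset that covers as many
# tokens as possible. This is the standard greedy approximation to max
# coverage, not the paper's exact conditional formulation.

def select_constructions(constructions: list[set[int]], budget: int) -> list[int]:
    """Return indices of up to `budget` constructions, chosen greedily to
    maximize the number of newly covered token positions."""
    covered: set[int] = set()
    chosen: list[int] = []
    remaining = set(range(len(constructions)))
    for _ in range(budget):
        # Pick the construction that adds the most still-uncovered tokens.
        best = max(remaining, key=lambda i: len(constructions[i] - covered), default=None)
        if best is None or not (constructions[best] - covered):
            break  # nothing left that adds new coverage
        chosen.append(best)
        covered |= constructions[best]
        remaining.remove(best)
    return chosen

# Example: three overlapping constructions over a 7-token sentence.
spans = [{0, 1, 2}, {2, 3}, {3, 4, 5, 6}]
print(select_constructions(spans, budget=2))  # [2, 0]: the two spans that add the most coverage
```

Greedy selection gives the classic (1 - 1/e) approximation guarantee for max coverage, which is why it is a natural baseline for this kind of redundancy-aware selection.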
Related papers
- Emergent Linguistic Structures in Neural Networks are Fragile [20.692540987792732]
Large Language Models (LLMs) have been reported to have strong performance on natural language processing tasks.
We propose a framework to assess the consistency and robustness of linguistic representations.
arXiv Detail & Related papers (2022-10-31T15:43:57Z) - Benchmarking Language Models for Code Syntax Understanding [79.11525961219591]
Pre-trained language models have demonstrated impressive performance in both natural language processing and program understanding.
In this work, we perform the first thorough benchmarking of the state-of-the-art pre-trained models for identifying the syntactic structures of programs.
Our findings point out key limitations of existing pre-training methods for programming languages, and suggest the importance of modeling code syntactic structures.
arXiv Detail & Related papers (2022-10-26T04:47:18Z) - Language-Based Causal Representation Learning [24.008923963650226]
We show that the dynamics is learned over a suitable domain-independent first-order causal language.
The preference for the most compact representation in the language that is compatible with the data provides a strong and meaningful learning bias.
While "classical AI" requires handcrafted representations, similar representations can be learned from unstructured data over the same languages.
arXiv Detail & Related papers (2022-07-12T02:07:58Z) - Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese
Pre-trained Language Models [62.41139712595334]
We propose a novel pre-training paradigm for Chinese -- Lattice-BERT.
We construct a lattice graph from the characters and words in a sentence and feed all these text units into transformers.
We show that our model can bring an average increase of 1.5% under the 12-layer setting.
arXiv Detail & Related papers (2021-04-15T02:36:49Z) - ERICA: Improving Entity and Relation Understanding for Pre-trained
Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA in the pre-training phase to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z) - Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z) - SLM: Learning a Discourse Language Representation with Sentence
Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this pre-training objective improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z) - A Generative Model for Joint Natural Language Understanding and
Generation [9.810053382574017]
We propose a generative model which couples NLU and NLG through a shared latent variable.
Our model achieves state-of-the-art performance on two dialogue datasets with both flat and tree-structured formal representations.
We also show that the model can be trained in a semi-supervised fashion by utilising unlabelled data to boost its performance.
arXiv Detail & Related papers (2020-06-12T22:38:55Z) - BURT: BERT-inspired Universal Representation from Twin Structure [89.82415322763475]
BURT (BERT inspired Universal Representation from Twin Structure) is capable of generating universal, fixed-size representations for input sequences of any granularity.
Our proposed BURT adopts a Siamese network, learning sentence-level representations from a natural language inference dataset and word/phrase-level representations from a paraphrasing dataset (see the sketch after this list).
We evaluate BURT across different granularities of text similarity tasks, including STS tasks, SemEval2013 Task 5(a) and some commonly used word similarity tasks.
arXiv Detail & Related papers (2020-04-29T04:01:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.