Transferring Semantic Knowledge Into Language Encoders
- URL: http://arxiv.org/abs/2110.07382v1
- Date: Thu, 14 Oct 2021 14:11:12 GMT
- Title: Transferring Semantic Knowledge Into Language Encoders
- Authors: Mohammad Umair, Francis Ferraro
- Abstract summary: We introduce semantic form mid-tuning, an approach for transferring semantic knowledge from semantic meaning representations into language encoders.
We show that this alignment can be learned implicitly via classification or directly via triplet loss.
Our method yields language encoders that demonstrate improved predictive performance across inference, reading comprehension, textual similarity, and other semantic tasks.
- Score: 6.85316573653194
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce semantic form mid-tuning, an approach for transferring semantic
knowledge from semantic meaning representations into transformer-based language
encoders. In mid-tuning, we learn to align the text of general sentences -- not
tied to any particular inference task -- and structured semantic
representations of those sentences. Our approach does not require gold
annotated semantic representations. Instead, it makes use of automatically
generated semantic representations, such as from off-the-shelf PropBank and
FrameNet semantic parsers. We show that this alignment can be learned
implicitly via classification or directly via triplet loss. Our method yields
language encoders that demonstrate improved predictive performance across
inference, reading comprehension, textual similarity, and other semantic tasks
drawn from the GLUE, SuperGLUE, and SentEval benchmarks. We evaluate our
approach on three popular baseline models. Our experimental results and
analysis indicate that current pre-trained language models can further benefit
from structured semantic frames through the proposed mid-tuning method: the
frames inject additional task-agnostic knowledge into the encoder, improving the
generated embeddings as well as the linguistic properties of the model,
as evidenced by improvements on a popular sentence embedding toolkit and a
variety of probing tasks.
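The abstract says the text-to-semantics alignment can be learned directly via triplet loss but gives no formula. The following is a minimal sketch of a generic triplet margin loss, not the authors' implementation. It assumes Euclidean distance, a margin of 1.0, and toy two-dimensional vectors standing in for encoder outputs; the names `euclidean`, `triplet_loss`, `sentence`, `matching_frame`, and `other_frame` are hypothetical. The anchor is a sentence embedding, the positive is the embedding of that sentence's semantic representation, and the negative is the embedding of some other sentence's representation.

```python
import math
from typing import Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed value, not taken from the paper
) -> float:
    """Generic triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy vectors standing in for encoder outputs (illustrative only).
sentence = [1.0, 0.0]        # anchor: embedding of the sentence text
matching_frame = [1.0, 0.0]  # positive: embedding of its semantic frame
other_frame = [4.0, 4.0]     # negative: embedding of an unrelated frame

well_aligned = triplet_loss(sentence, matching_frame, other_frame)
misaligned = triplet_loss(sentence, other_frame, matching_frame)
print(well_aligned, misaligned)
```

The loss is zero once the matching frame is closer to the sentence than the unrelated frame by at least the margin, and it grows as the ordering is violated, which is what pushes text and semantic representations toward each other during training.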
Related papers
- Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness [3.2925222641796554]
"pointer-guided segment ordering" (SO) is a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations.
Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures.
arXiv Detail & Related papers (2024-06-06T15:17:51Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Towards Computationally Verifiable Semantic Grounding for Language Models [18.887697890538455]
The paper conceptualizes the LM as a conditional model generating text given a desired semantic message formalized as a set of entity-relationship triples.
It embeds the LM in an auto-encoder by feeding its output to a semantic parser whose output is in the same representation domain as the input message.
We show that our proposed approaches significantly improve on the greedy search baseline.
arXiv Detail & Related papers (2022-11-16T17:35:52Z)
- Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to incorporate the current pretrained language models with a hierarchical decoder network.
By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks.
We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z)
- Multilingual Extraction and Categorization of Lexical Collocations with Graph-aware Transformers [86.64972552583941]
We put forward a sequence tagging BERT-based model enhanced with a graph-aware transformer architecture, which we evaluate on the task of collocation recognition in context.
Our results suggest that explicitly encoding syntactic dependencies in the model architecture is helpful, and provide insights on differences in collocation typification in English, Spanish and French.
arXiv Detail & Related papers (2022-05-23T16:47:37Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
- Semantics-Aware Inferential Network for Natural Language Understanding [79.70497178043368]
We propose a Semantics-Aware Inferential Network (SAIN).
Taking explicit contextualized semantics as a complementary input, the inferential module of SAIN enables a series of reasoning steps over semantic clues.
Our model achieves significant improvement on 11 tasks including machine reading comprehension and natural language inference.
arXiv Detail & Related papers (2020-04-28T07:24:43Z)