Imposing Relation Structure in Language-Model Embeddings Using
Contrastive Learning
- URL: http://arxiv.org/abs/2109.00840v2
- Date: Sat, 4 Sep 2021 11:58:52 GMT
- Title: Imposing Relation Structure in Language-Model Embeddings Using
Contrastive Learning
- Authors: Christos Theodoropoulos, James Henderson, Andrei C. Coman,
Marie-Francine Moens
- Abstract summary: We propose a novel contrastive learning framework that trains sentence embeddings to encode the relations in a graph structure.
The resulting relation-aware sentence embeddings achieve state-of-the-art results on the relation extraction task.
- Score: 30.00047118880045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Though language model text embeddings have revolutionized NLP research, their
ability to capture high-level semantic information, such as relations between
entities in text, is limited. In this paper, we propose a novel contrastive
learning framework that trains sentence embeddings to encode the relations in a
graph structure. Given a sentence (unstructured text) and its graph, we use
contrastive learning to impose relation-related structure on the token-level
representations of the sentence obtained with a CharacterBERT (El Boukkouri et
al., 2020) model. The resulting relation-aware sentence embeddings achieve
state-of-the-art results on the relation extraction task using only a simple
KNN classifier, thereby demonstrating the success of the proposed method.
Additional visualization by a t-SNE analysis shows the effectiveness of the
learned representation space compared to baselines. Furthermore, we show that
we can learn a different space for named entity recognition, again using a
contrastive learning objective, and demonstrate how to successfully combine
both representation spaces in an entity-relation task.
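As a rough illustration of the recipe described in the abstract, the sketch below shows the two moving parts: a supervised, InfoNCE-style contrastive loss that pulls together sentence embeddings sharing a relation type, and a KNN classifier over the resulting space. The loss form, pooling, and hyperparameters are generic assumptions for illustration, not the paper's exact formulation, and the encoder producing `embeddings` (CharacterBERT in the paper) is left out.

```python
import torch
import torch.nn.functional as F
from sklearn.neighbors import KNeighborsClassifier

def supervised_contrastive_loss(embeddings, relation_labels, temperature=0.1):
    """Pull together sentence embeddings that express the same relation type.

    embeddings: (batch, dim) pooled token-level representations.
    relation_labels: (batch,) integer relation-type ids.
    """
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.T / temperature                          # pairwise similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))      # drop self-pairs
    # Positives: other in-batch sentences with the same relation label.
    pos_mask = (relation_labels[:, None] == relation_labels[None, :]) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0)  # keep positives only
    loss = -pos_log_prob.sum(dim=1) / pos_mask.sum(dim=1).clamp(min=1)
    return loss[pos_mask.any(dim=1)].mean()              # skip anchors w/o positives

# Once the space is trained, relation extraction reduces to nearest neighbours:
# knn = KNeighborsClassifier(n_neighbors=5).fit(train_embs, train_labels)
# predicted_relations = knn.predict(test_embs)
```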
Related papers
- Relational Contrastive Learning and Masked Image Modeling for Scene Text Recognition [36.59116507158687]
We introduce a unified framework of Relational Contrastive Learning and Masked Image Modeling for STR (RCMSTR).
The proposed RCMSTR demonstrates superior performance in various STR-related downstream tasks, outperforming the existing state-of-the-art self-supervised STR techniques.
arXiv Detail & Related papers (2024-11-18T01:11:47Z)
- Entity-Aware Self-Attention and Contextualized GCN for Enhanced Relation Extraction in Long Sentences [5.453850739960517]
We propose a novel model, Entity-aware Self-attention Contextualized GCN (ESC-GCN), which efficiently incorporates the syntactic structure of input sentences and the semantic context of sequences.
Our model achieves encouraging performance as compared to existing dependency-based and sequence-based models.
arXiv Detail & Related papers (2024-09-15T10:50:51Z)
- Prompt-based Logical Semantics Enhancement for Implicit Discourse Relation Recognition [4.7938839332508945]
We propose a Prompt-based Logical Semantics Enhancement (PLSE) method for Implicit Discourse Relation Recognition (IDRR).
Our method seamlessly injects knowledge relevant to discourse relations into pre-trained language models through prompt-based connective prediction.
Experimental results on PDTB 2.0 and CoNLL16 datasets demonstrate that our method achieves outstanding and consistent performance against the current state-of-the-art models.
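The connective-prediction idea behind PLSE can be demonstrated in a few lines: let a masked language model guess the implicit connective joining two arguments, then map connectives to discourse senses. The prompt template, the choice of bert-base-uncased, and the connective-to-sense table below are illustrative assumptions, not the method's actual design.

```python
from transformers import pipeline

# A masked LM guesses the implicit connective between two discourse arguments.
fill = pipeline("fill-mask", model="bert-base-uncased")

arg1 = "the company missed its earnings target"
arg2 = "its stock price fell sharply"
candidates = fill(f"{arg1} ; [MASK] , {arg2} .")

# Hypothetical mapping from predicted connectives to PDTB top-level senses.
sense_of = {"because": "Contingency", "so": "Contingency",
            "however": "Comparison", "but": "Comparison",
            "then": "Temporal", "also": "Expansion"}
for c in candidates:
    connective = c["token_str"].strip()
    print(f"{connective:>10}  p={c['score']:.3f}  sense={sense_of.get(connective, '?')}")
```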
arXiv Detail & Related papers (2023-11-01T08:38:08Z)
- Dependency Induction Through the Lens of Visual Perception [81.91502968815746]
We propose an unsupervised grammar induction model that leverages word concreteness and a structural vision-based heuristic to jointly learn constituency-structure and dependency-structure grammars.
Our experiments show that the proposed extension outperforms the current state-of-the-art visually grounded models in constituency parsing even with a smaller grammar size.
arXiv Detail & Related papers (2021-09-20T18:40:37Z)
- ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA in the pre-training phase to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
- Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language [148.0843278195794]
We propose a new model architecture for learning multi-modal neuro-symbolic representations for video captioning.
Our approach uses a dictionary learning-based method of learning relations between videos and their paired text descriptions.
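Dictionary learning here can be pictured as factorizing paired video and text features through a shared set of atoms, so each pair is described by a sparse code over the same vocabulary of factors. A toy sketch with random stand-in features follows; the real model's architecture and features are out of scope.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
# Stand-ins for paired video and caption features (n pairs, d dims each).
video_feats = rng.normal(size=(200, 64))
text_feats = rng.normal(size=(200, 64))
pairs = np.hstack([video_feats, text_feats])   # joint video-text vectors

# Learn a small dictionary whose atoms couple visual and textual dimensions;
# the sparse codes then act as shared symbolic factors for each pair.
dl = DictionaryLearning(n_components=16, alpha=1.0, random_state=0)
codes = dl.fit_transform(pairs)                # (200, 16) sparse codes
print(codes.shape, (codes != 0).mean())
```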
arXiv Detail & Related papers (2020-11-18T20:21:19Z)
- Named Entity Recognition and Relation Extraction using Enhanced Table Filling by Contextualized Representations [14.614028420899409]
The proposed method computes representations for entity mentions and long-range dependencies without complicated hand-crafted features or neural-network architectures.
We also adapt a tensor dot-product to predict relation labels all at once without resorting to history-based predictions or search strategies.
Despite its simplicity, the experimental results demonstrate that the proposed method outperforms the state-of-the-art methods on the CoNLL04 and ACE05 English datasets.
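The "all at once" prediction can be pictured as one bilinear (tensor dot-product) contraction that scores every token pair under every relation label in a single step. The sketch below is a generic biaffine-style scorer, not the paper's exact parameterization.

```python
import torch

def relation_table(token_reprs, rel_tensor):
    """Score every (head, tail) token pair for every relation label at once.

    token_reprs: (n, d) contextualized token representations.
    rel_tensor:  (r, d, d) one bilinear form per relation label.
    Returns an (n, n, r) score table; argmax over r fills the relation table.
    """
    # einsum realizes the tensor dot-product: scores[i, j, k] = h_i^T W_k h_j
    return torch.einsum('id,kde,je->ijk', token_reprs, rel_tensor, token_reprs)

n, d, r = 12, 32, 5                     # toy sizes: tokens, dims, labels
h = torch.randn(n, d)
W = torch.randn(r, d, d)
table = relation_table(h, W)            # (12, 12, 5)
labels = table.argmax(dim=-1)           # one predicted label per cell
print(table.shape, labels.shape)
```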
arXiv Detail & Related papers (2020-10-15T04:58:23Z)
- Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks that learn over raw text with guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
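The entity masking idea can be illustrated in its simplest form: instead of masking random subwords, mask whole entity spans, with the spans assumed to come from linking the text against a knowledge graph. The linker itself is out of scope here, and this is only the general idea, not the paper's exact scheme.

```python
import random

MASK = "[MASK]"

def entity_mask(tokens, entity_spans, mask_prob=0.5, seed=0):
    """Mask whole entity spans rather than random subwords.

    entity_spans: (start, end) token offsets, e.g. obtained by linking
    the text against a knowledge graph upstream.
    """
    rng = random.Random(seed)
    out = list(tokens)
    for start, end in entity_spans:
        if rng.random() < mask_prob:
            out[start:end] = [MASK] * (end - start)  # mask the full span
    return out

tokens = "Marie Curie won the Nobel Prize in 1903".split()
spans = [(0, 2), (4, 6)]     # "Marie Curie", "Nobel Prize"
print(entity_mask(tokens, spans))
```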
arXiv Detail & Related papers (2020-04-29T14:22:42Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 different combinations of encoder architectures and linguistic features trained on two datasets.
We find that the bias induced by the architecture and the inclusion of linguistic features are clearly expressed in the probing task performance.
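A probing task, in general, trains a small classifier on frozen representations and reads its accuracy as evidence of how recoverable a linguistic property is. The sketch below uses random stand-in features and labels (the paper's 14 task definitions are its own), so accuracy here sits at chance.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Stand-ins: frozen encoder representations and a linguistic property label
# (e.g. entity distance bucketed into classes) for each sentence.
reprs = rng.normal(size=(1000, 128))
prop = rng.integers(0, 3, size=1000)

X_tr, X_te, y_tr, y_te = train_test_split(reprs, prop, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# High probe accuracy would suggest the property is linearly recoverable
# from the representation; chance level here since the data is random.
print("probing accuracy:", probe.score(X_te, y_te))
```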
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
- A Dependency Syntactic Knowledge Augmented Interactive Architecture for End-to-End Aspect-based Sentiment Analysis [73.74885246830611]
We propose a novel dependency syntactic knowledge augmented interactive architecture with multi-task learning for end-to-end ABSA.
This model is capable of fully exploiting the syntactic knowledge (dependency relations and types) by leveraging a well-designed Dependency Relation Embedded Graph Convolutional Network (DreGcn).
Extensive experimental results on three benchmark datasets demonstrate the effectiveness of our approach.
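The core idea of embedding dependency relations into graph convolution can be sketched as a message-passing layer whose messages are shifted by an embedding of each edge's dependency relation type. This is a generic sketch of the technique, not DreGcn's actual architecture.

```python
import torch
import torch.nn as nn

class DepRelGCNLayer(nn.Module):
    """One graph-convolution layer over dependency-relation-typed edges."""

    def __init__(self, dim, num_rel_types):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        self.rel_emb = nn.Embedding(num_rel_types, dim)

    def forward(self, h, edges, rel_types):
        # h: (n, dim) token states; edges: (m, 2) head/dependent index pairs;
        # rel_types: (m,) dependency relation ids, e.g. nsubj, dobj, ...
        msgs = self.linear(h[edges[:, 0]]) + self.rel_emb(rel_types)
        out = torch.zeros_like(h)
        out.index_add_(0, edges[:, 1], msgs)   # aggregate messages per node
        return torch.relu(out + h)             # residual connection

layer = DepRelGCNLayer(dim=16, num_rel_types=10)
h = torch.randn(5, 16)
edges = torch.tensor([[0, 1], [1, 2], [3, 2], [2, 4]])
rels = torch.tensor([0, 1, 2, 1])
print(layer(h, edges, rels).shape)   # torch.Size([5, 16])
```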
arXiv Detail & Related papers (2020-04-04T14:59:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.