InSRL: A Multi-view Learning Framework Fusing Multiple Information
Sources for Distantly-supervised Relation Extraction
- URL: http://arxiv.org/abs/2012.09370v1
- Date: Thu, 17 Dec 2020 02:49:46 GMT
- Title: InSRL: A Multi-view Learning Framework Fusing Multiple Information
Sources for Distantly-supervised Relation Extraction
- Authors: Zhendong Chu, Haiyun Jiang, Yanghua Xiao, Wei Wang
- Abstract summary: We introduce two widely available sources in knowledge
bases, namely entity descriptions and multi-grained entity types.
An end-to-end multi-view learning framework is proposed for relation
extraction via Intact Space Representation Learning (InSRL).
- Score: 19.176183245280267
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distant supervision makes it possible to automatically label bags
of sentences for relation extraction by leveraging knowledge bases, but it
suffers from sparse and noisy bags. Additional information sources are needed
to supplement the training data and overcome these issues. In this paper, we
introduce two widely available sources in knowledge bases, namely entity
descriptions and multi-grained entity types, to enrich the distantly
supervised data. We treat these information sources as multiple views and fuse
them to construct an intact space with sufficient information. An end-to-end
multi-view learning framework is proposed for relation extraction via Intact
Space Representation Learning (InSRL), in which the representations of the
single views are learned jointly. Moreover, inner-view and cross-view
attention mechanisms highlight important information at different levels on a
per-entity-pair basis. Experimental results on a popular benchmark dataset
demonstrate the necessity of the additional information sources and the
effectiveness of our framework. We will release the implementation of our
model and the dataset with multiple information sources after the anonymous
review phase.
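
The abstract names the architecture but gives no code, so the following is a
minimal sketch of the multi-view fusion idea under stated assumptions: each
information source (the sentence bag, entity descriptions, entity types) is
encoded as one view, and a learned attention over views builds a fused
"intact" entity-pair representation for relation classification. The module
names, dimensions, single-query attention, and relation count below are all
illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of InSRL-style multi-view fusion; not the authors' code.
import torch
import torch.nn as nn


class InSRLSketch(nn.Module):
    def __init__(self, view_dims, hidden_dim, num_relations):
        super().__init__()
        # One projection per view maps heterogeneous sources into a shared space.
        self.view_proj = nn.ModuleList(
            [nn.Linear(d, hidden_dim) for d in view_dims]
        )
        # A learned query scores each view's usefulness for the current entity
        # pair (a simplified stand-in for the paper's cross-view attention).
        self.query = nn.Parameter(torch.randn(hidden_dim))
        self.classifier = nn.Linear(hidden_dim, num_relations)

    def forward(self, views):
        # views: list of tensors, one per source, each of shape (batch, view_dim)
        projected = torch.stack(
            [proj(v) for proj, v in zip(self.view_proj, views)], dim=1
        )  # (batch, num_views, hidden_dim)
        weights = torch.softmax(projected @ self.query, dim=1)  # attend over views
        intact = (weights.unsqueeze(-1) * projected).sum(dim=1)  # fused representation
        return self.classifier(intact)  # relation logits


# Usage with three assumed views: sentence bag, entity descriptions, entity types.
model = InSRLSketch(view_dims=[768, 768, 128], hidden_dim=256, num_relations=53)
logits = model([torch.randn(4, 768), torch.randn(4, 768), torch.randn(4, 128)])
```

Attending over whole views (rather than concatenating them) lets the model
down-weight a missing or noisy source per entity pair, which is the intuition
behind fusing views into an intact space in the first place.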
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- Learning Representations without Compositional Assumptions [79.12273403390311]
We propose a data-driven approach that learns feature set dependencies by representing feature sets as graph nodes and their relationships as learnable edges.
We also introduce LEGATO, a novel hierarchical graph autoencoder that learns a smaller, latent graph to aggregate information from multiple views dynamically.
arXiv Detail & Related papers (2023-05-31T10:36:10Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Cross-view Graph Contrastive Representation Learning on Partially Aligned Multi-view Data [52.491074276133325]
Multi-view representation learning has developed rapidly over the past decades and has been applied in many fields.
We propose a new cross-view graph contrastive learning framework, which integrates multi-view information to align data and learn latent representations.
Experiments conducted on several real datasets demonstrate the effectiveness of the proposed method on the clustering and classification tasks.
arXiv Detail & Related papers (2022-11-08T09:19:32Z)
- Dual Representation Learning for One-Step Clustering of Multi-View Data [30.131568561100817]
We propose a novel one-step multi-view clustering method that exploits a dual representation of both the common and the view-specific information of different views.
Within this framework, the representation learning and the clustering partition mutually benefit each other, effectively improving clustering performance.
arXiv Detail & Related papers (2022-08-30T14:20:26Z)
- Self-Supervised Information Bottleneck for Deep Multi-View Subspace Clustering [29.27475285925792]
We establish a new framework called Self-supervised Information Bottleneck based Multi-view Subspace Clustering (SIB-MSC).
Inheriting the advantages of the information bottleneck, SIB-MSC can learn a latent space for each view that captures the information common to the latent representations of the different views.
Our method achieves superior performance over the related state-of-the-art methods.
arXiv Detail & Related papers (2022-04-26T15:49:59Z)
- Deep Transfer Learning for Multi-source Entity Linkage via Domain Adaptation [63.24594955429465]
Multi-source entity linkage is critical in high-impact applications such as data cleaning and user stitching.
AdaMEL is a deep transfer learning framework that learns generic high-level knowledge to perform multi-source entity linkage.
Our framework achieves state-of-the-art results with 8.21% improvement on average over methods based on supervised learning.
arXiv Detail & Related papers (2021-10-27T15:20:41Z)
- Learning Robust Representations via Multi-View Information Bottleneck [41.65544605954621]
The original formulation requires labeled data to identify superfluous information.
We extend this ability to the multi-view unsupervised setting, where two views of the same underlying entity are provided but the label is unknown.
A theoretical analysis leads to the definition of a new multi-view model that produces state-of-the-art results on the Sketchy dataset and label-limited versions of the MIR-Flickr dataset.
arXiv Detail & Related papers (2020-02-17T16:01:52Z)
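
The last entry above learns from two views of the same entity without labels.
To make the shape of such an objective concrete, here is a rough sketch under
stated assumptions: the paper's actual formulation uses stochastic encoders
and a divergence between view posteriors, while this simplified deterministic
stand-in (the function name mib_style_loss is hypothetical) only combines an
InfoNCE-style mutual-information bound with an agreement penalty.

```python
# Loose illustration of a multi-view information-bottleneck style objective;
# not the released code of any paper listed here.
import torch
import torch.nn.functional as F


def mib_style_loss(z1, z2, temperature=0.1, beta=1.0):
    """z1, z2: (batch, dim) representations of two views of the same items."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    # InfoNCE-style lower bound on mutual information between the views:
    # matching pairs (i, i) should score higher than mismatched pairs.
    logits = z1 @ z2.t() / temperature          # (batch, batch) similarity
    targets = torch.arange(z1.size(0))          # diagonal entries are positives
    mi_term = F.cross_entropy(logits, targets)
    # Agreement penalty standing in for the divergence between the two view
    # posteriors, which suppresses superfluous view-specific detail.
    consistency = F.mse_loss(z1, z2)
    return mi_term + beta * consistency
```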