Siamese Graph Neural Networks for Data Integration
- URL: http://arxiv.org/abs/2001.06543v1
- Date: Fri, 17 Jan 2020 21:51:55 GMT
- Title: Siamese Graph Neural Networks for Data Integration
- Authors: Evgeny Krivosheev, Mattia Atzeni, Katsiaryna Mirylenka, Paolo Scotton,
Fabio Casati
- Abstract summary: We propose a general approach to modeling and integrating entities from structured data, such as relational databases, as well as unstructured sources, such as free text from news articles.
Our approach is designed to explicitly model and leverage relations between entities, thereby using all available information and preserving as much context as possible.
We evaluate our method on the task of integrating data about business entities, and we demonstrate that it outperforms standard rule-based systems, as well as other deep learning approaches that do not use graph-based representations.
- Score: 11.41207739004894
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data integration has been studied extensively for decades and approached from
different angles. However, this domain still remains largely rule-driven and
lacks universal automation. Recent development in machine learning and in
particular deep learning has opened the way to more general and more efficient
solutions to data integration problems. In this work, we propose a general
approach to modeling and integrating entities from structured data, such as
relational databases, as well as unstructured sources, such as free text from
news articles. Our approach is designed to explicitly model and leverage
relations between entities, thereby using all available information and
preserving as much context as possible. This is achieved by combining siamese
and graph neural networks to propagate information between connected entities
and support high scalability. We evaluate our method on the task of integrating
data about business entities, and we demonstrate that it outperforms standard
rule-based systems, as well as other deep learning approaches that do not use
graph-based representations.
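The combination described above can be illustrated with a minimal sketch: two entity neighbourhood graphs are each embedded by a GCN-style message-passing layer whose weights are shared between the twin branches (the siamese part), and the resulting entity embeddings are compared with cosine similarity to score a potential match. All names, dimensions, and the single-layer architecture here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def gcn_layer(A, X, W):
    # One GCN-style propagation step: D^-1/2 (A + I) D^-1/2 X W, then ReLU.
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(0.0, d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W)

def embed(A, X, W):
    # Mean-pool node representations into a single entity embedding.
    return gcn_layer(A, X, W).mean(axis=0)

def similarity(e1, e2):
    # Cosine similarity between the two branch embeddings.
    return float(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2) + 1e-9))

# Two toy entity graphs (adjacency matrices) with 4-dim node features.
A1 = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
A2 = np.array([[0, 1], [1, 0]], dtype=float)
X1 = rng.normal(size=(3, 4))
X2 = rng.normal(size=(2, 4))

# The same weight matrix W is used for both branches: this weight
# sharing is what makes the architecture siamese.
W = rng.normal(size=(4, 8))
score = similarity(embed(A1, X1, W), embed(A2, X2, W))
print(round(score, 3))
```

In a trained system, W would be learned with a contrastive or matching loss over labelled entity pairs; the sketch only shows how shared-weight graph propagation yields comparable embeddings for entities with different neighbourhood sizes.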
Related papers
- Towards Graph Prompt Learning: A Survey and Beyond [38.55555996765227]
Large-scale "pre-train and prompt learning" paradigms have demonstrated remarkable adaptability.
This survey categorizes over 100 relevant works in this field, summarizing general design principles and the latest applications.
arXiv Detail & Related papers (2024-08-26T06:36:42Z)
- Bridging Local Details and Global Context in Text-Attributed Graphs [62.522550655068336]
GraphBridge is a framework that bridges local and global perspectives by leveraging contextual textual information.
Our method achieves state-of-the-art performance, while our graph-aware token reduction module significantly enhances efficiency and solves scalability issues.
arXiv Detail & Related papers (2024-06-18T13:35:25Z)
- Relational Deep Learning: Graph Representation Learning on Relational Databases [69.7008152388055]
We introduce an end-to-end representation approach to learn on data laid out across multiple tables.
Message Passing Graph Neural Networks can then automatically learn across the graph to extract representations that leverage all data input.
arXiv Detail & Related papers (2023-12-07T18:51:41Z)
- Homological Convolutional Neural Networks [4.615338063719135]
We propose a novel deep learning architecture that exploits the data structural organization through topologically constrained network representations.
We test our model on 18 benchmark datasets against 5 classic machine learning and 3 deep learning models.
arXiv Detail & Related papers (2023-08-26T08:48:51Z)
- Learning Representations without Compositional Assumptions [79.12273403390311]
We propose a data-driven approach that learns feature set dependencies by representing feature sets as graph nodes and their relationships as learnable edges.
We also introduce LEGATO, a novel hierarchical graph autoencoder that learns a smaller, latent graph to aggregate information from multiple views dynamically.
arXiv Detail & Related papers (2023-05-31T10:36:10Z)
- Federated Learning over Harmonized Data Silos [0.7106986689736825]
Federated Learning is a distributed machine learning approach that enables geographically distributed data silos to collaboratively learn a joint machine learning model without sharing data.
We propose an architectural vision for an end-to-end Federated Learning and Integration system, incorporating the critical steps of data harmonization and data imputation.
arXiv Detail & Related papers (2023-05-15T19:55:51Z)
- Deep Transfer Learning for Multi-source Entity Linkage via Domain Adaptation [63.24594955429465]
Multi-source entity linkage is critical in high-impact applications such as data cleaning and user stitching.
AdaMEL is a deep transfer learning framework that learns generic high-level knowledge to perform multi-source entity linkage.
Our framework achieves state-of-the-art results with 8.21% improvement on average over methods based on supervised learning.
arXiv Detail & Related papers (2021-10-27T15:20:41Z)
- End-to-End Hierarchical Relation Extraction for Generic Form Understanding [0.6299766708197884]
We present a novel deep neural network to jointly perform both entity detection and link prediction.
Our model extends the Multi-stage Attentional U-Net architecture with the Part-Intensity Fields and Part-Association Fields for link prediction.
We demonstrate the effectiveness of the model on the Form Understanding in Noisy Scanned Documents dataset.
arXiv Detail & Related papers (2021-06-02T06:51:35Z)
- Business Entity Matching with Siamese Graph Convolutional Networks [0.9786690381850356]
Recent developments in machine learning and in particular deep learning have opened the way to more general and efficient solutions to data-integration tasks.
We demonstrate an approach that allows modeling and integrating entities by leveraging their relations and contextual information.
arXiv Detail & Related papers (2021-05-08T13:47:52Z)
- GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph [53.70520466556453]
We propose GraphFormers, where layerwise GNN components are nested alongside the transformer blocks of language models.
With the proposed architecture, the text encoding and the graph aggregation are fused into an iterative workflow.
In addition, a progressive learning strategy is introduced, where the model is successively trained on manipulated data and original data to reinforce its capability of integrating information on the graph.
arXiv Detail & Related papers (2021-05-06T12:20:41Z) - Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework preserves the relations between samples well.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.