Related papers: Interpretable and Low-Resource Entity Matching via Decoupling Feature Learning from Decision Making

Interpretable and Low-Resource Entity Matching via Decoupling Feature Learning from Decision Making

URL: http://arxiv.org/abs/2106.04174v1
Date: Tue, 8 Jun 2021 08:27:31 GMT
Title: Interpretable and Low-Resource Entity Matching via Decoupling Feature Learning from Decision Making
Authors: Zijun Yao, Chengjiang Li, Tiansi Dong, Xin Lv, Jifan Yu, Lei Hou, Juanzi Li, Yichi Zhang, Zelin Dai
Abstract summary: Entity Matching aims at recognizing entity records that denote the same real-world object. We propose a novel EM framework that consists of Heterogeneous Information Fusion (HIF) and Key Attribute Tree (KAT) Induction. Our method is highly efficient and outperforms SOTA EM models in most cases.
Score: 22.755892575582788
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Entity Matching (EM) aims at recognizing entity records that denote the same real-world object. Neural EM models learn vector representation of entity descriptions and match entities end-to-end. Though robust, these methods require many resources for training, and lack of interpretability. In this paper, we propose a novel EM framework that consists of Heterogeneous Information Fusion (HIF) and Key Attribute Tree (KAT) Induction to decouple feature representation from matching decision. Using self-supervised learning and mask mechanism in pre-trained language modeling, HIF learns the embeddings of noisy attribute values by inter-attribute attention with unlabeled data. Using a set of comparison features and a limited amount of annotated data, KAT Induction learns an efficient decision tree that can be interpreted by generating entity matching rules whose structure is advocated by domain experts. Experiments on 6 public datasets and 3 industrial datasets show that our method is highly efficient and outperforms SOTA EM models in most cases. Our codes and datasets can be obtained from https://github.com/THU-KEG/HIF-KAT.

Related papers

Knowledge Graph Completion with Relation-Aware Anchor Enhancement [50.50944396454757]
We propose a relation-aware anchor enhanced knowledge graph completion method (RAA-KGC) We first generate anchor entities within the relation-aware neighborhood of the head entity. Then, by pulling the query embedding towards the neighborhoods of the anchors, it is tuned to be more discriminative for target entity matching.
arXiv Detail & Related papers (2025-04-08T15:22:08Z)
LLM-Align: Utilizing Large Language Models for Entity Alignment in Knowledge Graphs [22.621781704528786]
Embedding-based entity alignment (EA) has recently gained considerable attention. EA seeks to identify and match corresponding entities across different Knowledge Graphs (KGs)
arXiv Detail & Related papers (2024-12-06T01:05:37Z)
Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data. We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation. Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning [32.62763647036567]
Few-shot named entity recognition can identify new types of named entities based on a few labeled examples. We propose the Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning (MsFNER) MsFNER splits the general NER into two stages: entity-span detection and entity classification.
arXiv Detail & Related papers (2024-04-10T12:31:09Z)
Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction. We reformulate the task to be entity-centric, enabling the use of diverse metrics. We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
arXiv Detail & Related papers (2024-02-06T22:15:09Z)
Two Heads Are Better Than One: Integrating Knowledge from Knowledge Graphs and Large Language Models for Entity Alignment [31.70064035432789]
We propose a Large Language Model-enhanced Entity Alignment framework (LLMEA) LLMEA identifies candidate alignments for a given entity by considering both embedding similarities between entities across Knowledge Graphs and edit distances to a virtual equivalent entity. Experiments conducted on three public datasets reveal that LLMEA surpasses leading baseline models.
arXiv Detail & Related papers (2024-01-30T12:41:04Z)
EchoEA: Echo Information between Entities and Relations for Entity Alignment [1.1470070927586016]
We propose a novel framework, Echo Entity Alignment (EchoEA), which leverages self-attention mechanism to spread entity information to relations and echo back to entities. The experimental results on three real-world cross-lingual datasets are stable at around 96% at hits@1 on average.
arXiv Detail & Related papers (2021-07-07T07:34:21Z)
Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings. We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data. We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task. The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them. By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
Exploring and Evaluating Attributes, Values, and Structures for Entity Alignment [100.19568734815732]
Entity alignment (EA) aims at building a unified Knowledge Graph (KG) of rich content by linking the equivalent entities from various KGs. attribute triples can also provide crucial alignment signal but have not been well explored yet. We propose to utilize an attributed value encoder and partition the KG into subgraphs to model the various types of attribute triples efficiently.
arXiv Detail & Related papers (2020-10-07T08:03:58Z)
Interpretable Entity Representations through Large-Scale Typing [61.4277527871572]
We present an approach to creating entity representations that are human readable and achieve high performance out of the box. Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types. We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
arXiv Detail & Related papers (2020-04-30T23:58:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.