SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and
Synonym Discovery
- URL: http://arxiv.org/abs/2009.13827v1
- Date: Tue, 29 Sep 2020 07:32:17 GMT
- Title: SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and
Synonym Discovery
- Authors: Jiaming Shen and Wenda Qiu and Jingbo Shang and Michelle Vanni and
Xiang Ren and Jiawei Han
- Abstract summary: SynSetExpan is a novel framework that enables entity set expansion and synonym discovery to enhance each other.
We create the first large-scale Synonym-Enhanced Set Expansion dataset via crowdsourcing.
Experiments on the SE2 dataset and previous benchmarks demonstrate the effectiveness of SynSetExpan for both entity set expansion and synonym discovery tasks.
- Score: 66.24624547470175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Entity set expansion and synonym discovery are two critical NLP tasks.
Previous studies accomplish them separately, without exploring their
interdependencies. In this work, we hypothesize that these two tasks are
tightly coupled because two synonymous entities tend to have similar
likelihoods of belonging to various semantic classes. This motivates us to
design SynSetExpan, a novel framework that enables two tasks to mutually
enhance each other. SynSetExpan uses a synonym discovery model to include
popular entities' infrequent synonyms into the set, which boosts the set
expansion recall. Meanwhile, the set expansion model, being able to determine
whether an entity belongs to a semantic class, can generate pseudo training
data to fine-tune the synonym discovery model towards better accuracy. To
facilitate the research on studying the interplays of these two tasks, we
create the first large-scale Synonym-Enhanced Set Expansion (SE2) dataset via
crowdsourcing. Extensive experiments on the SE2 dataset and previous benchmarks
demonstrate the effectiveness of SynSetExpan for both entity set expansion and
synonym discovery tasks.
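The interplay described in the abstract can be illustrated with a toy sketch of one joint round. This is not the authors' implementation: the function `joint_round`, the hard-coded scores, the thresholds, and the rule for building pseudo training pairs are all assumptions made for illustration. The abstract says only that the set expansion model generates pseudo training data, so the negative-pair rule below is one plausible reading of the stated hypothesis (synonyms should have similar class-membership likelihoods).

```python
from typing import Callable, List, Set, Tuple

# Hypothetical scorer types: the real models are learned, these are stand-ins.
ClassScorer = Callable[[str, Set[str]], float]   # P(entity belongs to the class)
SynScorer = Callable[[str, str], float]          # P(two entities are synonyms)


def joint_round(
    current: Set[str],
    vocab: List[str],
    class_score: ClassScorer,
    syn_score: SynScorer,
    top_k: int = 1,
    syn_threshold: float = 0.5,
    gap: float = 0.5,
) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """One illustrative round of joint set expansion and synonym discovery."""
    # Step 1 (set expansion): add the top-k candidates by class-membership score.
    outside = sorted(e for e in vocab if e not in current)
    ranked = sorted(outside, key=lambda e: class_score(e, current), reverse=True)
    expanded = set(current) | set(ranked[:top_k])

    # Step 2 (synonym discovery): pull in low-scoring candidates that are
    # likely synonyms of an entity already in the set. This is what is meant
    # to recover infrequent synonyms and raise recall.
    for cand in outside:
        if cand not in expanded and any(
            syn_score(cand, e) >= syn_threshold for e in expanded
        ):
            expanded.add(cand)

    # Step 3 (pseudo training data, ASSUMED rule): pairs whose class scores
    # differ by at least `gap` are treated as pseudo non-synonyms, which
    # could then be used to fine-tune the synonym model.
    pseudo_negatives = [
        (a, b)
        for a in sorted(expanded)
        for b in sorted(vocab)
        if b not in expanded
        and class_score(a, current) - class_score(b, current) >= gap
    ]
    return expanded, pseudo_negatives


# Toy scores (made up): "USA" is an infrequent synonym with a weak class signal.
CLASS = {"United States": 1.0, "Canada": 0.9, "USA": 0.4, "Toronto": 0.1}
SYNONYMS = {frozenset({"USA", "United States"}): 0.9}

expanded, negatives = joint_round(
    current={"United States"},
    vocab=list(CLASS),
    class_score=lambda e, _set: CLASS[e],
    syn_score=lambda a, b: SYNONYMS.get(frozenset({a, b}), 0.0),
)
print(sorted(expanded))
print(negatives)
```

In this toy run, "USA" misses the top-k cut in step 1 but is still added in step 2 through its synonym link to "United States", while "Toronto" stays out and contributes pseudo negative pairs.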
Related papers
- Dual Encoder: Exploiting the Potential of Syntactic and Semantic for
Aspect Sentiment Triplet Extraction [19.375196127313348]
Aspect Sentiment Triplet Extraction (ASTE) is an emerging task in fine-grained sentiment analysis.
We propose a dual-channel encoder with a BERT channel to capture semantic information, and an enhanced LSTM channel for comprehensive syntactic information capture.
We leverage the synergy of these modules to harness the significant potential of syntactic and semantic information in ASTE tasks.
arXiv Detail & Related papers (2024-02-23T15:07:13Z)
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics
Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
- GDA: Generative Data Augmentation Techniques for Relation Extraction
Tasks [81.51314139202152]
We propose a dedicated augmentation technique for relational texts, named GDA, which uses two complementary modules to preserve both semantic consistency and syntax structures.
Experimental results on three datasets under a low-resource setting showed that GDA could bring 2.0% F1 improvements compared with no augmentation technique.
arXiv Detail & Related papers (2023-05-26T06:21:01Z)
- Iteratively Improving Biomedical Entity Linking and Event Extraction via
Hard Expectation-Maximization [9.422435686239538]
Biomedical entity linking and event extraction are two crucial tasks to support text understanding and retrieval in the biomedical domain.
Previous research typically solves these two tasks separately or in a pipeline, leading to error propagation.
We propose joint biomedical entity linking and event extraction by regarding the event structures and entity references in knowledge bases as latent variables.
arXiv Detail & Related papers (2023-05-24T02:30:31Z)
- From Alignment to Entailment: A Unified Textual Entailment Framework for
Entity Alignment [17.70562397382911]
Existing methods usually encode the triples of entities as embeddings and learn to align the embeddings.
We transform both triples into unified textual sequences, and model the EA task as a bi-directional textual entailment task.
Our approach captures the unified correlation pattern of two kinds of information between entities, and explicitly models the fine-grained interaction between original entity information.
arXiv Detail & Related papers (2023-05-19T08:06:50Z)
- KGSynNet: A Novel Entity Synonyms Discovery Framework with Knowledge
Graph [23.053995137917994]
We propose a novel entity synonyms discovery framework, named KGSynNet.
Specifically, we pre-train subword embeddings for mentions and entities using a large-scale domain-specific corpus.
We employ a specifically designed fusion gate to adaptively absorb the entities' knowledge information into their semantic features.
arXiv Detail & Related papers (2021-03-16T07:32:33Z)
- A Co-Interactive Transformer for Joint Slot Filling and Intent Detection [61.109486326954205]
Intent detection and slot filling are two main tasks for building a spoken language understanding (SLU) system.
Previous studies either model the two tasks separately or only consider the single information flow from intent to slot.
We propose a Co-Interactive Transformer to consider the cross-impact between the two tasks simultaneously.
arXiv Detail & Related papers (2020-10-08T10:16:52Z)
- Empower Entity Set Expansion via Language Model Probing [58.78909391545238]
Existing set expansion methods bootstrap the seed entity set by adaptively selecting context features and extracting new entities.
A key challenge for entity set expansion is to avoid selecting ambiguous context features which will shift the class semantics and lead to accumulative errors in later iterations.
We propose a novel iterative set expansion framework that leverages automatically generated class names to address the semantic drift issue.
arXiv Detail & Related papers (2020-04-29T00:09:43Z)
- CASE: Context-Aware Semantic Expansion [68.30244980290742]
This paper defines and studies a new task called Context-Aware Semantic Expansion (CASE).
Given a seed term in a sentential context, we aim to suggest other terms that fit the context as well as the seed does.
We show that annotations for this task can be harvested at scale from existing corpora, in a fully automatic manner.
arXiv Detail & Related papers (2019-12-31T06:38:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.