Entity Set Co-Expansion in StackOverflow
- URL: http://arxiv.org/abs/2212.02271v1
- Date: Mon, 5 Dec 2022 13:50:35 GMT
- Title: Entity Set Co-Expansion in StackOverflow
- Authors: Yu Zhang, Yunyi Zhang, Yucheng Jiang, Martin Michalski, Yu Deng,
Lucian Popa, ChengXiang Zhai, Jiawei Han
- Abstract summary: Given a few seed entities of a certain type, entity set expansion aims to discover an extensive set of entities that share the same type as the seeds.
We study the entity set co-expansion task in StackOverflow, which extracts Library, OS, Application, and Language entities from StackOverflow question-answer threads.
During the co-expansion process, we use PLMs to derive embeddings of candidate entities for calculating similarities between entities.
- Score: 49.64523055423687
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given a few seed entities of a certain type (e.g., Software or Programming
Language), entity set expansion aims to discover an extensive set of entities
that share the same type as the seeds. Entity set expansion in software-related
domains such as StackOverflow can benefit many downstream tasks (e.g., software
knowledge graph construction) and facilitate better IT operations and service
management. Meanwhile, existing approaches are less concerned with two
problems: (1) How to deal with multiple types of seed entities simultaneously?
(2) How to leverage the power of pre-trained language models (PLMs)? Being
aware of these two problems, in this paper, we study the entity set
co-expansion task in StackOverflow, which extracts Library, OS, Application,
and Language entities from StackOverflow question-answer threads. During the
co-expansion process, we use PLMs to derive embeddings of candidate entities
for calculating similarities between entities. Experimental results show that
our proposed SECoExpan framework outperforms previous approaches significantly.
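The similarity step described in the abstract can be illustrated with a minimal sketch. This is not the SECoExpan framework itself: the hand-written 3-dimensional vectors stand in for PLM-derived embeddings, and the function names (`cosine`, `co_expand`), the mean-similarity scoring, and the rule that each candidate joins only its best-matching type are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def co_expand(seeds, embeddings, top_k=1):
    """Expand several seed sets at once (hypothetical helper).

    Each non-seed candidate is scored against every type by its mean
    cosine similarity to that type's seeds, assigned to the single
    best-scoring type, and the top_k candidates per type are returned.
    """
    seed_names = {name for names in seeds.values() for name in names}
    scored = {entity_type: [] for entity_type in seeds}
    for candidate, vector in embeddings.items():
        if candidate in seed_names:
            continue
        sims = {
            entity_type: sum(cosine(vector, embeddings[s]) for s in names) / len(names)
            for entity_type, names in seeds.items()
        }
        best_type = max(sims, key=sims.get)
        scored[best_type].append((sims[best_type], candidate))
    return {
        entity_type: [name for _, name in sorted(items, reverse=True)[:top_k]]
        for entity_type, items in scored.items()
    }

# Toy vectors (assumed values); a real system would obtain these from a PLM.
embeddings = {
    "python": [0.90, 0.10, 0.00],
    "java": [0.80, 0.20, 0.10],
    "ruby": [0.85, 0.15, 0.05],
    "linux": [0.10, 0.90, 0.10],
    "windows": [0.00, 0.80, 0.20],
    "macos": [0.05, 0.85, 0.15],
    "numpy": [0.30, 0.10, 0.90],
    "pandas": [0.20, 0.00, 0.95],
    "jquery": [0.25, 0.05, 0.90],
}
seeds = {
    "Language": ["python", "java"],
    "OS": ["linux", "windows"],
    "Library": ["numpy", "pandas"],
}

print(co_expand(seeds, embeddings))
# {'Language': ['ruby'], 'OS': ['macos'], 'Library': ['jquery']}
```

Letting the types compete for each candidate is one simple way to handle several seed sets simultaneously, since an entity that fits one type well is kept out of the others.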
Related papers
- OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting [49.655711022673046]
OneNet is an innovative framework that utilizes the few-shot learning capabilities of Large Language Models (LLMs) without the need for fine-tuning.
OneNet is structured around three key components prompted by LLMs: (1) an entity reduction processor that simplifies inputs by summarizing and filtering out irrelevant entities, (2) a dual-perspective entity linker that combines contextual cues and prior knowledge for precise entity linking, and (3) an entity consensus judger that employs a unique consistency algorithm to alleviate the hallucination in the entity linking reasoning.
(2024-10-10T02:45:23Z)
- Effective Two-Stage Knowledge Transfer for Multi-Entity Cross-Domain Recommendation [26.027055867654866]
We propose a pre-training & fine-tuning based Multi-entity Knowledge Transfer framework called MKT.
MKT utilizes a multi-entity pre-training module to extract transferable knowledge across different entities.
In the end, the extracted common knowledge is adopted for target entity model training.
(2024-02-29T12:29:58Z)
- Two Heads Are Better Than One: Integrating Knowledge from Knowledge Graphs and Large Language Models for Entity Alignment [31.70064035432789]
We propose a Large Language Model-enhanced Entity Alignment framework (LLMEA).
LLMEA identifies candidate alignments for a given entity by considering both embedding similarities between entities across Knowledge Graphs and edit distances to a virtual equivalent entity.
Experiments conducted on three public datasets reveal that LLMEA surpasses leading baseline models.
(2024-01-30T12:41:04Z)
- Seed-Guided Fine-Grained Entity Typing in Science and Engineering Domains [51.02035914828596]
We study the task of seed-guided fine-grained entity typing in science and engineering domains.
We propose SEType which first enriches the weak supervision by finding more entities for each seen type from an unlabeled corpus.
It then matches the enriched entities to unlabeled text to get pseudo-labeled samples and trains a textual entailment model that can make inferences for both seen and unseen types.
(2024-01-23T22:36:03Z)
- DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition [94.90258603217008]
The MultiCoNER II shared task aims to tackle multilingual named entity recognition (NER) in fine-grained and noisy scenarios.
Previous top systems in MultiCoNER I incorporate either knowledge bases or gazetteers.
We propose a unified retrieval-augmented system (U-RaNER) for fine-grained multilingual NER.
(2023-05-05T16:59:26Z)
- PIE: a Parameter and Inference Efficient Solution for Large Scale Knowledge Graph Embedding Reasoning [24.29409958504209]
We propose PIE, a parameter and inference efficient solution.
Inspired by tensor decomposition methods, we find that decomposing the entity embedding matrix into low-rank matrices can remove more than half of the parameters.
To accelerate model inference, we propose a self-supervised auxiliary task, which can be seen as fine-grained entity typing.
(2022-04-29T09:06:56Z)
- Contrastive Learning with Hard Negative Entities for Entity Set Expansion [29.155036098444008]
Various NLP and IR applications will benefit from ESE due to its ability to discover knowledge.
We devise an entity-level masked language model with contrastive learning to refine the representation of entities.
In addition, we propose ProbExpan, a novel probabilistic ESE framework that uses the entity representations obtained by the aforementioned language model to expand entities.
(2022-04-16T12:26:42Z)
- Parallel Instance Query Network for Named Entity Recognition [73.30174490672647]
Named entity recognition (NER) is a fundamental task in natural language processing.
Recent works treat named entity recognition as a reading comprehension task, constructing type-specific queries manually to extract entities.
We propose Parallel Instance Query Network (PIQN), which sets up global and learnable instance queries to extract entities in a parallel manner.
(2022-03-20T13:01:25Z)
- Empower Entity Set Expansion via Language Model Probing [58.78909391545238]
Existing set expansion methods bootstrap the seed entity set by adaptively selecting context features and extracting new entities.
A key challenge for entity set expansion is to avoid selecting ambiguous context features which will shift the class semantics and lead to accumulative errors in later iterations.
We propose a novel iterative set expansion framework that leverages automatically generated class names to address the semantic drift issue.
(2020-04-29T00:09:43Z)