FLAME: Self-Supervised Low-Resource Taxonomy Expansion using Large
Language Models
- URL: http://arxiv.org/abs/2402.13623v1
- Date: Wed, 21 Feb 2024 08:50:40 GMT
- Title: FLAME: Self-Supervised Low-Resource Taxonomy Expansion using Large
Language Models
- Authors: Sahil Mishra, Ujjwal Sudev, Tanmoy Chakraborty
- Abstract summary: Taxonomies find utility in various real-world applications, such as e-commerce search engines and recommendation systems.
Traditional supervised taxonomy expansion approaches encounter difficulties stemming from limited resources.
We propose FLAME, a novel approach for taxonomy expansion in low-resource environments by harnessing the capabilities of large language models.
- Score: 19.863010475923414
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Taxonomies represent an arborescent (tree-like) hierarchical structure that establishes
relationships among entities to convey knowledge within a specific domain. Each
edge in the taxonomy signifies a hypernym-hyponym relationship. Taxonomies find
utility in various real-world applications, such as e-commerce search engines
and recommendation systems. Consequently, there arises a necessity to enhance
these taxonomies over time. However, manually curating taxonomies with neoteric
data presents challenges due to limitations in available human resources and
the exponential growth of data. Therefore, it becomes imperative to develop
automatic taxonomy expansion methods. Traditional supervised taxonomy expansion
approaches encounter difficulties stemming from limited resources, primarily
due to the small size of existing taxonomies. This scarcity of training data
often leads to overfitting. In this paper, we propose FLAME, a novel approach
for taxonomy expansion in low-resource environments by harnessing the
capabilities of large language models that are trained on extensive real-world
knowledge. LLMs help compensate for the scarcity of domain-specific knowledge.
Specifically, FLAME leverages prompting in few-shot settings to extract the
inherent knowledge within the LLMs, ascertaining the hypernym entities within
the taxonomy. Furthermore, it employs reinforcement learning to fine-tune the
large language models, resulting in more accurate predictions. Experiments on
three real-world benchmark datasets demonstrate the effectiveness of FLAME in
real-world scenarios, achieving a remarkable improvement of 18.5% in accuracy
and 12.3% in Wu & Palmer metric over eight baselines. Furthermore, we elucidate
the strengths and weaknesses of FLAME through an extensive case study, error
analysis and ablation studies on the benchmarks.
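As a concrete illustration of the two ingredients named in the abstract, the sketch below shows how a few-shot hypernym-prediction prompt might be assembled from an existing taxonomy, and how the Wu & Palmer score between a predicted and a gold parent node can be computed from node depths. This is a minimal sketch under assumed interfaces; the prompt wording and all function and variable names (build_hypernym_prompt, parent_of, etc.) are illustrative and not taken from the paper.

```python
# Minimal sketch (not the authors' code): few-shot hypernym prompting and a
# Wu & Palmer score over a toy taxonomy stored as a child -> parent mapping.

def build_hypernym_prompt(query, candidate_parents, demonstrations):
    """Assemble a few-shot prompt asking an LLM to pick the hypernym (parent)
    of `query` from `candidate_parents`; `demonstrations` are (child, parent)
    pairs sampled from the existing taxonomy."""
    lines = ["Choose the most specific hypernym (parent) for the query term."]
    for child, parent in demonstrations:
        lines.append(f"Term: {child}\nParent: {parent}")
    lines.append(f"Term: {query}\nCandidates: {', '.join(candidate_parents)}\nParent:")
    return "\n\n".join(lines)


def ancestors(node, parent_of):
    """Path from `node` up to the root, inclusive of both ends."""
    path = [node]
    while node in parent_of:
        node = parent_of[node]
        path.append(node)
    return path


def wu_palmer(pred, gold, parent_of):
    """Wu & Palmer similarity: 2 * depth(LCA) / (depth(pred) + depth(gold)),
    counting the root as depth 1."""
    up_pred, up_gold = ancestors(pred, parent_of), ancestors(gold, parent_of)
    lca = next(n for n in up_pred if n in up_gold)   # lowest common ancestor
    depth = lambda n: len(ancestors(n, parent_of))
    return 2.0 * depth(lca) / (depth(pred) + depth(gold))


if __name__ == "__main__":
    parent_of = {"laptop": "computer", "camera": "electronics",
                 "computer": "electronics"}
    print(build_hypernym_prompt("tablet", ["computer", "camera"],
                                [("laptop", "computer")]))
    print(wu_palmer("computer", "camera", parent_of))   # 2*1 / (2+2) = 0.5
```

Because the demonstrations are drawn from the existing taxonomy itself rather than from manual labels, this is the sense in which the prompting setup can be called self-supervised.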
Related papers
- Automatic Bottom-Up Taxonomy Construction: A Software Application Domain Study [6.0158981171030685]
Previous research in software application domain classification has faced challenges due to the lack of a proper taxonomy.
This study aims to develop a comprehensive software application domain taxonomy by integrating multiple datasources and leveraging ensemble methods.
arXiv Detail & Related papers (2024-09-24T08:55:07Z)
- Are Large Language Models a Good Replacement of Taxonomies? [25.963448807848746]
Large language models (LLMs) demonstrate an impressive ability to internalize knowledge and answer natural language questions.
We ask whether the schema of a knowledge graph (i.e., a taxonomy) is made obsolete by LLMs.
arXiv Detail & Related papers (2024-06-17T01:21:50Z)
- Creating a Fine Grained Entity Type Taxonomy Using LLMs [0.0]
This study investigates the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy.
Our objective is to construct a comprehensive taxonomy, starting from a broad classification of entity types.
This classification is then progressively refined through iterative prompting techniques, leveraging GPT-4's internal knowledge base.
arXiv Detail & Related papers (2024-02-19T21:32:19Z)
- Taxonomy Enrichment with Text and Graph Vector Representations [61.814256012166794]
We address the problem of taxonomy enrichment which aims at adding new words to the existing taxonomy.
We present a new method that achieves strong results on this task with little effort.
We achieve state-of-the-art results across different datasets and provide an in-depth error analysis of mistakes.
arXiv Detail & Related papers (2022-01-21T09:01:12Z)
- Large-scale Taxonomy Induction Using Entity and Word Embeddings [13.30719395448771]
We propose TIEmb, an approach for automatic subsumption extraction from knowledge bases using entity and text embeddings.
We apply the approach to the WebIsA database, a database of class subsumption relations extracted from a large portion of the World Wide Web, to extract hierarchies in the Person and Place domains.
arXiv Detail & Related papers (2021-05-04T05:53:12Z)
- Who Should Go First? A Self-Supervised Concept Sorting Model for Improving Taxonomy Expansion [50.794640012673064]
As data and business scope grow in real applications, existing taxonomies need to be expanded to incorporate new concepts.
Previous works on taxonomy expansion process the new concepts independently and simultaneously, ignoring the potential relationships among them and the appropriate order of inserting operations.
We propose TaxoOrder, a novel self-supervised framework that simultaneously discovers the local hypernym-hyponym structure among new concepts and decides the order of insertion.
arXiv Detail & Related papers (2021-04-08T11:00:43Z)
- Studying Taxonomy Enrichment on Diachronic WordNet Versions [70.27072729280528]
We explore the possibilities of taxonomy extension in a resource-poor setting and present methods which are applicable to a large number of languages.
We create novel English and Russian datasets for training and evaluating taxonomy enrichment models and describe a technique of creating such datasets for other languages.
arXiv Detail & Related papers (2020-11-23T16:49:37Z)
- Octet: Online Catalog Taxonomy Enrichment with Self-Supervision [67.26804972901952]
We present a self-supervised end-to-end framework, Octet, for Online Catalog Taxonomy EnrichmenT.
We propose to train a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure.
Octet expands an online catalog taxonomy in production to 2 times its original size in the open-world evaluation.
arXiv Detail & Related papers (2020-06-18T04:53:07Z)
- STEAM: Self-Supervised Taxonomy Expansion with Mini-Paths [53.45704816829921]
We propose a self-supervised taxonomy expansion model named STEAM.
STEAM generates natural self-supervision signals, and formulates a node attachment prediction task.
Experiments show STEAM outperforms state-of-the-art methods for taxonomy expansion by 11.6% in accuracy and 7.0% in mean reciprocal rank.
arXiv Detail & Related papers (2020-06-18T00:32:53Z)
- TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network [62.12557274257303]
Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications.
We propose a novel self-supervised framework, named TaxoExpan, which automatically generates a set of ⟨query concept, anchor concept⟩ pairs from the existing taxonomy as training data (a toy sketch of this pair-generation step follows after the related papers list).
We develop two innovative techniques in TaxoExpan: (1) a position-enhanced graph neural network that encodes the local structure of an anchor concept in the existing taxonomy, and (2) a noise-robust training objective that enables the learned model to be insensitive to the label noise in the self-supervision data.
arXiv Detail & Related papers (2020-01-26T21:30:21Z)
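To make the self-supervision idea shared by TaxoExpan, STEAM, and FLAME concrete, the toy sketch below turns an existing taxonomy into ⟨query concept, anchor concept⟩ training pairs by pairing each node with its true parent as a positive anchor and with randomly sampled other nodes as negatives. The sampling scheme and all names are assumptions for illustration, not the exact procedure of any of the cited papers.

```python
# Illustrative sketch: turn an existing taxonomy (child -> parent) into
# (query concept, anchor concept, label) triples with no manual annotation.
import random

def make_self_supervision(parent_of, num_negatives=2, seed=0):
    rng = random.Random(seed)
    nodes = set(parent_of) | set(parent_of.values())
    triples = []
    for query, true_parent in parent_of.items():
        # Positive pair: the node's real parent is the correct anchor.
        triples.append((query, true_parent, 1))
        # Negative pairs: other nodes sampled as incorrect anchors.
        candidates = [n for n in nodes if n not in (query, true_parent)]
        for anchor in rng.sample(candidates, min(num_negatives, len(candidates))):
            triples.append((query, anchor, 0))
    return triples

if __name__ == "__main__":
    toy = {"laptop": "computer", "camera": "electronics",
           "computer": "electronics"}
    for query, anchor, label in make_self_supervision(toy):
        print(f"{query} -> {anchor}  label={label}")
```

Negatives drawn this way can occasionally be valid ancestors of the query, which is the label noise that TaxoExpan's noise-robust training objective is designed to tolerate.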