Using Zero-shot Prompting in the Automatic Creation and Expansion of
Topic Taxonomies for Tagging Retail Banking Transactions
- URL: http://arxiv.org/abs/2401.06790v2
- Date: Sun, 11 Feb 2024 15:54:58 GMT
- Title: Using Zero-shot Prompting in the Automatic Creation and Expansion of
Topic Taxonomies for Tagging Retail Banking Transactions
- Authors: Daniel de S. Moraes, Pedro T. C. Santos, Polyana B. da Costa, Matheus
A. S. Pinto, Ivan de J. P. Pinto, Álvaro M. G. da Veiga, Sergio Colcher,
Antonio J. G. Busson, Rafael H. Rocha, Rennan Gaio, Rafael Miceli, Gabriela
Tourinho, Marcos Rabaioli, Leandro Santos, Fellipe Marques, David Favaro
- Abstract summary: This work presents an unsupervised method for constructing and expanding topic taxonomies using instruction-based fine-tuned LLMs (Large Language Models).
To expand an existing taxonomy with new terms, we use zero-shot prompting to find out where to add new nodes.
We use the resulting taxonomies to assign tags that characterize merchants from a retail bank dataset.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents an unsupervised method for automatically constructing and
expanding topic taxonomies using instruction-based fine-tuned LLMs (Large
Language Models). We apply topic modeling and keyword extraction techniques to
create initial topic taxonomies and LLMs to post-process the resulting terms
and create a hierarchy. To expand an existing taxonomy with new terms, we use
zero-shot prompting to find out where to add new nodes, which, to our
knowledge, is the first work to present such an approach to taxonomy tasks. We
use the resulting taxonomies to assign tags that characterize merchants from a
retail bank dataset. To evaluate our work, we asked 12 volunteers to answer a
two-part form in which we first assessed the quality of the taxonomies created
and then the tags assigned to merchants based on that taxonomy. The evaluation
revealed a coherence rate exceeding 90% for the chosen taxonomies. The
taxonomies' expansion with LLMs also showed exciting results for parent node
prediction, with an f1-score above 70% in our taxonomies.
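As a rough illustration of the expansion step described in the abstract, the sketch below builds a zero-shot prompt that asks a model which existing node should become the parent of a new term, maps the free-text answer back to a node, and scores predictions with F1. It is a minimal sketch under stated assumptions, not the authors' implementation: the prompt wording, the function names (`build_parent_prompt`, `parse_parent`, `macro_f1`), the example node names, and the choice of macro-averaged F1 are all hypothetical, and no particular LLM API is assumed.

```python
from typing import List, Optional


def build_parent_prompt(new_term: str, candidates: List[str]) -> str:
    """Build a zero-shot prompt (no examples) for parent node prediction.

    The wording is an assumption made for illustration only.
    """
    options = "\n".join(f"- {c}" for c in candidates)
    return (
        "You are organizing a topic taxonomy.\n"
        f"Existing nodes:\n{options}\n"
        f"New term: {new_term}\n"
        "Answer with the single existing node that is the best parent "
        "for the new term."
    )


def parse_parent(answer: str, candidates: List[str]) -> Optional[str]:
    """Map a free-text model answer to a candidate node, or None if no match."""
    lowered = answer.lower()
    matches = [c for c in candidates if c.lower() in lowered]
    # Prefer the longest match so that "Fast Food" wins over "Food".
    return max(matches, key=len) if matches else None


def macro_f1(gold: List[str], predicted: List[Optional[str]]) -> float:
    """Macro-averaged F1 over the gold parent labels (assumed averaging)."""
    scores = []
    for label in set(gold):
        pairs = list(zip(gold, predicted))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    nodes = ["Food", "Fast Food", "Travel"]  # hypothetical taxonomy nodes
    print(build_parent_prompt("pizzeria", nodes))
    # A model's reply would be passed to parse_parent; here it is hard-coded.
    print(parse_parent("The best parent is Fast Food.", nodes))
```

In practice the prompt string would be sent to an instruction-tuned model, and the parsed answers for a held-out set of terms would be compared against their known parents with `macro_f1`.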
Related papers
- Creating a Fine Grained Entity Type Taxonomy Using LLMs [0.0]
This study investigates the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy.
Our objective is to construct a comprehensive taxonomy, starting from a broad classification of entity types.
This classification is then progressively refined through iterative prompting techniques, leveraging GPT-4's internal knowledge base.
arXiv Detail & Related papers (2024-02-19T21:32:19Z)
- Chain-of-Layer: Iteratively Prompting Large Language Models for Taxonomy Induction from Limited Examples [34.88498567698853]
Chain-of-Layer is an in-context learning framework designed to induce taxonomies from a given set of entities.
We show that Chain-of-Layer achieves state-of-the-art performance on four real-world benchmarks.
arXiv Detail & Related papers (2024-02-12T03:05:54Z)
- Towards Visual Taxonomy Expansion [50.462998483087915]
We propose Visual Taxonomy Expansion (VTE), introducing visual features into the taxonomy expansion task.
We propose a textual hypernymy learning task and a visual prototype learning task to cluster textual and visual semantics.
Our method is evaluated on two datasets, where we obtain compelling results.
arXiv Detail & Related papers (2023-09-12T10:17:28Z)
- TaxoCom: Topic Taxonomy Completion with Hierarchical Discovery of Novel
Topic Clusters [57.59286394188025]
We propose a novel framework for topic taxonomy completion, named TaxoCom.
TaxoCom discovers novel sub-topic clusters of terms and documents.
Our comprehensive experiments on two real-world datasets demonstrate that TaxoCom generates a high-quality topic taxonomy in terms of term coherency and topic coverage.
arXiv Detail & Related papers (2022-01-18T07:07:38Z)
- Who Should Go First? A Self-Supervised Concept Sorting Model for
Improving Taxonomy Expansion [50.794640012673064]
As data and business scope grow in real applications, existing taxonomies need to be expanded to incorporate new concepts.
Previous works on taxonomy expansion process the new concepts independently and simultaneously, ignoring the potential relationships among them and the appropriate order of inserting operations.
We propose TaxoOrder, a novel self-supervised framework that simultaneously discovers the local hypernym-hyponym structure among new concepts and decides the order of insertion.
arXiv Detail & Related papers (2021-04-08T11:00:43Z)
- Can Taxonomy Help? Improving Semantic Question Matching using Question
Taxonomy [37.57300969050908]
We propose a hybrid technique for semantic question matching.
It uses our proposed two-layered taxonomy for English questions by augmenting state-of-the-art deep learning models with question classes obtained from a deep learning based question classifier.
arXiv Detail & Related papers (2021-01-20T16:23:04Z)
- Octet: Online Catalog Taxonomy Enrichment with Self-Supervision [67.26804972901952]
We present a self-supervised end-to-end framework, Octet, for Online Catalog Taxonomy EnrichmenT.
We propose to train a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure.
In the open-world evaluation, Octet enriches an online catalog in production to twice its original size.
arXiv Detail & Related papers (2020-06-18T04:53:07Z)
- STEAM: Self-Supervised Taxonomy Expansion with Mini-Paths [53.45704816829921]
We propose a self-supervised taxonomy expansion model named STEAM.
STEAM generates natural self-supervision signals, and formulates a node attachment prediction task.
Experiments show STEAM outperforms state-of-the-art methods for taxonomy expansion by 11.6% in accuracy and 7.0% in mean reciprocal rank.
arXiv Detail & Related papers (2020-06-18T00:32:53Z)
- TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced
Graph Neural Network [62.12557274257303]
Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications.
We propose a novel self-supervised framework, named TaxoExpan, which automatically generates a set of <query concept, anchor concept> pairs from the existing taxonomy as training data.
We develop two innovative techniques in TaxoExpan: (1) a position-enhanced graph neural network that encodes the local structure of an anchor concept in the existing taxonomy, and (2) a noise-robust training objective that enables the learned model to be insensitive to the label noise in the self-supervision data.
arXiv Detail & Related papers (2020-01-26T21:30:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.