Prompting or Fine-tuning? A Comparative Study of Large Language Models for Taxonomy Construction
- URL: http://arxiv.org/abs/2309.01715v1
- Date: Mon, 4 Sep 2023 16:53:17 GMT
- Title: Prompting or Fine-tuning? A Comparative Study of Large Language Models for Taxonomy Construction
- Authors: Boqi Chen, Fandi Yi, Dániel Varró
- Abstract summary: We present a general framework for taxonomy construction that takes into account structural constraints.
We compare the prompting and fine-tuning approaches performed on a hypernym taxonomy and a novel computer science taxonomy dataset.
- Score: 0.8670827427401335
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Taxonomies represent hierarchical relations between entities, frequently
applied in various software modeling and natural language processing (NLP)
activities. They are typically subject to a set of structural constraints
restricting their content. However, manual taxonomy construction can be
time-consuming, incomplete, and costly to maintain. Recent studies of large
language models (LLMs) have demonstrated that appropriate user inputs (called
prompting) can effectively guide LLMs, such as GPT-3, in diverse NLP tasks
without explicit (re-)training. However, existing approaches for automated
taxonomy construction typically involve fine-tuning a language model by
adjusting model parameters. In this paper, we present a general framework for
taxonomy construction that takes into account structural constraints. We
subsequently conduct a systematic comparison between the prompting and
fine-tuning approaches performed on a hypernym taxonomy and a novel computer
science taxonomy dataset. Our results reveal the following: (1) Even without
explicit training on the dataset, the prompting approach outperforms
fine-tuning-based approaches. Moreover, the performance gap between prompting
and fine-tuning widens when the training dataset is small. However, (2)
taxonomies generated by the fine-tuning approach can be easily post-processed
to satisfy all the constraints, whereas handling violations of the taxonomies
produced by the prompting approach can be challenging. These evaluation
findings provide guidance on selecting the appropriate method for taxonomy
construction and highlight potential enhancements for both approaches.
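As an illustration of finding (2), here is a minimal Python sketch of constraint checking over a generated taxonomy. The constraint set (one parent per node, no self-loops, no cycles) is an assumption for illustration, not the paper's exact definition.

```python
def find_violations(edges):
    """Check a taxonomy, given as (child, parent) pairs, against an
    assumed constraint set: one parent per node, no self-loops, no cycles."""
    violations, parent = [], {}
    for child, par in edges:
        if child == par:
            violations.append(("self-loop", child))
        if child in parent:
            violations.append(("multiple-parents", child))
        parent[child] = par
    # With at most one parent per node, walking upward either reaches a
    # root or revisits a node, which signals a cycle.
    for start in parent:
        seen, node = set(), start
        while node in parent:
            if node in seen:
                violations.append(("cycle", start))
                break
            seen.add(node)
            node = parent[node]
    return violations

edges = [("cat", "mammal"), ("mammal", "animal"), ("animal", "mammal")]
print(find_violations(edges))  # reports a cycle starting from each node on it
```

Edge-level output from a fine-tuned model maps directly onto such pairs, so offending edges can be dropped or reattached mechanically; free-form prompted output must first be parsed back into pairs, which is one plausible reason finding (2) holds.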
Related papers
- How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics [49.9329723199239]
We propose a method for the automated creation of a challenging test set without relying on the manual construction of artificial and unrealistic examples.
We categorize the test set of popular NLI datasets into three difficulty levels by leveraging methods that exploit training dynamics.
When our characterization method is applied to the training set, models trained with only a fraction of the data achieve comparable performance to those trained on the full dataset.
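The summary leaves the training-dynamics signal unspecified; a common choice, assumed here purely for illustration, is per-example confidence and variability across training epochs, in the style of dataset cartography.

```python
import numpy as np

# Hypothetical per-epoch probabilities assigned to the gold label
# (rows: training examples, columns: epochs).
gold_probs = np.array([
    [0.90, 0.92, 0.95],  # consistently right -> "easy"
    [0.20, 0.50, 0.80],  # learned late and unstably -> "ambiguous"
    [0.10, 0.15, 0.10],  # never learned -> "hard"
])

confidence = gold_probs.mean(axis=1)   # mean gold-label probability
variability = gold_probs.std(axis=1)   # spread across epochs

# Illustrative thresholds; the paper's actual binning may differ.
for conf, var in zip(confidence, variability):
    level = "easy" if conf > 0.7 else "hard" if var < 0.2 else "ambiguous"
    print(f"confidence={conf:.2f} variability={var:.2f} -> {level}")
```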
arXiv Detail & Related papers (2024-10-04T13:39:21Z)
- Automatic Bottom-Up Taxonomy Construction: A Software Application Domain Study [6.0158981171030685]
Previous research in software application domain classification has faced challenges due to the lack of a proper taxonomy.
This study aims to develop a comprehensive software application domain taxonomy by integrating multiple datasources and leveraging ensemble methods.
arXiv Detail & Related papers (2024-09-24T08:55:07Z)
- CodeTaxo: Enhancing Taxonomy Expansion with Limited Examples via Code Language Prompts [40.52605902842168]
CodeTaxo is a novel approach that leverages large language models through code language prompts to capture the taxonomic structure.
Experiments on five real-world benchmarks from different domains demonstrate that CodeTaxo consistently achieves superior performance across all evaluation metrics.
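The summary does not give the exact prompt format; the sketch below is a hedged illustration of the general idea of serializing the known taxonomy as code so the model completes it in kind (class names and the query concept are made up).

```python
# Known taxonomy as child -> parent (None marks the root); all made up.
known = {"science": None, "physics": "science", "chemistry": "science"}
query = "optics"

lines = ["# Each class inherits from its parent concept."]
for child, parent in known.items():
    lines.append(f"class {child}({parent or 'object'}): pass")
lines.append(f"# Complete the next line for the new concept '{query}':")
lines.append(f"class {query}(")

prompt = "\n".join(lines)
print(prompt)  # sent to an LLM; its completion names the parent class
```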
arXiv Detail & Related papers (2024-08-17T02:15:07Z)
- The Art of Saying No: Contextual Noncompliance in Language Models [123.383993700586]
We introduce a comprehensive taxonomy of contextual noncompliance describing when and how models should not comply with user requests.
Our taxonomy spans a wide range of categories including incomplete, unsupported, indeterminate, and humanizing requests.
To test noncompliance capabilities of language models, we use this taxonomy to develop a new evaluation suite of 1000 noncompliance prompts.
arXiv Detail & Related papers (2024-07-02T07:12:51Z)
- Creating a Fine Grained Entity Type Taxonomy Using LLMs [0.0]
This study investigates the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy.
Our objective is to construct a comprehensive taxonomy, starting from a broad classification of entity types.
This classification is then progressively refined through iterative prompting techniques, leveraging GPT-4's internal knowledge base.
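A minimal sketch of such an iterative refinement loop follows; `ask_llm` is a hypothetical stand-in for a GPT-4 call, answered with canned data so the sketch runs on its own.

```python
def ask_llm(prompt: str) -> list[str]:
    """Hypothetical stand-in for a GPT-4 call that proposes subtypes."""
    canned = {
        "person": ["artist", "politician"],
        "artist": ["painter", "musician"],
    }
    return canned.get(prompt.split()[-1], [])

def refine(entity_type: str, depth: int = 0, max_depth: int = 2) -> dict:
    """Recursively prompt for finer-grained subtypes of a broad type."""
    if depth >= max_depth:
        return {}
    subtypes = ask_llm(f"List fine-grained subtypes of {entity_type}")
    return {s: refine(s, depth + 1, max_depth) for s in subtypes}

print(refine("person"))
# {'artist': {'painter': {}, 'musician': {}}, 'politician': {}}
```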
arXiv Detail & Related papers (2024-02-19T21:32:19Z)
- Chain-of-Layer: Iteratively Prompting Large Language Models for Taxonomy Induction from Limited Examples [34.88498567698853]
Chain-of-Layer is an in-context learning framework designed to induce taxonomies from a given set of entities.
We show that Chain-of-Layer achieves state-of-the-art performance on four real-world benchmarks.
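Schematically, the method can be read as a layer-by-layer loop in which the model repeatedly selects, from the remaining entities, the direct children of each node placed so far. The sketch below is a rough illustration; the root and the canned answers stand in for real prompts, and the paper's filtering steps are omitted.

```python
def pick_children(parent: str, candidates: set[str]) -> set[str]:
    """Stand-in for the per-layer LLM prompt that selects which of the
    remaining entities are direct children of `parent`."""
    canned = {
        "science": {"physics", "biology"},
        "physics": {"optics"},
        "biology": {"genetics"},
    }
    return canned.get(parent, set()) & candidates

entities = {"physics", "biology", "optics", "genetics"}
taxonomy, layer = {}, ["science"]
while entities and layer:
    next_layer = []
    for parent in layer:
        children = pick_children(parent, entities)
        taxonomy[parent] = sorted(children)
        entities -= children
        next_layer.extend(children)
    layer = next_layer

print(taxonomy)
# e.g. {'science': ['biology', 'physics'], 'physics': ['optics'], 'biology': ['genetics']}
```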
arXiv Detail & Related papers (2024-02-12T03:05:54Z)
- Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
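A hedged sketch of the triplet-to-context step follows; the instruction wording is an assumption, not the paper's actual template.

```python
def contextualize(head: str, relation: str, tail: str) -> str:
    """Build an instruction asking an LLM to expand a compact triplet
    into a context-rich passage (wording is illustrative)."""
    return (
        f"Write a short paragraph that explains the fact "
        f"({head}, {relation}, {tail}) in natural language, including "
        f"relevant background about both entities."
    )

prompt = contextualize("Marie Curie", "award_received", "Nobel Prize in Physics")
print(prompt)
# The LLM's paragraph then serves as auxiliary training text for the
# downstream KGC model, whether discriminative or generative.
```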
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
- Guiding Language Model Reasoning with Planning Tokens [122.43639723387516]
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks.
We propose a hierarchical generation scheme to encourage a more structural generation of chain-of-thought steps.
Our approach requires a negligible increase in trainable parameters (0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme.
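One reading of the parameter figure is that only the new planning-token embeddings are trained while the base model stays frozen. The sketch below illustrates that mechanism in plain PyTorch with toy sizes; the paper's actual integration may differ, and with a realistic ~50k vocabulary the trainable fraction shrinks accordingly.

```python
import torch
import torch.nn as nn

vocab_size, n_plan_tokens, dim = 1_000, 4, 64

# Frozen table standing in for a pretrained LM's input embeddings.
base = nn.Embedding(vocab_size, dim)
base.weight.requires_grad = False

# Only the handful of planning-token embeddings are trainable.
plan = nn.Embedding(n_plan_tokens, dim)

def embed(ids: torch.Tensor) -> torch.Tensor:
    """Route ids >= vocab_size to the trainable planning-token table."""
    is_plan = ids >= vocab_size
    out = base(ids.clamp(max=vocab_size - 1))
    out[is_plan] = plan(ids[is_plan] - vocab_size)
    return out

# A chain-of-thought step prefixed with planning token 0 (id 1000).
ids = torch.tensor([1_000, 11, 42, 7])
print(embed(ids).shape)  # torch.Size([4, 64])

trainable = plan.weight.numel()
print(f"trainable fraction: {trainable / (trainable + base.weight.numel()):.4f}")
```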
arXiv Detail & Related papers (2023-10-09T13:29:37Z)
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
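As a toy illustration of structures-as-actions, the snippet below linearizes labeled spans into an action sequence that a language model could emit left to right; the action inventory is made up for the example.

```python
# Linearize span annotations into OPEN/SHIFT/CLOSE actions (illustrative).
tokens = ["Barack", "Obama", "visited", "Paris"]
spans = [(0, 1, "PER"), (3, 3, "LOC")]  # (start, end, label), inclusive

actions = []
for i, tok in enumerate(tokens):
    for start, _, label in spans:
        if start == i:
            actions.append(f"OPEN-{label}")
    actions.append(f"SHIFT({tok})")
    for _, end, _ in spans:
        if end == i:
            actions.append("CLOSE")

print(actions)
# ['OPEN-PER', 'SHIFT(Barack)', 'SHIFT(Obama)', 'CLOSE',
#  'SHIFT(visited)', 'OPEN-LOC', 'SHIFT(Paris)', 'CLOSE']
```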
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- Octet: Online Catalog Taxonomy Enrichment with Self-Supervision [67.26804972901952]
We present a self-supervised end-to-end framework, Octet, for Online Catalog Taxonomy EnrichmenT.
We propose to train a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure.
In the open-world evaluation, Octet enriches an online catalog in production to twice its original size.
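As a toy illustration of the term-extraction half, the snippet below decodes BIO tags into candidate catalog terms. In Octet the tags would come from the trained sequence-labeling model, and the GNN component (not sketched here) scores where each term attaches in the taxonomy.

```python
# Hand-written BIO tags stand in for the sequence-labeling model's output.
tokens = ["wireless", "bluetooth", "headphones", "with", "charging", "case"]
tags   = ["B",        "I",         "I",          "O",    "B",        "I"]

terms, current = [], []
for tok, tag in zip(tokens, tags):
    if tag == "B":                # a new term starts; flush any open one
        if current:
            terms.append(" ".join(current))
        current = [tok]
    elif tag == "I" and current:  # continue the open term
        current.append(tok)
    else:                         # "O" closes any open term
        if current:
            terms.append(" ".join(current))
        current = []
if current:
    terms.append(" ".join(current))

print(terms)  # ['wireless bluetooth headphones', 'charging case']
```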
arXiv Detail & Related papers (2020-06-18T04:53:07Z)