Prompting or Fine-tuning? A Comparative Study of Large Language Models
for Taxonomy Construction
- URL: http://arxiv.org/abs/2309.01715v1
- Date: Mon, 4 Sep 2023 16:53:17 GMT
- Title: Prompting or Fine-tuning? A Comparative Study of Large Language Models
for Taxonomy Construction
- Authors: Boqi Chen, Fandi Yi, D\'aniel Varr\'o
- Abstract summary: We present a general framework for taxonomy construction that takes into account structural constraints.
We compare the prompting and fine-tuning approaches performed on a hypernym taxonomy and a novel computer science taxonomy dataset.
- Score: 0.8670827427401335
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Taxonomies represent hierarchical relations between entities, frequently
applied in various software modeling and natural language processing (NLP)
activities. They are typically subject to a set of structural constraints
restricting their content. However, manual taxonomy construction can be
time-consuming, incomplete, and costly to maintain. Recent studies of large
language models (LLMs) have demonstrated that appropriate user inputs (called
prompting) can effectively guide LLMs, such as GPT-3, in diverse NLP tasks
without explicit (re-)training. However, existing approaches for automated
taxonomy construction typically involve fine-tuning a language model by
adjusting model parameters. In this paper, we present a general framework for
taxonomy construction that takes into account structural constraints. We
subsequently conduct a systematic comparison between the prompting and
fine-tuning approaches performed on a hypernym taxonomy and a novel computer
science taxonomy dataset. Our result reveals the following: (1) Even without
explicit training on the dataset, the prompting approach outperforms
fine-tuning-based approaches. Moreover, the performance gap between prompting
and fine-tuning widens when the training dataset is small. However, (2)
taxonomies generated by the fine-tuning approach can be easily post-processed
to satisfy all the constraints, whereas handling violations of the taxonomies
produced by the prompting approach can be challenging. These evaluation
findings provide guidance on selecting the appropriate method for taxonomy
construction and highlight potential enhancements for both approaches.
Related papers
- The Art of Saying No: Contextual Noncompliance in Language Models [123.383993700586]
We introduce a comprehensive taxonomy of contextual noncompliance describing when and how models should not comply with user requests.
Our taxonomy spans a wide range of categories including incomplete, unsupported, indeterminate, and humanizing requests.
To test noncompliance capabilities of language models, we use this taxonomy to develop a new evaluation suite of 1000 noncompliance prompts.
arXiv Detail & Related papers (2024-07-02T07:12:51Z) - Creating a Fine Grained Entity Type Taxonomy Using LLMs [0.0]
This study investigates the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy.
Our objective is to construct a comprehensive taxonomy, starting from a broad classification of entity types.
This classification is then progressively refined through iterative prompting techniques, leveraging GPT-4's internal knowledge base.
arXiv Detail & Related papers (2024-02-19T21:32:19Z) - Chain-of-Layer: Iteratively Prompting Large Language Models for Taxonomy Induction from Limited Examples [34.88498567698853]
Chain-of-Layer is an incontext learning framework designed to induct from a given set of entities.
We show that Chain-of-Layer achieves state-of-the-art performance on four real-world benchmarks.
arXiv Detail & Related papers (2024-02-12T03:05:54Z) - Contextualization Distillation from Large Language Model for Knowledge
Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z) - Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z) - Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph
Construction [57.854498238624366]
We propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP) for data-efficient knowledge graph construction.
RAP can dynamically leverage schema and knowledge inherited from human-annotated and weak-supervised data as a prompt for each sample.
arXiv Detail & Related papers (2022-10-19T16:40:28Z) - Bringing motion taxonomies to continuous domains via GPLVM on hyperbolic manifolds [8.385386712928785]
Human motion serves as high-level hierarchical abstractions that classify how humans move and interact with their environment.
We propose to model taxonomy data via hyperbolic embeddings that capture the associated hierarchical structure.
We show that our model properly encodes unseen data from existing or new taxonomy categories, and outperforms its Euclidean and VAE-based counterparts.
arXiv Detail & Related papers (2022-10-04T15:19:24Z) - Convex Polytope Modelling for Unsupervised Derivation of Semantic
Structure for Data-efficient Natural Language Understanding [31.888489552069146]
A Convex-Polytopic-Model-based framework shows great potential in automatically extracting semantic patterns by exploiting the raw dialog corpus.
We show that this framework can exploit semantic-frame-related features in the corpus, reveal the underlying semantic structure of the utterances, and boost the performance of the state-of-the-art NLU model with minimal supervision.
arXiv Detail & Related papers (2022-01-25T19:12:44Z) - SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z) - Octet: Online Catalog Taxonomy Enrichment with Self-Supervision [67.26804972901952]
We present a self-supervised end-to-end framework, Octet for Online Catalog EnrichmenT.
We propose to train a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure.
Octet enriches an online catalog in production to 2 times larger in the open-world evaluation.
arXiv Detail & Related papers (2020-06-18T04:53:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.