Exploring Prompt-Based Methods for Zero-Shot Hypernym Prediction with Large Language Models
- URL: http://arxiv.org/abs/2401.04515v1
- Date: Tue, 9 Jan 2024 12:13:55 GMT
- Title: Exploring Prompt-Based Methods for Zero-Shot Hypernym Prediction with Large Language Models
- Authors: Mikhail Tikhomirov and Natalia Loukachevitch
- Abstract summary: This article investigates a zero-shot approach to hypernymy prediction using large language models (LLMs).
Experiments demonstrate a strong correlation between the effectiveness of language model prompts and classic patterns.
We also explore prompts for predicting co-hyponyms and improving hypernymy predictions by augmenting prompts with additional information.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This article investigates a zero-shot approach to hypernymy prediction using
large language models (LLMs). The study employs a method based on text
probability calculation, applying it to various generated prompts. The
experiments demonstrate a strong correlation between the effectiveness of
language model prompts and classic patterns, indicating that preliminary prompt
selection can be carried out using smaller models before moving to larger ones.
We also explore prompts for predicting co-hyponyms and improving hypernymy
predictions by augmenting prompts with additional information through
automatically identified co-hyponyms. An iterative approach is developed for
predicting higher-level concepts, which further improves prediction quality on
the BLESS dataset (MAP = 0.8).
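
To make the scoring step concrete, here is a minimal sketch of prompt-based hypernym ranking with a causal LM. It is an illustration, not the authors' code: the GPT-2 model, the "X is a type of Y" template, and the candidate lists are all assumptions, and the paper's actual prompts and models may differ.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Any causal LM works for this sketch; the paper's models may differ.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def sequence_log_prob(text: str) -> float:
        """Total log-probability the LM assigns to `text`."""
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            out = model(ids, labels=ids)
        # out.loss is the mean negative log-likelihood per predicted token.
        return -out.loss.item() * (ids.shape[1] - 1)

    def rank_hypernyms(hyponym, candidates, cohyponyms=()):
        """Score candidate hypernyms with a Hearst-style prompt; optionally
        prepend co-hyponyms as extra context, as the abstract describes."""
        prefix = (", ".join(cohyponyms) + ", ") if cohyponyms else ""
        scored = [(c, sequence_log_prob(f"{prefix}{hyponym} is a type of {c}."))
                  for c in candidates]
        return sorted(scored, key=lambda x: x[1], reverse=True)

    print(rank_hypernyms("cat", ["animal", "vehicle", "fruit"]))
    print(rank_hypernyms("cat", ["animal", "vehicle", "fruit"],
                         cohyponyms=["dog", "horse"]))

The iterative variant described in the abstract would then re-run the ranking with the predicted hypernym as the new query term, climbing toward higher-level concepts.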
Related papers
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.
We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.
Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings.
arXiv Detail & Related papers (2024-10-24T17:56:08Z) - Improving Explainability of Softmax Classifiers Using a Prototype-Based Joint Embedding Method [0.0]
We propose a prototype-based approach for improving explainability of softmax classifiers.
By modifying the model architecture and training, we gain the ability to sample prototypical examples that contributed to the prediction.
We obtain an uncertainty metric that detects out-of-distribution data better than softmax confidence.
arXiv Detail & Related papers (2024-07-02T13:59:09Z) - Prompt Mining for Language-based Human Mobility Forecasting [10.325794804095889]
We propose a novel framework for prompt mining in language-based mobility forecasting.
The framework includes a prompt generation stage based on the information entropy of prompts and a prompt refinement stage to integrate mechanisms such as chain-of-thought reasoning.
arXiv Detail & Related papers (2024-03-06T08:43:30Z) - A Lightweight Generative Model for Interpretable Subject-level Prediction [0.07989135005592125]
We propose a technique for single-subject prediction that is inherently interpretable.
Experiments demonstrate that the resulting model can be efficiently inverted to make accurate subject-level predictions.
arXiv Detail & Related papers (2023-06-19T18:20:29Z) - Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters.
EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
arXiv Detail & Related papers (2023-04-17T10:59:57Z) - ASPEST: Bridging the Gap Between Active Learning and Selective
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z) - Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats.
We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or given attributes.
We propose a novel greedy search strategy to identify a near-optimal prompt that improves the performance of in-context learning.
arXiv Detail & Related papers (2023-03-23T12:28:25Z) - Better Language Model with Hypernym Class Prediction [101.8517004687825]
Class-based language models (LMs) have been long devised to address context sparsity in $n$-gram LMs.
In this study, we revisit this approach in the context of neural LMs.
arXiv Detail & Related papers (2022-03-21T01:16:44Z) - HYPER: Learned Hybrid Trajectory Prediction via Factored Inference and
- HYPER: Learned Hybrid Trajectory Prediction via Factored Inference and Adaptive Sampling [27.194900145235007]
We introduce HYPER, a general and expressive hybrid prediction framework.
By modeling traffic agents as a hybrid discrete-continuous system, our approach is capable of predicting discrete intent changes over time.
We train and validate our model on the Argoverse dataset, and demonstrate its effectiveness through comprehensive ablation studies and comparisons with state-of-the-art models.
arXiv Detail & Related papers (2021-10-05T20:20:10Z) - Efficient Nearest Neighbor Language Models [114.40866461741795]
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore.
We show how to achieve up to a 6x speed-up in inference while retaining comparable performance.
arXiv Detail & Related papers (2021-09-09T12:32:28Z) - Prediction-Centric Learning of Independent Cascade Dynamics from Partial
- Prediction-Centric Learning of Independent Cascade Dynamics from Partial Observations [13.680949377743392]
We address the problem of learning a spreading model such that the predictions it generates are accurate.
We introduce a computationally efficient algorithm, based on a scalable dynamic message-passing approach.
We show that tractable inference from the learned model generates a better prediction of marginal probabilities compared to the original model.
arXiv Detail & Related papers (2020-07-13T17:58:21Z)