Prompting as Probing: Using Language Models for Knowledge Base Construction
- URL: http://arxiv.org/abs/2208.11057v3
- Date: Mon, 19 Jun 2023 15:06:46 GMT
- Title: Prompting as Probing: Using Language Models for Knowledge Base Construction
- Authors: Dimitrios Alivanistos, Selene Báez Santamaría, Michael Cochez, Jan-Christoph Kalo, Emile van Krieken, Thiviyan Thanapalasingam
- Abstract summary: We present ProP (Prompting as Probing), which utilizes GPT-3, a large Language Model originally proposed by OpenAI in 2020.
ProP implements a multi-step approach that combines a variety of prompting techniques to achieve this.
Our evaluation study indicates that these proposed techniques can substantially enhance the quality of the final predictions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language Models (LMs) have proven to be useful in various downstream
applications, such as summarisation, translation, question answering and text
classification. LMs are becoming increasingly important tools in Artificial
Intelligence, because of the vast quantity of information they can store. In
this work, we present ProP (Prompting as Probing), which utilizes GPT-3, a
large Language Model originally proposed by OpenAI in 2020, to perform the task
of Knowledge Base Construction (KBC). ProP implements a multi-step approach
that combines a variety of prompting techniques to achieve this. Our results
show that manual prompt curation is essential, that the LM must be encouraged
to give answer sets of variable lengths, in particular including empty answer
sets, that true/false questions are a useful device to increase precision on
suggestions generated by the LM, that the size of the LM is a crucial factor,
and that a dictionary of entity aliases improves the LM score. Our evaluation
study indicates that these proposed techniques can substantially enhance the
quality of the final predictions: ProP won track 2 of the LM-KBC competition,
outperforming the baseline by 36.4 percentage points. Our implementation is
available on https://github.com/HEmile/iswc-challenge.
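The recipe in the abstract (curated prompts, variable-length answer sets that may be empty, true/false verification of candidates, and alias normalisation) maps naturally onto a short pipeline. Below is a minimal sketch in Python; the `complete` function stands in for any text-completion API, and the prompts and helpers are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of a ProP-style KBC pipeline. `complete` is a stand-in
# for any text-completion API; prompts and helpers are illustrative only.
from typing import Callable, Dict, List

def query_objects(complete: Callable[[str], str],
                  subject: str, relation: str,
                  aliases: Dict[str, str]) -> List[str]:
    # 1) Manually curated prompt; the model is told an empty answer is valid.
    prompt = (
        f"List all objects for the relation '{relation}' of '{subject}'.\n"
        "Answer with a comma-separated list, or 'NONE' if there are none.\n"
        "Answer:"
    )
    raw = complete(prompt).strip()
    # 2) Allow variable-length (including empty) answer sets.
    if not raw or raw.upper() == "NONE":
        return []
    candidates = [c.strip() for c in raw.split(",") if c.strip()]
    # 3) True/false follow-up question per candidate to raise precision.
    verified = []
    for cand in candidates:
        check = complete(
            f"True or false: '{cand}' is a valid {relation} of '{subject}'.\n"
            "Answer:"
        ).strip().lower()
        if check.startswith("true"):
            verified.append(cand)
    # 4) Map surface forms to canonical names via an alias dictionary.
    return [aliases.get(v.lower(), v) for v in verified]
```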
Related papers
- Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph [83.90988015005934]
Uncertainty quantification (UQ) is a critical component of machine learning (ML) applications.
We introduce a novel benchmark that implements a collection of state-of-the-art UQ baselines.
We conduct a large-scale empirical investigation of UQ and normalization techniques across nine tasks, and identify the most promising approaches.
arXiv Detail & Related papers (2024-06-21T20:06:31Z)
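For a concrete sense of what such UQ baselines compute, two standard white-box measures can be derived from token log-probabilities alone. A minimal numpy sketch, assuming per-token log-probs for one generated sequence (this is an illustration, not LM-Polygraph's actual interface):

```python
import numpy as np

def sequence_uncertainty(token_logprobs: np.ndarray) -> dict:
    """Two standard white-box UQ baselines from the per-token log-probs
    of one generated sequence (shape: [seq_len])."""
    # Mean token log-probability: higher means the model was more confident.
    mean_lp = float(token_logprobs.mean())
    # Perplexity of the generated sequence: lower means more confident.
    ppl = float(np.exp(-mean_lp))
    return {"mean_logprob": mean_lp, "perplexity": ppl}
```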
- CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation [76.31621715032558]
Grounded generation aims to equip language models (LMs) with the ability to produce more credible and accountable responses.
We introduce CaLM, a novel verification framework.
Our framework empowers smaller LMs, which rely less on parametric memory, to validate the output of larger LMs.
arXiv Detail & Related papers (2024-06-08T06:04:55Z)
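The contrast in CaLM can be sketched as follows: a small LM answers from the cited sources only, and agreement with the large LM's answer counts as verification. This is a hypothetical sketch; `small_lm` and `agree` are assumed helpers, not the paper's interface:

```python
# Sketch of the contrast idea: a small verifier LM answers from the cited
# sources alone; agreement with the large LM's answer counts as support.
def calm_verify(small_lm, agree, large_answer: str, question: str,
                cited_sources: list) -> bool:
    context = "\n".join(cited_sources)
    verifier_answer = small_lm(
        f"Using ONLY the sources below, answer the question.\n"
        f"Sources:\n{context}\nQuestion: {question}\nAnswer:"
    )
    # If the source-grounded small LM reproduces the answer, keep it;
    # otherwise flag the large LM's output for revision.
    return agree(verifier_answer, large_answer)
```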
- Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought [51.240387516059535]
We introduce a novel framework, LM-Guided CoT, that leverages a lightweight (i.e., 1B) language model (LM) for guiding a black-box large (i.e., >10B) LM in reasoning tasks.
We optimize the model through 1) knowledge distillation and 2) reinforcement learning from rationale-oriented and task-oriented reward signals.
arXiv Detail & Related papers (2024-04-04T12:46:37Z)
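The guidance step reduces to a two-call pattern: the lightweight LM drafts the rationale and the black-box LM only consumes it. A hypothetical sketch, with `small_lm` and `large_lm` standing in for the two models:

```python
# Sketch of LM-Guided CoT: a small LM drafts the rationale, the black-box
# large LM uses it to answer. Both functions are assumed completion APIs.
def lm_guided_cot(small_lm, large_lm, question: str) -> str:
    rationale = small_lm(
        f"Question: {question}\nThink step by step and write a rationale:"
    )
    return large_lm(
        f"Question: {question}\nRationale: {rationale}\n"
        "Using the rationale, give the final answer:"
    )
```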
- DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines [41.779902953557425]
Chaining language model (LM) calls as composable modules is fueling a new way of programming.
We introduce LM Assertions, a construct for expressing computational constraints that LMs should satisfy.
We present new strategies that allow DSPy to compile programs with LM Assertions into more reliable and accurate systems.
arXiv Detail & Related papers (2023-12-20T19:13:26Z)
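The underlying construct can be illustrated without the DSPy API itself: declare a predicate the LM output must satisfy, and retry with the violation fed back to the model. A minimal sketch, assuming a generic `lm` completion function:

```python
# The assertion construct in miniature (not the DSPy API): check a
# predicate on the LM output and retry with the violation as feedback.
def assert_and_retry(lm, prompt: str, constraint, msg: str, max_retries=2):
    feedback = ""
    for _ in range(max_retries + 1):
        output = lm(prompt + feedback)
        if constraint(output):          # e.g. lambda o: len(o) < 280
            return output
        feedback = f"\nPrevious answer violated: {msg}. Please fix it."
    raise ValueError(f"LM could not satisfy constraint: {msg}")
```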
- SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs).
We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer.
We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)
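The division of labour here is that the LLM emits a declarative specification and an off-the-shelf solver derives the answer. A toy version using the real z3-solver Python bindings, with hard-coded constraints standing in for what the LLM would generate:

```python
# Toy SatLM-style pipeline: the LLM would emit the declarative constraints;
# here they are hard-coded for the word problem "Alice has twice as many
# apples as Bob; together they have 12. How many does Alice have?"
from z3 import Int, Solver, sat

alice, bob = Int("alice"), Int("bob")
solver = Solver()
solver.add(alice == 2 * bob)    # declarative spec, not an imperative program
solver.add(alice + bob == 12)

if solver.check() == sat:       # the solver, not the LM, derives the answer
    print(solver.model()[alice])    # -> 8
```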
- Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments [11.496084599325807]
Pangu is a generic framework for grounded language understanding.
It capitalizes on the discriminative ability of LMs instead of their generative ability.
Pangu enables, for the first time, effective few-shot in-context learning for KBQA with large LMs such as Codex.
arXiv Detail & Related papers (2022-12-19T18:55:21Z)
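The discriminative recipe fits in a few lines: a symbolic environment enumerates valid candidate plans, and the LM merely scores them rather than generating them. `lm_loglikelihood` is an assumed helper returning log p(text | prompt), not Pangu's actual API:

```python
# Sketch of discrimination over generation: candidates come from a
# symbolic agent; the LM only ranks them by likelihood.
def pick_plan(lm_loglikelihood, question: str, candidate_plans: list) -> str:
    return max(
        candidate_plans,
        key=lambda plan: lm_loglikelihood(prompt=question, text=plan),
    )
```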
- Evidence > Intuition: Transferability Estimation for Encoder Selection [16.490047604583882]
We generate quantitative evidence to predict which LM will perform best on a target task without having to fine-tune all candidates.
We adopt the state-of-the-art Logarithm Maximum of Evidence (LogME) measure from Computer Vision (CV) and find that it positively correlates with final LM performance in 94% of setups.
arXiv Detail & Related papers (2022-10-20T13:25:21Z)
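LogME scores a candidate encoder by the log marginal evidence of a Bayesian linear model fitted on its frozen features. Below is a simplified numpy rendering of the fixed-point scheme from You et al. (2021); treat it as a sketch rather than the reference implementation:

```python
import numpy as np

def logme(features: np.ndarray, labels: np.ndarray, n_iter: int = 20) -> float:
    """Simplified LogME: per-sample log evidence of a Bayesian linear model
    y = w^T f + eps, averaged over one-hot class targets."""
    n, d = features.shape
    classes = np.unique(labels)
    onehot = (labels[:, None] == classes[None, :]).astype(float)
    # Eigenvalues of F^T F, shared across all targets (zero-padded if n < d).
    sigma = np.linalg.svd(features, compute_uv=False)
    evals = np.zeros(d)
    evals[: sigma.size] = sigma ** 2
    gram = features.T @ features
    scores = []
    for k in range(onehot.shape[1]):
        y = onehot[:, k]
        alpha, beta = 1.0, 1.0
        for _ in range(n_iter):
            # Posterior mean of the weights under the current (alpha, beta).
            m = beta * np.linalg.solve(alpha * np.eye(d) + beta * gram,
                                       features.T @ y)
            gamma = np.sum(beta * evals / (alpha + beta * evals))
            alpha = gamma / (m @ m + 1e-12)
            beta = (n - gamma) / (np.sum((features @ m - y) ** 2) + 1e-12)
        resid = np.sum((features @ m - y) ** 2)
        evidence = (d / 2 * np.log(alpha) + n / 2 * np.log(beta)
                    - n / 2 * np.log(2 * np.pi)
                    - beta / 2 * resid - alpha / 2 * (m @ m)
                    - 0.5 * np.sum(np.log(alpha + beta * evals)))
        scores.append(evidence / n)
    return float(np.mean(scores))
```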
- Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis [120.9545643534454]
It is crucial for a PLM-based prediction pipeline to minimize calibration error, especially in safety-critical applications.
There are various considerations behind the pipeline: (1) the choice and (2) the size of PLM, (3) the choice of uncertainty quantifier, (4) the choice of fine-tuning loss, and many more.
In response, we recommend the following: (1) use ELECTRA for PLM encoding, (2) use larger PLMs if possible, (3) use Temp Scaling as the uncertainty quantifier, and (4) use Focal Loss for fine-tuning.
arXiv Detail & Related papers (2022-10-10T14:16:01Z)
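Of these recommendations, temperature scaling is the easiest to make concrete: fit a single scalar T on held-out logits by minimizing NLL, then divide test logits by T before the softmax. A small sketch with numpy/scipy (assumed shapes: logits [n, k], integer labels [n]):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_temperature(val_logits: np.ndarray, val_labels: np.ndarray) -> float:
    """Learn one scalar temperature T by minimizing validation NLL;
    at test time, calibrated probs = softmax(logits / T)."""
    def nll(t: float) -> float:
        z = val_logits / t
        z = z - z.max(axis=1, keepdims=True)            # stable log-softmax
        logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -logp[np.arange(len(val_labels)), val_labels].mean()
    return minimize_scalar(nll, bounds=(0.05, 10.0), method="bounded").x
```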
- Sort by Structure: Language Model Ranking as Dependency Probing [25.723591566201343]
Making an informed choice of pre-trained language model (LM) is critical for performance, yet environmentally costly, and as such widely underexplored.
We propose probing to rank LMs, specifically for parsing dependencies in a given language, by measuring the degree to which labeled trees are recoverable from an LM's contextualized embeddings.
Across 46 typologically and architecturally diverse LM-language pairs, our approach predicts the best LM choice in 79% of settings using orders of magnitude less compute than training a full parser.
arXiv Detail & Related papers (2022-06-10T08:10:29Z)
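One operational reading of "trees recoverable from contextualized embeddings": build a pairwise-distance graph over a sentence's token embeddings, take its minimum spanning tree, and score overlap with the gold dependency edges (UUAS). A simplified sketch, not the paper's actual probe:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def uuas(embeddings: np.ndarray, gold_edges: list) -> float:
    """Unlabeled undirected attachment score of the MST over pairwise
    embedding distances vs. gold dependency edges [(head, dep), ...]."""
    dist = squareform(pdist(embeddings))        # (n, n) distance matrix
    mst = minimum_spanning_tree(dist).tocoo()
    pred = {frozenset((int(i), int(j))) for i, j in zip(mst.row, mst.col)}
    gold = {frozenset(e) for e in gold_edges}
    return len(pred & gold) / len(gold)
```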
- Language Model Prior for Low-Resource Neural Machine Translation [85.55729693003829]
We propose a novel approach to incorporate an LM as a prior in a neural translation model (TM).
We add a regularization term, which pushes the output distributions of the TM to be probable under the LM prior.
Results on two low-resource machine translation datasets show clear improvements even with limited monolingual data.
arXiv Detail & Related papers (2020-04-30T16:29:56Z)
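The regularization term can be sketched in PyTorch as a distillation-style KL between the frozen LM prior and the TM's output distribution, added to the usual translation cross-entropy. The weight `lam` and temperature `tau` are assumed hyperparameters, not the paper's values:

```python
import torch
import torch.nn.functional as F

def tm_loss_with_lm_prior(tm_logits, lm_logits, targets, lam=0.5, tau=2.0):
    """Translation cross-entropy plus a KL term pulling the TM's output
    distribution toward a frozen LM prior (distillation-style sketch).
    tm_logits/lm_logits: [batch, vocab]; targets: [batch] token ids."""
    ce = F.cross_entropy(tm_logits, targets)
    p_lm = F.softmax(lm_logits / tau, dim=-1)             # frozen LM prior
    log_p_tm = F.log_softmax(tm_logits / tau, dim=-1)
    kl = F.kl_div(log_p_tm, p_lm, reduction="batchmean")  # KL(p_lm || p_tm)
    return ce + lam * kl
```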