Evaluating Prompt-based Question Answering for Object Prediction in the
Open Research Knowledge Graph
- URL: http://arxiv.org/abs/2305.12900v2
- Date: Sun, 11 Jun 2023 08:23:23 GMT
- Title: Evaluating Prompt-based Question Answering for Object Prediction in the
Open Research Knowledge Graph
- Authors: Jennifer D'Souza, Moussab Hrou and Sören Auer
- Abstract summary: This work reports results on adopting prompt-based training of transformers for scholarly knowledge graph object prediction.
It deviates from the other works proposing entity and relation extraction pipelines for predicting objects of a scholarly knowledge graph.
We find that (i) as expected, transformer models tested out-of-the-box underperform on a new domain of data, and (ii) prompt-based training of the models achieves performance boosts of up to 40% in a relaxed evaluation setting.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: There have been many recent investigations into prompt-based training of
transformer language models for new text genres in low-resource settings. The
prompt-based training approach has been found to be effective in generalizing
pre-trained or fine-tuned models for transfer to resource-scarce settings. This
work, for the first time, reports results on adopting prompt-based training of
transformers for scholarly knowledge graph object prediction. The work
is unique in the following two main aspects. 1) It deviates from the other
works proposing entity and relation extraction pipelines for predicting objects
of a scholarly knowledge graph. 2) While other works have tested the method on
text genres relatively close to the general knowledge domain, we test the
method for a significantly different domain, i.e. scholarly knowledge, in turn
testing the linguistic, probabilistic, and factual generalizability of these
large-scale transformer models. We find that (i) per expectations, transformer
models when tested out-of-the-box underperform on a new domain of data, (ii)
prompt-based training of the models achieves performance boosts of up to 40% in
a relaxed evaluation setting, and (iii) testing the models on a starkly
different domain, even with a clever training objective in a low-resource
setting, makes evident the domain knowledge capture gap, offering an
empirically verified incentive for investing more attention and resources in
the scholarly domain in the context of transformer models.
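As a rough illustration of the idea in the abstract, object prediction can be framed as question answering: a (subject, predicate) pair is turned into a question, and the model's answer is compared with the gold object. The sketch below is not the authors' implementation. The question template, the helper names, and the containment-based "relaxed" criterion are all assumptions made for illustration, since the abstract does not define the prompt format or the relaxed evaluation setting.

```python
import re


def build_question(subject: str, predicate: str) -> str:
    """Turn a (subject, predicate) pair into a question (hypothetical template)."""
    return f"What is the {predicate} of {subject}?"


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> bool:
    """Strict criterion: normalized strings must be identical."""
    return normalize(prediction) == normalize(gold)


def relaxed_match(prediction: str, gold: str) -> bool:
    """Assumed relaxed criterion: one normalized string contains the other."""
    p, g = normalize(prediction), normalize(gold)
    return bool(p) and bool(g) and (p in g or g in p)


if __name__ == "__main__":
    # Toy, made-up example; not taken from the paper's data.
    print(build_question("paper X", "research problem"))
    prediction, gold = "masked language modeling", "Masked language modeling (MLM)"
    print("exact:", exact_match(prediction, gold))
    print("relaxed:", relaxed_match(prediction, gold))
```

Under these assumptions, a prediction that captures only part of the gold object fails the strict check but passes the relaxed one, which is one way a relaxed setting could report higher scores than an exact-match setting.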
Related papers
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
(2023-10-26T17:59:46Z... see note)
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
(2023-10-26T17:59:46Z)
- In-Context Convergence of Transformers [63.04956160537308]
We study the learning dynamics of a one-layer transformer with softmax attention trained via gradient descent.
For data with imbalanced features, we show that the learning dynamics take a stage-wise convergence process.
(2023-10-08T17:55:33Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
(2022-10-06T00:33:01Z)
- QAGAN: Adversarial Approach To Learning Domain Invariant Language
Features [0.76146285961466]
We explore an adversarial training approach towards learning domain-invariant features.
We are able to achieve a 15.2% improvement in EM score and a 5.6% boost in F1 score on an out-of-domain validation dataset.
(2022-06-24T17:42:18Z)
- Transformer Uncertainty Estimation with Hierarchical Stochastic
Attention [8.95459272947319]
We propose a novel way to enable transformers to have the capability of uncertainty estimation.
This is achieved by learning a hierarchical self-attention that attends to values and a set of learnable centroids.
We empirically evaluate our model on two text classification tasks with both in-domain (ID) and out-of-domain (OOD) datasets.
(2021-12-27T16:43:31Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning
Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
(2021-10-09T09:02:45Z)
- Inserting Information Bottlenecks for Attribution in Transformers [46.77580577396633]
We apply information bottlenecks to analyze the attribution of each feature for prediction on a black-box model.
We show the effectiveness of our method in terms of attribution and the ability to provide insight into how information flows through layers.
(2020-12-27T00:35:43Z)
- Transformer Based Multi-Source Domain Adaptation [53.24606510691877]
In practical machine learning settings, the data on which a model must make predictions often come from a different distribution than the data it was trained on.
Here, we investigate the problem of unsupervised multi-source domain adaptation, where a model is trained on labelled data from multiple source domains and must make predictions on a domain for which no labelled data has been seen.
We show that the predictions of large pretrained transformer based domain experts are highly homogenous, making it challenging to learn effective functions for mixing their predictions.
(2020-09-16T16:56:23Z)
- Gradient-Based Adversarial Training on Transformer Networks for
Detecting Check-Worthy Factual Claims [3.7543966923106438]
We introduce the first adversarially-regularized, transformer-based claim spotter model.
We obtain a 4.70 point F1-score improvement over current state-of-the-art models.
We propose a method to apply adversarial training to transformer models.
(2020-02-18T16:51:05Z)