Probing Pretrained Language Models with Hierarchy Properties
- URL: http://arxiv.org/abs/2312.09670v1
- Date: Fri, 15 Dec 2023 10:31:36 GMT
- Title: Probing Pretrained Language Models with Hierarchy Properties
- Authors: Jesús Lovón-Melgarejo, Jose G. Moreno, Romaric Besançon,
Olivier Ferret, Lynda Tamine
- Abstract summary: We propose a task-agnostic method for evaluating to what extent PLMs can capture complex taxonomy relations.
We show that the proposed properties can be injected into PLMs to improve their understanding of hierarchy.
- Score: 3.9694958595022376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since Pretrained Language Models (PLMs) are the cornerstone of the most
recent Information Retrieval (IR) models, the way they encode semantic
knowledge is particularly important. However, little attention has been given
to studying the PLMs' capability to capture hierarchical semantic knowledge.
Traditionally, evaluating such knowledge encoded in PLMs relies on their
performance on a task-dependent evaluation approach based on proxy tasks, such
as hypernymy detection. Unfortunately, this approach potentially ignores other
implicit and complex taxonomic relations. In this work, we propose a
task-agnostic evaluation method able to evaluate to what extent PLMs can
capture complex taxonomy relations, such as ancestors and siblings. The
evaluation is based on intrinsic properties that capture the hierarchical
nature of taxonomies. Our experimental evaluation shows that the
lexico-semantic knowledge implicitly encoded in PLMs does not always capture
hierarchical relations. We further demonstrate that the proposed properties can
be injected into PLMs to improve their understanding of hierarchy. Through
evaluations on taxonomy reconstruction, hypernym discovery and reading
comprehension tasks, we show that the knowledge about hierarchy is moderately
but not systematically transferable across tasks.
Related papers
- Dynamic Evaluation of Large Language Models by Meta Probing Agents [44.20074234421295]
We propose meta probing agents (MPA) to evaluate large language models (LLMs).
MPA is the key component of DyVal 2, which naturally extends the previous DyVal.
MPA designs the probing and judging agents to automatically transform an original evaluation problem into a new one following psychometric theory.
arXiv Detail & Related papers (2024-02-21T06:46:34Z)
- Tree-Based Hard Attention with Self-Motivation for Large Language Models [7.2677650379517775]
Large language models (LLMs) excel at understanding and generating plain text.
They are not specifically tailored to handle hierarchical text structures.
We propose a novel framework called Tree-Based Hard Attention with Self-Motivation for Large Language Models.
arXiv Detail & Related papers (2024-02-14T00:40:51Z)
- Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution [48.86322922826514]
This paper defines a new task of Knowledge-aware Language Model Attribution (KaLMA).
First, we extend attribution source from unstructured texts to Knowledge Graph (KG), whose rich structures benefit both the attribution performance and working scenarios.
Second, we propose a new "Conscious Incompetence" setting considering the incomplete knowledge repository.
Third, we propose a comprehensive automatic evaluation metric encompassing text quality, citation quality, and text citation alignment.
arXiv Detail & Related papers (2023-10-09T11:45:59Z)
- Prompting Language Models for Linguistic Structure [73.11488464916668]
We present a structured prompting approach for linguistic structured prediction tasks.
We evaluate this approach on part-of-speech tagging, named entity recognition, and sentence chunking.
We find that while PLMs contain significant prior knowledge of task labels due to task leakage into the pretraining corpus, structured prompting can also retrieve linguistic structure with arbitrary labels.
arXiv Detail & Related papers (2022-11-15T01:13:39Z)
- Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction [57.854498238624366]
We propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP) for data-efficient knowledge graph construction.
RAP can dynamically leverage schema and knowledge inherited from human-annotated and weak-supervised data as a prompt for each sample.
arXiv Detail & Related papers (2022-10-19T16:40:28Z)
- Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to incorporate the current pretrained language models with a hierarchical decoder network.
By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks.
We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z)
- Don't Judge a Language Model by Its Last Layer: Contrastive Learning with Layer-Wise Attention Pooling [6.501126898523172]
Recent pre-trained language models (PLMs) achieved great success on many natural language processing tasks through learning linguistic features and contextualized sentence representation.
This paper introduces the attention-based pooling strategy, which enables the model to preserve layer-wise signals captured in each layer and learn digested linguistic features for downstream tasks.
arXiv Detail & Related papers (2022-09-13T13:09:49Z)
- Supporting Vision-Language Model Inference with Confounder-pruning Knowledge Prompt [71.77504700496004]
Vision-language models are pre-trained by aligning image-text pairs in a common space to deal with open-set visual concepts.
To boost the transferability of the pre-trained models, recent works adopt fixed or learnable prompts.
However, how and what prompts can improve inference performance remains unclear.
arXiv Detail & Related papers (2022-05-23T07:51:15Z)
- Provable Hierarchy-Based Meta-Reinforcement Learning [50.17896588738377]
We analyze HRL in the meta-RL setting, where a learner learns latent hierarchical structure during meta-training for use in a downstream task.
We provide "diversity conditions" which, together with a tractable optimism-based algorithm, guarantee sample-efficient recovery of this natural hierarchy.
Our bounds incorporate common notions in HRL literature such as temporal and state/action abstractions, suggesting that our setting and analysis capture important features of HRL in practice.
arXiv Detail & Related papers (2021-10-18T17:56:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.