Prompting Language Models for Linguistic Structure
- URL: http://arxiv.org/abs/2211.07830v2
- Date: Sun, 21 May 2023 01:52:45 GMT
- Title: Prompting Language Models for Linguistic Structure
- Authors: Terra Blevins and Hila Gonen and Luke Zettlemoyer
- Abstract summary: We present a structured prompting approach for linguistic structured prediction tasks.
We evaluate this approach on part-of-speech tagging, named entity recognition, and sentence chunking.
We find that while PLMs contain significant prior knowledge of task labels due to task leakage into the pretraining corpus, structured prompting can also retrieve linguistic structure with arbitrary labels.
- Score: 73.11488464916668
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although pretrained language models (PLMs) can be prompted to perform a wide
range of language tasks, it remains an open question how much this ability
comes from generalizable linguistic understanding versus surface-level lexical
patterns. To test this, we present a structured prompting approach for
linguistic structured prediction tasks, allowing us to perform zero- and
few-shot sequence tagging with autoregressive PLMs. We evaluate this approach
on part-of-speech tagging, named entity recognition, and sentence chunking,
demonstrating strong few-shot performance in all cases. We also find that while
PLMs contain significant prior knowledge of task labels due to task leakage
into the pretraining corpus, structured prompting can also retrieve linguistic
structure with arbitrary labels. These findings indicate that the in-context
learning ability and linguistic knowledge of PLMs generalizes beyond
memorization of their training data.
Related papers
- Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data [84.01401439030265]
Recent end-to-end speech language models (SLMs) have expanded upon the capabilities of large language models (LLMs)
We present a simple yet effective automatic process for creating speech-text pair data.
Our model demonstrates general capabilities for speech-related tasks without the need for speech instruction-tuning data.
arXiv Detail & Related papers (2024-09-30T07:01:21Z) - DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment [82.86363991170546]
We propose a Descriptive Speech-Text Alignment approach that leverages speech captioning to bridge the gap between speech and text modalities.
Our model demonstrates superior performance on the Dynamic-SUPERB benchmark, particularly in generalizing to unseen tasks.
These findings highlight the potential to reshape instruction-following SLMs by incorporating descriptive rich, speech captions.
arXiv Detail & Related papers (2024-06-27T03:52:35Z) - A Hard Nut to Crack: Idiom Detection with Conversational Large Language Models [2.02990044704201]
We introduce theatic language Test Suite IdioTS, a new dataset designed by language experts to assess the capabilities of Large Language Models (LLMs) to process figurative language at sentence level.
We propose a comprehensive evaluation methodology based on an idiom detection task, where LLMs are prompted with detecting an idiomatic expression in a given English sentence.
We present a thorough automatic and manual evaluation of the results and an extensive error analysis.
arXiv Detail & Related papers (2024-05-17T07:08:13Z) - Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z) - Interpretable Unified Language Checking [42.816372695828306]
We present an interpretable, unified, language checking (UniLC) method for both human and machine-generated language.
We find that LLMs can achieve high performance on a combination of fact-checking, stereotype detection, and hate speech detection tasks.
arXiv Detail & Related papers (2023-04-07T16:47:49Z) - WLASL-LEX: a Dataset for Recognising Phonological Properties in American
Sign Language [2.814213966364155]
We build a large-scale dataset of American Sign Language signs annotated with six different phonological properties.
We investigate whether data-driven end-to-end and feature-based approaches can be optimised to automatically recognise these properties.
arXiv Detail & Related papers (2022-03-11T17:21:24Z) - Skill Induction and Planning with Latent Language [94.55783888325165]
We formulate a generative model of action sequences in which goals generate sequences of high-level subtask descriptions.
We describe how to train this model using primarily unannotated demonstrations by parsing demonstrations into sequences of named high-level subtasks.
In trained models, the space of natural language commands indexes a library of skills; agents can use these skills to plan by generating high-level instruction sequences tailored to novel goals.
arXiv Detail & Related papers (2021-10-04T15:36:32Z) - Prompt-Learning for Fine-Grained Entity Typing [40.983849729537795]
We investigate the application of prompt-learning on fine-grained entity typing in fully supervised, few-shot and zero-shot scenarios.
We propose a self-supervised strategy that carries out distribution-level optimization in prompt-learning to automatically summarize the information of entity types.
arXiv Detail & Related papers (2021-08-24T09:39:35Z) - On the Importance of Word Order Information in Cross-lingual Sequence
Labeling [80.65425412067464]
Cross-lingual models that fit into the word order of the source language might fail to handle target languages.
We investigate whether making models insensitive to the word order of the source language can improve the adaptation performance in target languages.
arXiv Detail & Related papers (2020-01-30T03:35:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.