Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging?
- URL: http://arxiv.org/abs/2206.14969v1
- Date: Thu, 30 Jun 2022 01:43:05 GMT
- Title: Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging?
- Authors: Xiang Zhou, Shiyue Zhang, Mohit Bansal
- Abstract summary: We propose a Masked Part-of-Speech Model (MPoSM) to facilitate flexible dependency modeling.
MPoSM can model arbitrary tag dependency and perform POS induction through the objective of masked POS reconstruction.
We achieve competitive results on both the English Penn WSJ dataset and the universal treebank containing 10 diverse languages.
- Score: 94.68962249604749
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous Part-Of-Speech (POS) induction models usually make certain independence assumptions (e.g., Markov, unidirectional, local dependency) that do not hold in real languages. For example, subject-verb agreement can be both long-distance and bidirectional. To facilitate flexible dependency modeling,
we propose a Masked Part-of-Speech Model (MPoSM), inspired by the recent
success of Masked Language Models (MLM). MPoSM can model arbitrary tag
dependency and perform POS induction through the objective of masked POS
reconstruction. We achieve competitive results on both the English Penn WSJ
dataset as well as the universal treebank containing 10 diverse languages.
Though modeling the long-term dependency should ideally help this task, our
ablation study shows mixed trends in different languages. To better understand
this phenomenon, we design a novel synthetic experiment that can specifically
diagnose the model's ability to learn tag agreement. Surprisingly, we find that
even strong baselines fail to solve this problem consistently in a very
simplified setting: the agreement between adjacent words. Nonetheless, MPoSM
achieves overall better performance. Lastly, we conduct a detailed error
analysis to shed light on other remaining challenges. Our code is available at
https://github.com/owenzx/MPoSM
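For readers who want the gist of the objective in code, below is a minimal, illustrative PyTorch sketch of masked POS reconstruction: a local per-word tagger proposes soft tags, some tag positions are masked, and a bidirectional encoder must reconstruct the local tag posteriors at the masked positions. All module names, sizes, and the KL-style loss here are our assumptions for illustration, not the authors' implementation (see the repository above for that).
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMPoSM(nn.Module):
    # Toy masked POS model: a local tagger proposes soft tags, some tag
    # positions are masked, a bidirectional encoder reconstructs them.
    def __init__(self, vocab_size, num_tags=45, dim=64):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim)
        self.local_tagger = nn.Linear(dim, num_tags)    # local POS logits per word
        self.tag_emb = nn.Embedding(num_tags, dim)      # one embedding per induced tag
        self.mask_emb = nn.Parameter(torch.zeros(dim))  # [MASK] placeholder for tags
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.context = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(dim, num_tags)

    def forward(self, words, mask):
        # words: (B, T) token ids; mask: (B, T) bool, True where the tag is hidden
        h = self.word_emb(words)
        soft_tags = F.softmax(self.local_tagger(h), dim=-1)  # q(tag | word), local view
        tag_vecs = soft_tags @ self.tag_emb.weight           # soft mixture of tag embeddings
        tag_vecs = torch.where(mask.unsqueeze(-1),
                               self.mask_emb.expand_as(tag_vecs), tag_vecs)
        ctx = self.context(h + tag_vecs)                     # words plus (masked) tag info
        recon = F.log_softmax(self.out(ctx), dim=-1)
        # Reconstruct the local tag posterior at masked positions only.
        kl = F.kl_div(recon, soft_tags.detach(), reduction="none").sum(-1)
        return (kl * mask).sum() / mask.sum().clamp(min=1)
```
A training step would sample random mask positions per sentence and minimize the returned loss; the argmax of the local tagger then gives the induced tags.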
Related papers
- Tokenization Impacts Multilingual Language Modeling: Assessing Vocabulary Allocation and Overlap Across Languages [3.716965622352967]
We propose new criteria to evaluate the quality of lexical representation and vocabulary overlap observed in sub-word tokenizers.
Our findings show that the overlap of vocabulary across languages can be actually detrimental to certain downstream tasks.
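As a hedged illustration of what such a criterion could look like in practice, the sketch below measures cross-lingual vocabulary overlap as the Jaccard similarity between the subword types a shared tokenizer actually emits on each language's corpus; the paper's exact criteria may differ.
```python
# Illustrative cross-lingual vocabulary-overlap measure for a subword
# tokenizer (a sketch; the paper's exact criteria may differ).
def used_vocab(tokenize, corpus):
    # Subword types the tokenizer actually emits on a corpus.
    vocab = set()
    for sentence in corpus:
        vocab.update(tokenize(sentence))
    return vocab

def vocab_overlap(tokenize, corpus_a, corpus_b):
    # Jaccard overlap between the subword types used by two languages.
    a, b = used_vocab(tokenize, corpus_a), used_vocab(tokenize, corpus_b)
    return len(a & b) / len(a | b)

# Toy demo with whitespace splitting standing in for a real subword model
# (e.g., a trained SentencePiece tokenizer):
print(vocab_overlap(str.split, ["the cat sat"], ["the dog ran"]))  # 0.2
```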
arXiv Detail & Related papers (2023-05-26T18:06:49Z)
- Augmented Language Models: a Survey [55.965967655575454]
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools.
We refer to them as Augmented Language Models (ALMs).
The missing token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks.
arXiv Detail & Related papers (2023-02-15T18:25:52Z)
- Deanthropomorphising NLP: Can a Language Model Be Conscious? [19.390107818044736]
We take the position that large language models cannot be sentient or conscious, and that LaMDA in particular exhibits no advances over other similar models that would qualify it.
We see the claims of sentience as part of a wider tendency to use anthropomorphic language in NLP reporting.
arXiv Detail & Related papers (2022-11-21T14:18:25Z)
- Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks [39.39138995087475]
We ask how much of human-like thinking can be captured by learning statistical patterns in language alone.
Our benchmark contains two problem-solving domains (planning and explanation generation) and is designed to require generalization.
We find that humans are far more robust than LLMs on this benchmark.
arXiv Detail & Related papers (2022-05-11T18:14:33Z)
- Contrastive Learning for Prompt-Based Few-Shot Language Learners [14.244787327283335]
We present a contrastive learning framework that clusters inputs from the same class under different augmented "views".
We create different "views" of an example by appending it with different language prompts and contextual demonstrations.
Our method can improve over the state-of-the-art methods in a diverse set of 15 language tasks.
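For intuition, a supervised contrastive loss over such prompt-created views can be sketched as below; the loss form, function names, and temperature are illustrative assumptions, not the paper's exact formulation.
```python
import torch
import torch.nn.functional as F

def supcon_loss(features, labels, temperature=0.1):
    # features: (N, D) encodings of prompt-augmented views; labels: (N,)
    # Views whose underlying examples share a class label are pulled together.
    z = F.normalize(features, dim=-1)
    sim = z @ z.t() / temperature                       # pairwise similarities
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    logits = sim.masked_fill(eye, float("-inf"))        # never contrast with self
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    per_anchor = -(log_prob.masked_fill(~pos, 0.0)).sum(1) / pos.sum(1).clamp(min=1)
    return per_anchor[pos.any(1)].mean()                # anchors with >=1 positive

# Each "view" would be encode(x + prompt_i): the same input appended with a
# different prompt / demonstration set before encoding.
```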
arXiv Detail & Related papers (2022-05-03T04:56:45Z)
- Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM based on linguistic units including syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
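A generic sketch of such a unit-level language model is shown below; the LSTM sizes and the unit inventory are placeholders, not the paper's configuration.
```python
# Minimal LSTM language model over sub-word linguistic units such as
# phonemes or syllables (a generic sketch, not the paper's architecture).
import torch
import torch.nn as nn

class UnitLM(nn.Module):
    def __init__(self, num_units, dim=128, layers=2):
        super().__init__()
        self.emb = nn.Embedding(num_units, dim)
        self.lstm = nn.LSTM(dim, dim, num_layers=layers, batch_first=True)
        self.out = nn.Linear(dim, num_units)

    def forward(self, units):
        # units: (B, T) ids of phonemes or syllables
        h, _ = self.lstm(self.emb(units))
        return self.out(h)  # logits over the next unit

model = UnitLM(num_units=50)  # e.g., a small phoneme inventory plus specials
x = torch.randint(0, 50, (2, 16))
loss = nn.functional.cross_entropy(
    model(x[:, :-1]).reshape(-1, 50), x[:, 1:].reshape(-1))
```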
arXiv Detail & Related papers (2021-10-31T22:48:30Z)
- Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition [80.446770909975]
Linguistic knowledge is of great benefit to scene text recognition.
How to effectively model linguistic rules in end-to-end deep networks remains a research challenge.
We propose ABINet, an autonomous, bidirectional, and iterative network for scene text recognition.
arXiv Detail & Related papers (2021-03-11T06:47:45Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
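One way to realize such a graph encoder is a single graph-convolution layer that mixes each token's semantic-dependency neighbors into its representation before the task head; the layer below is an illustrative sketch, with the adjacency assumed to come from a semantic parser's predicted arcs.
```python
# One graph-convolution layer over a semantic dependency graph, mixing
# parse neighbors into token states during finetuning (illustrative only).
import torch
import torch.nn as nn

class SemGCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h: (B, T, D) token states from a pretrained encoder
        # adj: (B, T, T) 0/1 float matrix of semantic-dependency arcs
        deg = adj.sum(-1, keepdim=True).clamp(min=1)  # normalize by degree
        neighbors = (adj @ h) / deg                   # average neighbor states
        return torch.relu(h + self.lin(neighbors))    # residual parse context

h = torch.randn(2, 8, 64)      # encoder outputs for a batch of 8-token inputs
adj = torch.zeros(2, 8, 8)
adj[:, 1, 4] = 1.0             # hypothetical arc: predicate at 1 -> argument at 4
out = SemGCNLayer(64)(h, adj)
```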
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- Explicitly Modeling Syntax in Language Models with Incremental Parsing and a Dynamic Oracle [88.65264818967489]
We propose a new syntax-aware language model: Syntactic Ordered Memory (SOM).
The model explicitly models the structure with an incremental parser and maintains the conditional probability setting of a standard language model.
Experiments show that SOM can achieve strong results in language modeling, incremental parsing and syntactic generalization tests.
arXiv Detail & Related papers (2020-10-21T17:39:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.