Larger-Context Tagging: When and Why Does It Work?
- URL: http://arxiv.org/abs/2104.04434v1
- Date: Fri, 9 Apr 2021 15:35:30 GMT
- Title: Larger-Context Tagging: When and Why Does It Work?
- Authors: Jinlan Fu, Liangjing Feng, Qi Zhang, Xuanjing Huang and Pengfei Liu
- Abstract summary: We focus on investigating when and why larger-context training, as a general strategy, can work.
We set up a testbed based on four tagging tasks and thirteen datasets.
- Score: 55.407651696813396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The development of neural networks and pretraining techniques has spawned
many sentence-level tagging systems that achieved superior performance on
typical benchmarks. However, a less discussed question is what happens when
more context information is introduced into current top-scoring tagging systems.
Although several existing works have attempted to shift tagging systems from
sentence-level to document-level, there is still no consensus on when and why
it works, which limits the applicability of the larger-context
approach in tagging tasks. In this paper, instead of pursuing a
state-of-the-art tagging system by architectural exploration, we focus on
investigating when and why larger-context training, as a general strategy,
can work.
To this end, we conduct a thorough comparative study of four proposed
aggregators for collecting context information and present an attribute-aided
evaluation method to interpret the improvement brought by larger-context
training. Experimentally, we set up a testbed based on four tagging tasks and
thirteen datasets. We hope our preliminary observations can deepen the
understanding of larger-context training and inspire further work on the use
of contextual information.
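The general strategy can be illustrated with a minimal sketch. Note this is an assumed concatenation-style setup for illustration, not one of the four aggregators studied in the paper: neighboring sentences are appended as extra context, and only the target sentence's tokens receive tag supervision.

```python
# Minimal sketch of larger-context input construction for tagging.
# Hypothetical illustration: neighboring sentences are concatenated
# around the target sentence; a mask marks which tokens belong to
# the target sentence, so only those receive tag supervision.

def build_larger_context_input(sentences, target_idx, window=1):
    """Concatenate up to `window` neighboring sentences on each side
    of the target sentence. Returns the full token sequence and a
    boolean mask over it marking the target sentence's tokens."""
    lo = max(0, target_idx - window)
    hi = min(len(sentences), target_idx + window + 1)
    tokens, target_mask = [], []
    for i in range(lo, hi):
        for tok in sentences[i]:
            tokens.append(tok)
            target_mask.append(i == target_idx)
    return tokens, target_mask

doc = [["He", "was", "born", "in", "Paris", "."],
       ["Paris", "is", "in", "France", "."],
       ["He", "moved", "later", "."]]
tokens, mask = build_larger_context_input(doc, target_idx=1, window=1)
# `tokens` now spans all three sentences; `mask` is True only for
# the five tokens of the middle (target) sentence.
```

A document-level tagger would feed `tokens` to its encoder and compute the tagging loss only where `mask` is True, so surrounding sentences inform the representations without contributing labels.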
Related papers
- Manual Verbalizer Enrichment for Few-Shot Text Classification [1.860409237919611]
MAVE is an approach to verbalizer construction through enrichment of class labels.
Our model achieves state-of-the-art results while using significantly fewer resources.
arXiv Detail & Related papers (2024-10-08T16:16:47Z)
- A Controlled Study on Long Context Extension and Generalization in LLMs [85.4758128256142]
Broad textual understanding and in-context learning require language models that utilize full document contexts.
Due to the implementation challenges associated with directly training long-context models, many methods have been proposed for extending models to handle long contexts.
We implement a controlled protocol for extension methods with a standardized evaluation, utilizing consistent base models and extension data.
arXiv Detail & Related papers (2024-09-18T17:53:17Z)
- Where does In-context Translation Happen in Large Language Models [18.379840329713407]
We characterize the region where large language models transition from in-context learners to translation models.
We demonstrate evidence of a "task recognition" point where the translation task is encoded into the input representations and attention to context is no longer necessary.
arXiv Detail & Related papers (2024-03-07T14:12:41Z)
- How Can Context Help? Exploring Joint Retrieval of Passage and Personalized Context [39.334509280777425]
Motivated by the concept of personalized context-aware document-grounded conversational systems, we introduce the task of context-aware passage retrieval.
We propose a novel approach, Personalized Context-Aware Search (PCAS), that effectively harnesses contextual information during passage retrieval.
arXiv Detail & Related papers (2023-08-26T04:49:46Z)
- In-Context Probing: Toward Building Robust Classifiers via Probing Large Language Models [5.5089506884366735]
In this paper, we propose an alternative approach, which we term In-Context Probing (ICP).
Similar to in-context learning, we contextualize the representation of the input with an instruction, but instead of decoding the output prediction, we probe the contextualized representation to predict the label.
We show that ICP performs competitively with or superior to finetuning and can be particularly helpful for building classifiers on top of smaller models.
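The ICP recipe described above can be sketched as follows. The toy hashed bag-of-words `encode` function and the nearest-centroid probe are stand-ins chosen for a self-contained example; the actual approach probes a large language model's contextualized hidden states.

```python
# Illustrative sketch of In-Context Probing (ICP): prepend an
# instruction to the input, encode, then fit a lightweight probe on
# the contextualized representation instead of decoding a label.
# `encode` is a toy stand-in (hashed bag-of-words) for a real
# language model's hidden states; this is an assumption for the
# sake of a runnable example, not the paper's setup.
import numpy as np

def encode(text, dim=64):
    """Toy encoder: L2-normalized hashed bag-of-words vector.
    (Consistent within one process; not a real LM encoder.)"""
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    return vec / max(1.0, np.linalg.norm(vec))

INSTRUCTION = "Classify the sentiment of the following review:"

def contextualize(text):
    # Contextualize the input with the instruction, as in ICP.
    return encode(INSTRUCTION + " " + text)

def fit_probe(examples):
    """Nearest-centroid probe over contextualized representations."""
    centroids = {}
    for label in {y for _, y in examples}:
        reps = [contextualize(x) for x, y in examples if y == label]
        centroids[label] = np.mean(reps, axis=0)
    return centroids

def predict(centroids, text):
    rep = contextualize(text)
    return max(centroids, key=lambda c: rep @ centroids[c])
```

The point of the design is that no decoding happens: the probe reads the representation directly, which is why ICP can sit on top of smaller models that decode labels unreliably.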
arXiv Detail & Related papers (2023-05-23T15:43:04Z)
- PRODIGY: Enabling In-context Learning Over Graphs [112.19056551153454]
In-context learning is the ability of a pretrained model to adapt to novel and diverse downstream tasks.
We develop PRODIGY, the first pretraining framework that enables in-context learning over graphs.
arXiv Detail & Related papers (2023-05-21T23:16:30Z)
- Phrase Retrieval Learns Passage Retrieval, Too [77.57208968326422]
We study whether phrase retrieval can serve as the basis for coarse-level retrieval including passages and documents.
We show that a dense phrase-retrieval system, without any retraining, already achieves better passage retrieval accuracy.
We also show that phrase filtering and vector quantization can reduce the size of our index by 4-10x.
arXiv Detail & Related papers (2021-09-16T17:42:45Z)
- Measuring and Increasing Context Usage in Context-Aware Machine Translation [64.5726087590283]
We introduce a new metric, conditional cross-mutual information, to quantify the usage of context by machine translation models.
We then introduce a new, simple training method, context-aware word dropout, to increase the usage of context by context-aware models.
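Context-aware word dropout, as summarized above, admits a simple sketch. The exact masking scheme here (a `<mask>` placeholder applied only to the current source sentence) is an assumed form for illustration, not necessarily the paper's implementation.

```python
# Sketch of context-aware word dropout: tokens of the current source
# sentence are randomly replaced with a mask token while context
# sentences are left intact, pushing a context-aware MT model to
# draw on the context when the current sentence is degraded.
# The <mask>/<sep> tokens are hypothetical placeholders.
import random

def context_aware_word_dropout(context_tokens, source_tokens,
                               p=0.1, mask="<mask>", seed=None):
    """Drop each current-sentence token with probability `p`;
    return the concatenated context + separator + degraded source."""
    rng = random.Random(seed)
    dropped = [mask if rng.random() < p else tok
               for tok in source_tokens]
    return context_tokens + ["<sep>"] + dropped
```

Applied during training only, this raises the measured usage of context (e.g., under a metric like conditional cross-mutual information) because the model can no longer rely solely on the current sentence.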
arXiv Detail & Related papers (2021-05-07T19:55:35Z)
- Quantifying the Contextualization of Word Representations with Semantic Class Probing [8.401007663676214]
Pretrained language models have achieved a new state of the art on many NLP tasks, but there are still many open questions about how and why they work so well.
We quantify the amount of contextualization, i.e., how well words are interpreted in context, by studying the extent to which semantic classes of a word can be inferred from its contextualized embeddings.
arXiv Detail & Related papers (2020-04-25T17:49:37Z)
- How Far are We from Effective Context Modeling? An Exploratory Study on Semantic Parsing in Context [59.13515950353125]
We present a grammar-based decoding semantic parser and adapt typical context modeling methods on top of it.
We evaluate 13 context modeling methods on two large cross-domain datasets, and our best model achieves state-of-the-art performances.
arXiv Detail & Related papers (2020-02-03T11:28:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.