ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models
- URL: http://arxiv.org/abs/2310.18208v2
- Date: Mon, 6 Nov 2023 13:16:27 GMT
- Title: ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models
- Authors: Benjamin Feuer, Yurong Liu, Chinmay Hegde, Juliana Freire
- Abstract summary: We introduce ArcheType, a simple, practical method for context sampling, prompt serialization, model querying, and label remapping.
We establish a new state-of-the-art performance on zero-shot CTA benchmarks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing deep-learning approaches to semantic column type annotation (CTA)
have important shortcomings: they rely on semantic types which are fixed at
training time; require a large number of training samples per type and incur
large run-time inference costs; and their performance can degrade when
evaluated on novel datasets, even when types remain constant. Large language
models have exhibited strong zero-shot classification performance on a wide
range of tasks and in this paper we explore their use for CTA. We introduce
ArcheType, a simple, practical method for context sampling, prompt
serialization, model querying, and label remapping, which enables large
language models to solve CTA problems in a fully zero-shot manner. We ablate
each component of our method separately, and establish that improvements to
context sampling and label remapping provide the most consistent gains.
ArcheType establishes a new state-of-the-art performance on zero-shot CTA
benchmarks (including three new domain-specific benchmarks which we release
along with this paper), and when used in conjunction with classical CTA
techniques, it outperforms a SOTA DoDuo model on the fine-tuned SOTAB
benchmark. Our code is available at https://github.com/penfever/ArcheType.
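The abstract describes a four-stage pipeline: context sampling, prompt serialization, model querying, and label remapping. A minimal sketch of that flow is shown below, assuming a hypothetical label set and a stub in place of a real LLM call; the function names are illustrative, not the authors' implementation (which lives in the repository linked above).

```python
import random

# Illustrative zero-shot CTA pipeline in the spirit of ArcheType.
# LABEL_SET, function names, and the stub model are assumptions for this sketch.

LABEL_SET = ["name", "date", "price", "country"]

def sample_context(values, k=5, seed=0):
    """Context sampling: pick k representative cell values from the column."""
    rng = random.Random(seed)
    return rng.sample(values, min(k, len(values)))

def serialize_prompt(column_name, sampled, labels):
    """Prompt serialization: render sampled cells and the label set as text."""
    cells = ", ".join(str(v) for v in sampled)
    return (f"Column '{column_name}' contains: {cells}.\n"
            f"Choose the best type from: {', '.join(labels)}.\n"
            f"Answer with one type only.")

def remap_label(raw_answer, labels):
    """Label remapping: map a free-form model answer onto the fixed label set."""
    answer = raw_answer.strip().lower()
    for label in labels:
        if label in answer:
            return label
    return labels[0]  # fall back to a default label

def annotate_column(column_name, values, query_model):
    """Run sampling -> serialization -> model query -> remapping."""
    sampled = sample_context(values)
    prompt = serialize_prompt(column_name, sampled, LABEL_SET)
    return remap_label(query_model(prompt), LABEL_SET)

# A stub stands in for an actual LLM call.
def stub_model(prompt):
    return "This column looks like a price."

print(annotate_column("amount", ["$3.99", "$10.00", "$7.25"], stub_model))
# prints "price"
```

Because the model's free-form answer may not match any label verbatim, the remapping step is what makes the pipeline fully zero-shot usable; the abstract notes that context sampling and label remapping provide the most consistent gains.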
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough?
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is their ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- Calibrated Seq2seq Models for Efficient and Generalizable Ultra-fine Entity Typing
We present CASENT, a seq2seq model designed for ultra-fine entity typing.
Our model takes an entity mention as input and employs constrained beam search to generate multiple types autoregressively.
Our method outperforms the previous state-of-the-art in terms of F1 score and calibration error, while achieving an inference speedup of over 50 times.
arXiv Detail & Related papers (2023-11-01T20:39:12Z)
- In-Context Learning for Text Classification with Many Labels
In-context learning (ICL) using large language models for tasks with many labels is challenging due to the limited context window.
We use a pre-trained dense retrieval model to bypass this limitation.
We analyze performance across the number of in-context examples and different model scales.
arXiv Detail & Related papers (2023-09-19T22:41:44Z)
- TabLLM: Few-shot Classification of Tabular Data with Large Language Models
We study the application of large language models to zero-shot and few-shot classification of tabular data.
We evaluate several serialization methods including templates, table-to-text models, and large language models.
This approach is also competitive with strong traditional baselines like gradient-boosted trees.
arXiv Detail & Related papers (2022-10-19T17:08:13Z)
- Few-Shot Fine-Grained Entity Typing with Automatic Label Interpretation and Instance Generation
We study the problem of few-shot Fine-grained Entity Typing (FET), where only a few annotated entity mentions with contexts are given for each entity type.
We propose a novel framework for few-shot FET consisting of two modules: (1) an entity type label interpretation module automatically learns to relate type labels to the vocabulary by jointly leveraging few-shot instances and the label hierarchy, and (2) a type-based contextualized instance generator produces new instances based on given instances to enlarge the training set for better generalization.
arXiv Detail & Related papers (2022-06-28T04:05:40Z)
- UnifieR: A Unified Retriever for Large-Scale Retrieval
Large-scale retrieval aims to recall relevant documents from a huge collection given a query.
Recent retrieval methods based on pre-trained language models (PLM) can be coarsely categorized into either dense-vector or lexicon-based paradigms.
We propose a new learning framework, UnifieR, which unifies dense-vector and lexicon-based retrieval in one model with a dual-representing capability.
arXiv Detail & Related papers (2022-05-23T11:01:59Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation
Methods for object detection and segmentation rely on large-scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision levels.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)
- Interpretable Entity Representations through Large-Scale Typing
We present an approach to creating entity representations that are human readable and achieve high performance out of the box.
Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types.
We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
arXiv Detail & Related papers (2020-04-30T23:58:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.