Inroads to a Structured Data Natural Language Bijection and the role of LLM annotation
- URL: http://arxiv.org/abs/2401.07190v1
- Date: Sun, 14 Jan 2024 03:16:49 GMT
- Title: Inroads to a Structured Data Natural Language Bijection and the role of LLM annotation
- Authors: Blake Vente
- Abstract summary: This work finds limited evidence supporting the theory that using multiple tasks with sequence-to-sequence transformer language models can improve performance on some metrics.
The inverse task alone is likely only an optimization strategy, since it does not yield a significant general improvement at the model sizes explored in this work.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work finds limited evidence supporting the theory that using multiple
tasks with sequence-to-sequence transformer language models can improve
performance on some metrics. In particular, the multi-task generalist t5-small
outperforms the specialist t5-small with an $F_1$ of $0.771$, up from $0.692$,
which may point to underlying cross-task knowledge generalization. This further
suggests that even with the same network, "re-using" the same data in a
different way may lead to higher performance in some metrics. However, the
inverse task alone is likely only an optimization strategy, since it does not
yield a significant general improvement at the model sizes explored in this
work. Also, adding $\approx 4500$ LLM-annotated records (interlaced with the
$12800$ WebNLG training records) does not substantially change automatic-metric
performance compared to the same t5-small model without the synthetic data.
This may be due to a learning-capacity bottleneck imposed by model size, and
the decreases observed may stem from distributional differences between the
corpora.
Future research using larger models or human evaluation is required to more
fully explain the mechanisms contributing to performance on these tasks.
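The following is a minimal sketch, assuming Hugging Face transformers and illustrative task prefixes (the paper's exact preprocessing is not reproduced here), of how the same WebNLG-style record can be "re-used" in both directions: a prefix selects the forward data-to-text task or the inverse text-to-data task, and both directions share every parameter of t5-small, which is where any cross-task knowledge generalization would have to occur.

```python
# Minimal sketch of the multi-task setup described in the abstract.
# The task prefixes and record fields are illustrative assumptions.
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# One WebNLG-style record: a linearized triple and its verbalization.
record = {
    "triples": "Alan_Bean | occupation | Test_pilot",
    "text": "Alan Bean worked as a test pilot.",
}

def make_examples(record):
    """Yield (source, target) pairs: the forward task and its inverse."""
    yield "generate text: " + record["triples"], record["text"]  # data -> text
    yield "generate data: " + record["text"], record["triples"]  # text -> data

for source, target in make_examples(record):
    inputs = tokenizer(source, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    loss = model(**inputs, labels=labels).loss  # an optimizer step would follow
    print(f"{source[:14]}... loss={loss.item():.3f}")
```

Interlacing the $\approx 4500$ LLM-annotated records would then amount to shuffling them into this loop alongside the $12800$ WebNLG records before training.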
Related papers
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
- Improving Large Models with Small models: Lower Costs and Better Performance [81.55672406002715]
We propose Data Shunt$+$ (DS$+$), a general paradigm for collaboration of small and large models.
For instance, ChatGPT achieves an accuracy of $94.43\%$ on Amazon Product sentiment analysis, and DS$+$ achieves an accuracy of $95.64\%$, while the cost has been reduced to only $31.18\%$.
arXiv Detail & Related papers (2024-06-15T14:44:43Z)
- GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks [0.0]
We introduce a new kind of GLiNER model that can be used for various information extraction tasks while being a small encoder model.
Our model achieved SoTA performance on zero-shot NER benchmarks and leading performance on question-answering, summarization and relation extraction tasks.
arXiv Detail & Related papers (2024-06-14T13:54:29Z)
- Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference [32.62084449979531]
We extend SortedNet to generative NLP tasks by replacing Standard Fine-Tuning (SFT) with Sorted Fine-Tuning (SoFT).
Our approach boosts model efficiency, eliminating the need for multiple models for various scenarios during inference.
Our results show the superior performance of sub-models in comparison to Standard Fine-Tuning and SFT+ICT (Early-Exit).
arXiv Detail & Related papers (2023-09-16T11:58:34Z)
- Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation [4.310519298899164]
In this work, we evaluate 10 open-source instructed LLMs on four representative code comprehension and generation tasks.
For the zero-shot setting, instructed LLMs are very competitive on code comprehension and generation tasks.
For the few-shot setting, we find that adding demonstration examples substantially helps instructed LLMs perform better.
arXiv Detail & Related papers (2023-08-02T15:54:22Z)
- Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs [66.30706841821123]
Large language models (LLMs) power many state-of-the-art systems in natural language processing.
LLMs are extremely computationally expensive, even at inference time.
We propose a new metric for comparing inference efficiency across models.
arXiv Detail & Related papers (2023-05-03T21:51:42Z)
- Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes [91.58845026796149]
We introduce Distilling step-by-step, a new mechanism that trains small models that outperform large language models.
We present three findings across four NLP benchmarks.
arXiv Detail & Related papers (2023-05-03T17:50:56Z)
- Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations [63.04466647849211]
Methods typically encode task information with a simple dataset name as a prefix to the encoder.
We propose compositional task configurations, a set of prompts prepended to the encoder to improve cross-task generalization.
We show this not only allows the model to better learn shared knowledge across different tasks during training, but also lets us control the model by composing new configurations; a minimal sketch of this prefixing idea appears after this list.
arXiv Detail & Related papers (2022-12-17T02:20:14Z)
- Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation [111.44445634272235]
In this paper, we develop a parameter-efficient transfer learning architecture, termed PeterRec.
PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks.
We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks.
arXiv Detail & Related papers (2020-01-13T14:09:54Z)
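As a coda to the compositional-task-configurations entry above, here is a minimal sketch of the prefixing idea, assuming invented bracketed field names (the cited paper's exact configuration format is not reproduced here): instead of a bare dataset-name prefix, the encoder input is prepended with a prompt composed from reusable configuration pieces.

```python
# Minimal sketch of compositional task configurations.
# The bracketed fields and their values are illustrative assumptions.
def compose_prefix(task: str, input_type: str, output_type: str) -> str:
    """Compose a task-configuration prompt from reusable pieces, to be
    prepended to the encoder input of a unified table-to-text model."""
    return f"[TASK] {task} [INPUT] {input_type} [OUTPUT] {output_type} : "

# A combination seen during training ...
train_src = (compose_prefix("data-to-text", "rdf triples", "sentence")
             + "Alan_Bean | occupation | Test_pilot")
# ... and a novel composition used to control the model at inference time.
test_src = (compose_prefix("question-answering", "table", "short answer")
            + "Who was on the crew? [TABLE] crew: Alan_Bean, Pete_Conrad")

print(train_src)
print(test_src)
```

Because the configuration pieces are shared across tasks, a model can in principle be steered with compositions never seen during training, the same kind of cross-task generalization that the main abstract probes.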