Related papers: TART: A plug-and-play Transformer module for task-agnostic reasoning

TART: A plug-and-play Transformer module for task-agnostic reasoning

URL: http://arxiv.org/abs/2306.07536v1
Date: Tue, 13 Jun 2023 04:37:00 GMT
Title: TART: A plug-and-play Transformer module for task-agnostic reasoning
Authors: Kush Bhatia, Avanika Narayan, Christopher De Sa, Christopher R\'e
Abstract summary: Large language models (LLMs) exhibit in-context learning abilities which enable the same model to perform several tasks without any task-specific training. Traditional adaptation approaches, such as fine-tuning, modify the underlying models for each specific task. We propose TART which generically improves an LLM's reasoning abilities using a synthetically trained Transformer-based reasoning module.
Score: 38.84903599406189
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) exhibit in-context learning abilities which enable the same model to perform several tasks without any task-specific training. In contrast, traditional adaptation approaches, such as fine-tuning, modify the underlying models for each specific task. In-context learning, however, consistently underperforms task-specific tuning approaches even when presented with the same examples. While most existing approaches (e.g., prompt engineering) focus on the LLM's learned representations to patch this performance gap, our analysis actually reveal that LLM representations contain sufficient information to make good predictions. As such, we focus on the LLM's reasoning abilities and demonstrate that this performance gap exists due to their inability to perform simple probabilistic reasoning tasks. This raises an intriguing question: Are LLMs actually capable of learning how to reason in a task-agnostic manner? We answer this in the affirmative and propose TART which generically improves an LLM's reasoning abilities using a synthetically trained Transformer-based reasoning module. TART trains this reasoning module in a task-agnostic manner using only synthetic logistic regression tasks and composes it with an arbitrary real-world pre-trained model without any additional training. With a single inference module, TART improves performance across different model families (GPT-Neo, Pythia, BLOOM), model sizes (100M - 6B), tasks (14 NLP binary classification tasks), and even across different modalities (audio and vision). Additionally, on the RAFT Benchmark, TART improves GPT-Neo (125M)'s performance such that it outperforms BLOOM (176B), and is within 4% of GPT-3 (175B). Our code and models are available at https://github.com/HazyResearch/TART .

Related papers

Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing [64.38277118982698]
Large Language Models (LLMs) have demonstrated human-like instruction-following abilities. In this work, we explore how to route the best-performing LLM for each instruction to achieve better overall performance. We develop a new paradigm, constructing capability instructions with model capability representation, user instruction, and performance inquiry prompts to assess the performance.
arXiv Detail & Related papers (2025-02-24T16:10:53Z)
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling [69.57918638435491]
Test-Time Scaling is an important method for improving the performance of Large Language Models. This paper focuses on two core questions: What is the optimal approach to scale test-time computation across different policy models, PRMs, and problem difficulty levels? We show that with our compute-optimal TTS strategy, extremely small policy models can outperform larger models.
arXiv Detail & Related papers (2025-02-10T17:30:23Z)
Probing the Robustness of Theory of Mind in Large Language Models [6.7932860553262415]
We introduce a novel dataset of 68 tasks for probing ToM in LLMs. We evaluate the ToM performance of four SotA open source LLMs on our dataset and the dataset introduced by (Kosinski, 2023) We find a consistent tendency in all tested LLMs to perform poorly on tasks that require the realization that an agent has knowledge of automatic state changes in its environment.
arXiv Detail & Related papers (2024-10-08T18:13:27Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and. Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting. LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts. We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM. We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic [6.46176287368784]
We propose textbfModel textbfExclusive textbfTask textbfArithmetic for merging textbfGPT-scale models. Our proposed MetaGPT is data-agnostic and bypasses the heavy search process, making it cost-effective and easy to implement for LLMs.
arXiv Detail & Related papers (2024-06-17T10:12:45Z)
Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data. For the first time, we reveal two major challenges hindering their practical deployments: Task-Distribution Shift ( TDS) and Task-Distribution Corruption (TDC)
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
Scaling Sentence Embeddings with Large Language Models [43.19994568210206]
In this work, we propose an in-context learning-based method aimed at improving sentence embeddings performance. Our approach involves adapting the previous prompt-based representation method for autoregressive models. By scaling model size, we find scaling to more than tens of billion parameters harms the performance on semantic textual similarity tasks.
arXiv Detail & Related papers (2023-07-31T13:26:03Z)
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models [76.410400238974]
We propose TTA with feedback to rectify the model output and prevent the model from becoming blindly confident. A CLIP model is adopted as the reward model during TTA and provides feedback for the VLM. The proposed textitreinforcement learning with CLIP feedback(RLCF) framework is highly flexible and universal.
arXiv Detail & Related papers (2023-05-29T11:03:59Z)
Task Residual for Tuning Vision-Language Models [69.22958802711017]
We propose a new efficient tuning approach for vision-language models (VLMs) named Task Residual Tuning (TaskRes) TaskRes explicitly decouples the prior knowledge of the pre-trained models and new knowledge regarding a target task. The proposed TaskRes is simple yet effective, which significantly outperforms previous methods on 11 benchmark datasets.
arXiv Detail & Related papers (2022-11-18T15:09:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.