Efficient Retrieval Optimized Multi-task Learning
- URL: http://arxiv.org/abs/2104.10129v1
- Date: Tue, 20 Apr 2021 17:16:34 GMT
- Title: Efficient Retrieval Optimized Multi-task Learning
- Authors: Hengxin Fun, Sunil Gandhi, Sujith Ravi
- Abstract summary: We propose a novel Retrieval Optimized Multi-task (ROM) framework for jointly training self-supervised tasks, knowledge retrieval, and extractive question answering.
Our ROM approach presents a unified and generalizable framework that enables scaling efficiently to multiple tasks.
Using our framework, we achieve comparable or better performance than recent methods on QA, while drastically reducing the number of parameters.
- Score: 16.189136169520424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, there have been significant advances in neural methods for tackling
knowledge-intensive tasks such as open domain question answering (QA). These
advances are fueled by combining large pre-trained language models with
learnable retrieval of documents. The majority of these models use separate
encoders to learn query and passage representations for the retriever, plus an
additional encoder for the downstream task. Using separate encoders for each
stage/task consumes a lot of memory and makes it difficult to scale to a large
number of tasks. In this paper, we propose a novel Retrieval
Optimized Multi-task (ROM) framework for jointly training self-supervised
tasks, knowledge retrieval, and extractive question answering. Our ROM approach
presents a unified and generalizable framework that enables scaling efficiently
to multiple tasks, varying levels of supervision, and optimization choices such
as different learning schedules without changing the model architecture. It
also provides the flexibility of changing the encoders without changing the
architecture of the system. Using our framework, we achieve comparable or
better performance than recent methods on QA, while drastically reducing the
number of parameters.
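To make the shared-encoder idea concrete, here is a minimal PyTorch sketch (class and parameter names are hypothetical, not the authors' implementation): one encoder backs the retriever's query and passage representations as well as the extractive QA head.

```python
import torch
import torch.nn as nn

class SharedEncoderROM(nn.Module):
    """Illustrative sketch: a single encoder serves retrieval and extractive QA."""
    def __init__(self, vocab_size=30522, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)  # shared across all tasks
        self.qa_head = nn.Linear(d_model, 2)  # start/end logits for extractive QA

    def encode(self, token_ids):
        return self.encoder(self.embed(token_ids))

    def retrieve_repr(self, token_ids):
        # Dense retrieval representation: mean-pool the shared encoder output.
        return self.encode(token_ids).mean(dim=1)

    def qa_logits(self, token_ids):
        # Extractive QA: per-token start/end scores from the same shared encoder.
        return self.qa_head(self.encode(token_ids))

model = SharedEncoderROM()
query = torch.randint(0, 30522, (1, 16))
passage = torch.randint(0, 30522, (1, 64))
score = (model.retrieve_repr(query) * model.retrieve_repr(passage)).sum(-1)
start_logits, end_logits = model.qa_logits(passage).unbind(dim=-1)
```

Because every stage reuses the same encoder, adding a task costs only a small head rather than another full encoder, which is where the parameter savings come from.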
Related papers
- Birdie: Advancing State Space Models with Reward-Driven Objectives and Curricula [23.071384759427072]
State space models (SSMs) offer advantages over Transformers but struggle with tasks requiring long-range in-context retrieval, such as text copying, associative recall, and question answering over long contexts.
We propose a novel training procedure, Birdie, that significantly enhances the in-context retrieval capabilities of SSMs without altering their architecture.
arXiv Detail & Related papers (2024-11-01T21:01:13Z)
- A Multitask Deep Learning Model for Classification and Regression of Hyperspectral Images: Application to the large-scale dataset [44.94304541427113]
We propose a multitask deep learning model to perform multiple classification and regression tasks simultaneously on hyperspectral images.
We validated our approach on a large hyperspectral dataset called TAIGA.
A comprehensive qualitative and quantitative analysis of the results shows that the proposed method significantly outperforms other state-of-the-art methods.
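The shared-backbone, per-task-head pattern this work builds on can be sketched as follows (layer sizes and names are hypothetical, not the TAIGA configuration):

```python
import torch
import torch.nn as nn

class MultiTaskHSI(nn.Module):
    """Sketch: shared feature extractor with per-task heads for a hyperspectral pixel."""
    def __init__(self, n_bands=200, n_classes=10, n_regress=3):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(n_bands, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
        )
        self.cls_head = nn.Linear(128, n_classes)  # e.g. land-cover class
        self.reg_head = nn.Linear(128, n_regress)  # e.g. continuous forest variables

    def forward(self, x):
        h = self.backbone(x)
        return self.cls_head(h), self.reg_head(h)

model = MultiTaskHSI()
x = torch.randn(8, 200)  # batch of spectra
logits, preds = model(x)
loss = nn.functional.cross_entropy(logits, torch.randint(0, 10, (8,))) \
     + nn.functional.mse_loss(preds, torch.randn(8, 3))
```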
arXiv Detail & Related papers (2024-07-23T11:14:54Z)
- Scalarization for Multi-Task and Multi-Domain Learning at Scale [15.545810422759295]
Training a single model on multiple input domains and/or output tasks allows for compressing information from multiple sources into a unified backbone.
However, optimizing such networks is a challenge due to discrepancies between the different tasks or domains.
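Scalarization here means collapsing the per-task losses into one weighted sum before backpropagation; a minimal sketch with hypothetical weights:

```python
import torch

def scalarize(task_losses, weights):
    """Weighted-sum scalarization: one scalar objective from many task losses."""
    return sum(w * l for w, l in zip(weights, task_losses))

# e.g. three tasks sharing a backbone; the weights are tuned hyperparameters
losses = [torch.tensor(0.9, requires_grad=True),
          torch.tensor(1.4, requires_grad=True),
          torch.tensor(0.3, requires_grad=True)]
total = scalarize(losses, weights=[0.5, 0.3, 0.2])
total.backward()
```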
arXiv Detail & Related papers (2023-10-13T07:31:04Z)
- MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks [59.09343552273045]
We propose a decoder-only model for multimodal tasks, which is surprisingly effective at jointly learning these disparate vision-language tasks.
We demonstrate that joint learning of these diverse objectives is simple, effective, and maximizes the weight-sharing of the model across these tasks.
Our model achieves the state of the art on image-text and text-image retrieval, video question answering and open-vocabulary detection tasks, outperforming much larger and more extensively trained foundational models.
arXiv Detail & Related papers (2023-03-29T16:42:30Z)
- MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are Better Dense Retrievers [140.0479479231558]
In this work, we aim to unify a variety of pre-training tasks into a multi-task pre-trained model, namely MASTER.
MASTER utilizes a shared-encoder multi-decoder architecture that can construct a representation bottleneck to compress the abundant semantic information across tasks into dense vectors.
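Roughly, that architecture looks like the sketch below (sizes are hypothetical, and the per-task linear heads are crude stand-ins for the paper's shallow decoders): the single bottleneck vector is the only path from the shared encoder to each decoder.

```python
import torch
import torch.nn as nn

class BottleneckedMultiDecoder(nn.Module):
    """Sketch: one encoder compresses the input to a single dense vector; several
    weak per-task decoders must predict their targets from that bottleneck alone."""
    def __init__(self, vocab=30522, d=256, n_tasks=3):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        enc_layer = nn.TransformerEncoderLayer(d, 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=4)  # shared, deep
        # one head per pre-training task, all conditioned on the same bottleneck
        self.decoders = nn.ModuleList(nn.Linear(d, vocab) for _ in range(n_tasks))

    def forward(self, token_ids):
        hidden = self.encoder(self.embed(token_ids))
        bottleneck = hidden[:, 0]  # dense vector, e.g. the [CLS] slot
        return bottleneck, [dec(bottleneck) for dec in self.decoders]
```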
arXiv Detail & Related papers (2022-12-15T13:57:07Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
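One way to picture the mechanism is a learned per-layer gate that decides whether a layer stays shared or receives a task-specific update; a rough sketch (the gate and residual names are hypothetical, not TAPS's exact formulation):

```python
import torch
import torch.nn as nn

class GatedTaskLinear(nn.Module):
    """Sketch: shared weight plus a gated task-specific residual.
    During task tuning, freeze `shared` and train only the delta and gate."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.shared = nn.Linear(d_in, d_out)
        self.task_delta = nn.Parameter(torch.zeros(d_out, d_in))  # task-specific
        self.gate_score = nn.Parameter(torch.tensor(0.0))         # learned scalar

    def forward(self, x):
        gate = torch.sigmoid(self.gate_score)
        w = self.shared.weight + gate * self.task_delta
        return nn.functional.linear(x, w, self.shared.bias)
```

Layers whose gate stays near zero reuse the shared weights verbatim, so only the gated layers contribute task-specific parameters.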
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- Multi-Task Learning with Sequence-Conditioned Transporter Networks [67.57293592529517]
We aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling.
First, we propose a new suite of benchmarks aimed at compositional tasks, MultiRavens, which allows defining custom task combinations.
Second, we propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling.
arXiv Detail & Related papers (2021-09-15T21:19:11Z)
- XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation [80.18830380517753]
We develop a new task-agnostic distillation framework XtremeDistilTransformers.
We study the transferability of several source tasks, augmentation resources and model architecture for distillation.
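At the core of any such setup is a soft-target loss between teacher and student; a generic sketch (not XtremeDistilTransformers' exact recipe):

```python
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student distributions."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_student = F.log_softmax(student_logits / t, dim=-1)
    # scale by t**2 so gradient magnitudes stay consistent across temperatures
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * (t * t)

loss = distill_loss(torch.randn(4, 10), torch.randn(4, 10))
```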
arXiv Detail & Related papers (2021-06-08T17:49:33Z)
- MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale [103.7609761511652]
We show how a large-scale collective robotic learning system can acquire a repertoire of behaviors simultaneously.
New tasks can be continuously instantiated from previously learned tasks.
We train and evaluate our system on a set of 12 real-world tasks with data collected from 7 robots.
arXiv Detail & Related papers (2021-04-16T16:38:02Z)
- Multi-task Retrieval for Knowledge-Intensive Tasks [21.725935960568027]
We propose a multi-task trained model for neural retrieval.
Our approach not only outperforms previous methods in the few-shot setting, but also rivals specialised neural retrievers.
With the help of our retriever, we improve existing models for downstream tasks and closely match or improve the state of the art on multiple benchmarks.
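A multi-task dense retriever of this kind is typically a dual encoder trained with in-batch negatives over data pooled from several tasks; a minimal sketch of that objective (encoder outputs are stubbed with random vectors):

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive(query_vecs, passage_vecs):
    """Each query's positive passage sits at the same batch index;
    every other passage in the batch serves as a negative."""
    scores = query_vecs @ passage_vecs.T  # (B, B) similarity matrix
    labels = torch.arange(scores.size(0))
    return F.cross_entropy(scores, labels)

# mixing examples from several retrieval tasks into one batch gives
# a single shared encoder its multi-task training signal
q = F.normalize(torch.randn(16, 128), dim=-1)
p = F.normalize(torch.randn(16, 128), dim=-1)
loss = in_batch_contrastive(q, p)
```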
arXiv Detail & Related papers (2021-01-01T00:16:34Z)