CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models
- URL: http://arxiv.org/abs/2302.04030v2
- Date: Fri, 10 Feb 2023 06:57:49 GMT
- Title: CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models
- Authors: Changan Niu, Chuanyi Li, Vincent Ng, Bin Luo
- Abstract summary: Cross-task generalization is of strong research and application value.
We propose a large-scale benchmark that includes 216 existing code-related tasks.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although recent advances show that a model pre-trained on large-scale source code data gains appreciable generalization capability, it still requires a sizeable amount of data on the target task for fine-tuning. Moreover, the effectiveness of generalization is largely determined by the size and quality of the fine-tuning data, which is detrimental for target tasks with limited or unavailable resources. Cross-task generalization, i.e., improving a model's generalization to tasks it has never seen, is therefore of strong research and application value.
In this paper, we propose a large-scale benchmark comprising 216 existing code-related tasks. We annotate each task with meta information such as a task description and an instruction, which provide detailed information about the task and a solution guide. This annotation also lets us easily create a wide variety of "training/evaluation" task splits for evaluating different cross-task generalization capabilities of a model. We then perform preliminary experiments demonstrating that cross-task generalization can be largely improved by in-context learning methods such as few-shot learning and learning from task instructions, which shows the promising prospects of conducting cross-task learning research on our benchmark. We hope that the collected datasets and our benchmark will facilitate future work that is not limited to cross-task generalization.
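To make the setup concrete, here is a minimal Python sketch of how a benchmark task annotated with such meta information could be represented, how a "training/evaluation" split over whole tasks might be formed, and how an instruction plus a few in-context examples could be assembled into a prompt for an unseen task. The Task schema, split_tasks, and build_prompt names below are illustrative assumptions for exposition, not the paper's actual data format or API.

```python
# Illustrative sketch (not the paper's actual schema): one way a task in a
# CrossCodeBench-style benchmark could be represented, split, and turned
# into an instruction + few-shot prompt for in-context learning.
import random
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Task:
    name: str          # e.g. "defect-detection-java" (hypothetical)
    category: str      # e.g. "classification", "generation"
    description: str   # meta information: what the task is about
    instruction: str   # meta information: a solution guide for the model
    examples: List[Tuple[str, str]] = field(default_factory=list)  # (input, output) pairs

def split_tasks(tasks: List[Task], eval_categories: List[str]) -> Tuple[List[Task], List[Task]]:
    """Create a training/evaluation *task* split: whole tasks in the
    held-out categories are never seen during training."""
    train = [t for t in tasks if t.category not in eval_categories]
    evaluation = [t for t in tasks if t.category in eval_categories]
    return train, evaluation

def build_prompt(task: Task, query: str, k: int = 2, seed: int = 0) -> str:
    """Compose an instruction-plus-few-shot prompt for an unseen task."""
    rng = random.Random(seed)
    shots = rng.sample(task.examples, min(k, len(task.examples)))
    lines = [f"Task: {task.description}", f"Instruction: {task.instruction}"]
    for inp, out in shots:
        lines += [f"Input: {inp}", f"Output: {out}"]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)
```

In a split of this kind, entire tasks rather than individual examples are held out, so the model is evaluated on tasks it was never trained on; instructions and few-shot examples only enter through the prompt at inference time.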
Related papers
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond (arXiv, 2024-01-02)
  Multi-Task Learning (MTL) is a framework where multiple related tasks are learned jointly and benefit from a shared representation space.
  We show that MTL can be successful with classification tasks with little or non-overlapping annotations.
  We propose a novel approach where knowledge exchange is enabled between the tasks via distribution matching.
- Identification of Negative Transfers in Multitask Learning Using Surrogate Models (arXiv, 2023-03-25)
  Multitask learning is widely used to train a low-resource target task by augmenting it with multiple related source tasks.
  A critical problem in multitask learning is identifying subsets of source tasks that would benefit the target task.
  We introduce an efficient procedure to address this problem via surrogate modeling.
- Prototype-guided Cross-task Knowledge Distillation for Large-scale Models (arXiv, 2022-12-26)
  Cross-task knowledge distillation helps to train a small student model to obtain competitive performance.
  We propose a Prototype-guided Cross-task Knowledge Distillation (ProC-KD) approach to transfer the intrinsic local-level object knowledge of a large-scale teacher network to various task scenarios.
- Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks (arXiv, 2022-10-12)
  We introduce a framework that acquires goal-conditioned policies for unseen temporally extended tasks via offline reinforcement learning on broad data.
  When faced with a novel task goal, the framework uses an affordance model to plan a sequence of lossy representations as subgoals that decomposes the original task into easier problems.
  We show that our framework can be pre-trained on large-scale datasets of robot experiences from prior work and efficiently fine-tuned for novel tasks, entirely from visual inputs without any manual reward engineering.
- Task Compass: Scaling Multi-task Pre-training with Task Prefix (arXiv, 2022-10-12)
  Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
  We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
  Our model can not only serve as a strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
- Unsupervised Cross-Task Generalization via Retrieval Augmentation (arXiv, 2022-04-17)
  We propose a retrieval-augmentation method named ReCross that takes a few unlabelled examples as queries to retrieve a small subset of upstream data.
  Our empirical results show that the proposed ReCross consistently outperforms non-retrieval baselines by a significant margin.
- Active Multi-Task Representation Learning (arXiv, 2022-02-02)
  We give the first formal study on resource task sampling by leveraging techniques from active learning.
  We propose an algorithm that iteratively estimates the relevance of each source task to the target task and samples from each source task based on the estimated relevance.
- Generalized Hindsight for Reinforcement Learning (arXiv, 2020-02-26)
  We argue that low-reward data collected while trying to solve one task provides little to no signal for solving that particular task.
  We present Generalized Hindsight: an approximate inverse reinforcement learning technique for relabeling behaviors with the right tasks.