On-the-Fly Adaptation of Source Code Models using Meta-Learning
- URL: http://arxiv.org/abs/2003.11768v2
- Date: Sat, 19 Sep 2020 16:17:04 GMT
- Title: On-the-Fly Adaptation of Source Code Models using Meta-Learning
- Authors: Disha Shrivastava, Hugo Larochelle and Daniel Tarlow
- Abstract summary: We frame the problem of context adaptation as a meta-learning problem.
We train a base source code model that is best able to learn from information in a file to deliver improved predictions of missing tokens.
We demonstrate improved performance in experiments on a large scale Java GitHub corpus.
- Score: 28.98699307030983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to adapt to unseen, local contexts is an important challenge that
successful models of source code must overcome. One of the most popular
approaches for the adaptation of such models is dynamic evaluation. With
dynamic evaluation, when running a model on an unseen file, the model is
updated immediately after having observed each token in that file. In this
work, we propose instead to frame the problem of context adaptation as a
meta-learning problem. We aim to train a base source code model that is best
able to learn from information in a file to deliver improved predictions of
missing tokens. Unlike dynamic evaluation, this formulation allows us to select
more targeted information (support tokens) for adaptation, drawn from both before
and after a target hole in a file. We consider an evaluation setting that we
call line-level maintenance, designed to reflect the downstream task of code
auto-completion in an IDE. Leveraging recent developments in meta-learning such
as first-order MAML and Reptile, we demonstrate improved performance in
experiments on a large scale Java GitHub corpus, compared to other adaptation
baselines including dynamic evaluation. Moreover, our analysis shows that,
compared to a non-adaptive baseline, our approach improves performance on
identifiers and literals by 44% and 15%, respectively.
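To make the adaptation procedure concrete, the following is a minimal sketch of the inner loop described above: copy the meta-learned base model, take a few gradient steps on support tokens drawn from both sides of the target hole, and score the hole with the adapted weights. It assumes a generic PyTorch autoregressive model whose forward pass maps a (batch, seq) tensor of token ids to (batch, seq, vocab) logits; the window size, step count, optimizer, and learning rate are illustrative placeholders rather than the paper's settings.

```python
import copy

import torch
import torch.nn.functional as F


def adapt_and_score(model, file_tokens, hole_start, hole_end,
                    support_window=64, inner_steps=4, inner_lr=1e-3):
    """Adapt a copy of the base model on support tokens taken from both
    sides of a target hole, then return logits for the first hole token."""
    # Support tokens: a window before the hole and a window after it.
    support = [
        file_tokens[max(0, hole_start - support_window):hole_start],
        file_tokens[hole_end:hole_end + support_window],
    ]

    adapted = copy.deepcopy(model)  # leave the meta-learned base weights intact
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)

    for _ in range(inner_steps):    # short, first-order inner loop
        for seq in support:
            if len(seq) < 2:
                continue
            x = torch.tensor(seq[:-1]).unsqueeze(0)  # inputs
            y = torch.tensor(seq[1:]).unsqueeze(0)   # next-token targets
            loss = F.cross_entropy(adapted(x).transpose(1, 2), y)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Score the hole with the adapted weights, conditioned on the prefix.
    prefix = torch.tensor(file_tokens[:hole_start]).unsqueeze(0)
    with torch.no_grad():
        return adapted(prefix)[0, -1]  # logits for the first hole token
```

At meta-training time, a Reptile-style outer update would then nudge the base parameters toward the adapted ones (theta <- theta + eps * (phi - theta)), so the base model learns to profit from exactly this kind of short inner loop. Dynamic evaluation, by contrast, applies similar gradient updates only at test time, to a model that was never trained to exploit them, and only on tokens that precede the target.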
Related papers
- Adjusting Pretrained Backbones for Performativity [34.390793811659556]
We propose a novel technique to adjust pretrained backbones for performativity in a modular way.
We show how it leads to smaller loss along the retraining trajectory and enables us to effectively select among candidate models to anticipate performance degradations.
arXiv Detail & Related papers (2024-10-06T14:41:13Z)
- Aligning Large Language Models via Fine-grained Supervision [20.35000061196631]
Pre-trained large-scale language models (LLMs) excel at producing coherent articles, yet their outputs may be untruthful, toxic, or fail to align with user expectations.
Current approaches focus on using reinforcement learning with human feedback to improve model alignment.
We propose a method to enhance LLM alignment through fine-grained token-level supervision.
arXiv Detail & Related papers (2024-06-04T20:21:45Z)
- What, How, and When Should Object Detectors Update in Continually Changing Test Domains? [34.13756022890991]
Test-time adaptation algorithms have been proposed to adapt the model online while inferring test data.
We propose a novel online adaptation approach for object detection in continually changing test domains.
Our approach surpasses baselines on widely used benchmarks, achieving improvements of up to 4.9%p and 7.9%p in mAP.
arXiv Detail & Related papers (2023-12-12T07:13:08Z)
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that applying its insights improves the performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We conduct empirical studies of state-of-the-art UniDA (universal domain adaptation) methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
arXiv Detail & Related papers (2023-05-18T16:28:29Z)
- Evaluating Representations with Readout Model Switching [18.475866691786695]
In this paper, we propose to use the Minimum Description Length (MDL) principle to devise an evaluation metric.
We design a hybrid discrete and continuous-valued model space for the readout models and employ a switching strategy to combine their predictions.
The proposed metric can be efficiently computed with an online method and we present results for pre-trained vision encoders of various architectures.
arXiv Detail & Related papers (2023-02-19T14:08:01Z)
- Meta-Learning Fast Weight Language Models [105.66999854213724]
We present Fast Weight Layers (FWLs), a neural component that provides the benefits of dynamic evaluation much more efficiently.
FWLs can be applied at training time so the model learns to make good use of gradient updates.
arXiv Detail & Related papers (2022-12-05T18:37:09Z)
- Efficient Training of Language Models to Fill in the Middle [17.118891860985123]
We show that autoregressive language models can learn to infill text after we apply a straightforward transformation to the dataset (a minimal sketch of this transformation appears after this list).
We use these ablations to prescribe strong default settings and best practices to train FIM models.
We have released our best infilling model trained with best practices in our API, and release our infilling benchmarks to aid future research.
arXiv Detail & Related papers (2022-07-28T17:40:47Z)
- Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation [87.98063273826702]
We propose a memory imitation meta-learning (MemIML) method that enhances the model's reliance on support sets for task adaptation.
A theoretical analysis is provided to prove the effectiveness of our method.
arXiv Detail & Related papers (2022-03-22T12:41:55Z)
- Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, assigning optimal weights to unlabeled queries (a generic prototype-refinement sketch appears after this list).
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
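The fill-in-the-middle entry above describes a dataset transformation simple enough to sketch in a few lines. This is a character-level version with assumed sentinel strings (<PRE>, <SUF>, <MID>) standing in for the dedicated special tokens the paper uses; the split points here are drawn uniformly at random.

```python
import random


def fim_transform(document: str, fim_rate: float = 0.5) -> str:
    """With probability `fim_rate`, rearrange a document into
    prefix/suffix/middle (PSM) order so an autoregressive model learns
    to infill; otherwise return it unchanged."""
    if len(document) < 2 or random.random() >= fim_rate:
        return document
    # Two random split points define (prefix, middle, suffix).
    i, j = sorted(random.sample(range(len(document) + 1), 2))
    prefix, middle, suffix = document[:i], document[i:j], document[j:]
    # The model conditions on prefix and suffix, then generates the middle.
    return f"<PRE>{prefix}<SUF>{suffix}<MID>{middle}"
```

Editor auto-completion maps directly onto this format: the code before the cursor becomes the prefix, the code after it becomes the suffix, and the model generates the middle.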
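For the last entry, here is a generic sketch of transductive prototype refinement with soft confidence weights. The plain softmax confidence and the blending rule below are assumptions for illustration; the paper's contribution is to meta-learn the confidence function instead.

```python
import torch


def refine_prototypes(prototypes, query_feats, steps=1):
    """Fold unlabeled query examples back into the class prototypes,
    weighting each query by its soft confidence under the current
    prototypes. prototypes: (C, D); query_feats: (Q, D)."""
    for _ in range(steps):
        dists = torch.cdist(query_feats, prototypes)   # (Q, C) distances
        conf = torch.softmax(-dists, dim=1)            # (Q, C) confidences
        weighted = conf.t() @ query_feats              # (C, D) weighted sums
        mass = conf.sum(dim=0, keepdim=True).t()       # (C, 1) total weight
        # Blend each original prototype (unit weight) with its queries.
        prototypes = (prototypes + weighted) / (1.0 + mass)
    return prototypes
```

Each refined prototype is thus a confidence-weighted mean of the queries, anchored by the original support-set prototype.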
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.