Memorization and Generalization in Neural Code Intelligence Models
- URL: http://arxiv.org/abs/2106.08704v1
- Date: Wed, 16 Jun 2021 11:11:41 GMT
- Title: Memorization and Generalization in Neural Code Intelligence Models
- Authors: Md Rafiqul Islam Rabin, Aftab Hussain, Vincent J. Hellendoorn and
Mohammad Amin Alipour
- Abstract summary: We evaluate the memorization and generalization tendencies in neural code intelligence models through a case study.
Our results shed light on the impact of noisy training data.
- Score: 3.6245424131171813
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are increasingly used in software
engineering and code intelligence tasks. These are powerful tools that are
capable of learning highly generalizable patterns from large datasets through
millions of parameters. At the same time, training DNNs means walking a knife's
edge, because their large capacity also renders them prone to memorizing data
points. While traditionally thought of as an aspect of over-training, recent
work suggests that the memorization risk manifests especially strongly when the
training datasets are noisy and memorization is the only recourse.
Unfortunately, most code intelligence tasks rely on rather noise-prone and
repetitive data sources, such as GitHub, which, due to their sheer size, cannot
be manually inspected and evaluated. We evaluate the memorization and
generalization tendencies in neural code intelligence models through a case
study across several benchmarks and model families by leveraging established
approaches from other fields that use DNNs, such as introducing targeted noise
into the training dataset. In addition to reinforcing prior general findings
about the extent of memorization in DNNs, our results shed light on the impact
of noisy training data.
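The abstract refers to established approaches from other fields, such as introducing targeted noise into the training dataset. As a rough illustration only, the Python sketch below flips the labels of a fraction of examples in a toy method-name-prediction dataset and then measures how often a model reproduces the corrupted labels; the function names, toy data, noise rate, and stubbed predictions are assumptions for illustration, not the paper's exact protocol.

```python
# Hedged sketch: inject targeted label noise into a code-labeling dataset and
# track how often a trained model reproduces the corrupted labels (a memorization probe).
import random

def inject_label_noise(dataset, noise_rate, label_vocab, seed=0):
    """Flip the label of a random fraction of examples to a different label."""
    rng = random.Random(seed)
    noisy, flipped_ids = [], set()
    for idx, (code, label) in enumerate(dataset):
        if rng.random() < noise_rate:
            wrong = rng.choice([l for l in label_vocab if l != label])
            noisy.append((code, wrong))
            flipped_ids.add(idx)
        else:
            noisy.append((code, label))
    return noisy, flipped_ids

def memorization_rate(predictions, noisy_dataset, flipped_ids):
    """Fraction of noised training examples whose corrupted label the model reproduces."""
    hits = sum(1 for i in flipped_ids if predictions[i] == noisy_dataset[i][1])
    return hits / max(1, len(flipped_ids))

# Toy usage with a hypothetical method-name-prediction dataset.
dataset = [("def add(a, b): return a + b", "add"),
           ("def sub(a, b): return a - b", "sub"),
           ("def mul(a, b): return a * b", "mul")]
noisy_data, flipped = inject_label_noise(dataset, noise_rate=0.5,
                                         label_vocab=["add", "sub", "mul"])
# `predictions` would come from a model trained on `noisy_data`; stubbed here as a
# model that perfectly fits (memorizes) its training labels.
predictions = {i: lbl for i, (_, lbl) in enumerate(noisy_data)}
print("memorization rate:", memorization_rate(predictions, noisy_data, flipped))
```

In this kind of probe, a model that keeps high accuracy on the flipped subset is fitting those examples by memorization, since their corrupted labels no longer follow any generalizable pattern.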
Related papers
- Are Sparse Neural Networks Better Hard Sample Learners? [24.2141078613549]
Hard samples play a crucial role in the optimal performance of deep neural networks.
Sparse neural networks (SNNs) trained on challenging samples can often match or surpass dense models in accuracy at certain sparsity levels.
arXiv Detail & Related papers (2024-09-13T21:12:18Z)
- Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data [76.90128359866462]
We introduce an extended concept of memorization, distributional memorization, which measures the correlation between the output probabilities and the pretraining data frequency (a minimal sketch of this kind of measure appears after this list).
This study demonstrates that memorization plays a larger role in simpler, knowledge-intensive tasks, while generalization is the key for harder, reasoning-based tasks.
arXiv Detail & Related papers (2024-07-20T21:24:40Z)
- The Brain's Bitter Lesson: Scaling Speech Decoding With Self-Supervised Learning [3.649801602551928]
We develop a set of neuroscience-inspired self-supervised objectives, together with a neural architecture, for representation learning from heterogeneous recordings.
Results show that representations learned with these objectives scale with data, generalise across subjects, datasets, and tasks, and surpass comparable self-supervised approaches.
arXiv Detail & Related papers (2024-06-06T17:59:09Z)
- Deep Learning for real-time neural decoding of grasp [0.0]
We present a Deep Learning-based approach to the decoding of neural signals for grasp type classification.
The main goal of the presented approach is to improve over state-of-the-art decoding accuracy without relying on any prior neuroscience knowledge.
arXiv Detail & Related papers (2023-11-02T08:26:29Z)
- Learn, Unlearn and Relearn: An Online Learning Paradigm for Deep Neural Networks [12.525959293825318]
We introduce Learn, Unlearn, and Relearn (LURE), an online learning paradigm for deep neural networks (DNNs).
LURE interchanges between the unlearning phase, which selectively forgets the undesirable information in the model, and the relearning phase, which emphasizes learning on generalizable features.
We show that our training paradigm provides consistent performance gains across datasets in both classification and few-shot settings.
arXiv Detail & Related papers (2023-03-18T16:45:54Z)
- Measures of Information Reflect Memorization Patterns [53.71420125627608]
We show that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization.
Importantly, we discover that information organization points to the two forms of memorization, even for neural activations computed on unlabelled in-distribution examples.
arXiv Detail & Related papers (2022-10-17T20:15:24Z)
- Transfer Learning with Deep Tabular Models [66.67017691983182]
We show that upstream data gives tabular neural networks a decisive advantage over GBDT models.
We propose a realistic medical diagnosis benchmark for tabular transfer learning.
We propose a pseudo-feature method for cases where the upstream and downstream feature sets differ.
arXiv Detail & Related papers (2022-06-30T14:24:32Z)
- Embedding Symbolic Temporal Knowledge into Deep Sequential Models [21.45383857094518]
Sequences and time-series often arise in robot tasks, e.g., in activity recognition and imitation learning.
Deep neural networks (DNNs) have emerged as an effective data-driven methodology for processing sequences given sufficient training data and compute resources.
We construct semantic-based embeddings of automata generated from formulae via a Graph Neural Network. Experiments show that these learnt embeddings can lead to improvements in downstream robot tasks such as sequential action recognition and imitation learning.
arXiv Detail & Related papers (2021-01-28T13:17:46Z)
- Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement with full data learning.
To make full use of the training data, we propose a full data learning method for speech enhancement.
arXiv Detail & Related papers (2020-11-11T06:32:37Z)
- Temporal Calibrated Regularization for Robust Noisy Label Learning [60.90967240168525]
Deep neural networks (DNNs) exhibit great success on many tasks with the help of large-scale well annotated datasets.
However, labeling large-scale data can be very costly and error-prone so that it is difficult to guarantee the annotation quality.
We propose a Temporal Calibrated Regularization (TCR) in which we utilize the original labels and the predictions in the previous epoch together.
arXiv Detail & Related papers (2020-07-01T04:48:49Z)
- Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z)
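One of the related entries above summarizes distributional memorization as a correlation between a model's output probabilities and how frequently the corresponding content appears in the pretraining data. As a minimal sketch of that kind of measure, and not that paper's exact definition, the snippet below computes a plain Spearman rank correlation between hypothetical corpus frequencies and hypothetical model probabilities.

```python
# Hedged sketch: correlate model output probabilities with pretraining-data frequency.
# The corpus counts, probabilities, and the choice of a simple Spearman rank
# correlation are illustrative assumptions, not the cited paper's exact metric.

def _ranks(values):
    """Rank values from 1 (smallest) to n; ties get an arbitrary but stable order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = float(rank)
    return ranks

def spearman_correlation(xs, ys):
    """Pearson correlation of the ranks (simple Spearman, no tie correction)."""
    rx, ry = _ranks(xs), _ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical numbers: how often each answer occurs in the pretraining corpus
# versus the probability the model assigns to it.
pretraining_frequency = [1200, 35, 540, 7, 2600]
model_probability = [0.62, 0.08, 0.41, 0.03, 0.77]
print("distributional memorization score:",
      spearman_correlation(pretraining_frequency, model_probability))
```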