Data-Driven and SE-assisted AI Model Signal-Awareness Enhancement and
Introspection
- URL: http://arxiv.org/abs/2111.05827v1
- Date: Wed, 10 Nov 2021 17:58:18 GMT
- Title: Data-Driven and SE-assisted AI Model Signal-Awareness Enhancement and
Introspection
- Authors: Sahil Suneja, Yufan Zhuang, Yunhui Zheng, Jim Laredo, Alessandro
Morari
- Abstract summary: We propose a data-driven approach to enhance models' signal-awareness.
We combine the SE concept of code complexity with the AI technique of curriculum learning.
We achieve up to 4.8x improvement in model signal awareness.
- Score: 61.571331422347875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI modeling for source code understanding tasks has been making significant
progress, and is being adopted in production development pipelines. However,
reliability concerns, especially whether the models are actually learning
task-related aspects of source code, are being raised. While recent
model-probing approaches have observed a lack of signal awareness in many
AI-for-code models, i.e. models not capturing task-relevant signals, they do
not offer solutions to rectify this problem. In this paper, we explore
data-driven approaches to enhance models' signal-awareness: 1) we combine the
SE concept of code complexity with the AI technique of curriculum learning; 2)
we incorporate SE assistance into AI models by customizing Delta Debugging to
generate simplified signal-preserving programs, augmenting them to the training
dataset. With our techniques, we achieve up to 4.8x improvement in model signal
awareness. Using the notion of code complexity, we further present a novel
model learning introspection approach from the perspective of the dataset.
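The first of these techniques pairs an SE complexity measure with easy-to-hard training. Below is a minimal sketch of the idea, assuming Python snippets as training samples (the paper targets C code), an AST branch count as a stand-in for the paper's complexity metrics, and a hypothetical `train_one_stage` trainer:

```python
import ast

def branch_complexity(source: str) -> int:
    """Crude stand-in for cyclomatic complexity: 1 + branching constructs."""
    tree = ast.parse(source)
    branches = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp)
    return 1 + sum(isinstance(node, branches) for node in ast.walk(tree))

def curriculum_stages(samples, labels, n_stages=3):
    """Yield cumulative training subsets, ordered simplest-to-most-complex."""
    ranked = sorted(zip(samples, labels), key=lambda pair: branch_complexity(pair[0]))
    step = max(1, len(ranked) // n_stages)
    for i in range(1, n_stages + 1):
        # Each stage re-exposes the simpler examples while adding harder ones.
        yield ranked[: len(ranked) if i == n_stages else step * i]

# for subset in curriculum_stages(snippets, vuln_labels):
#     train_one_stage(model, subset)  # hypothetical trainer, not from the paper
```

The second technique would feed the same data path: the simplified, signal-preserving programs produced by the customized Delta Debugging are appended to the training set, where they presumably rank toward the easy end of such a curriculum.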
Related papers
- Computational Safety for Generative AI: A Signal Processing Perspective [65.268245109828]
Computational safety is a mathematical framework that enables the quantitative assessment, formulation, and study of safety challenges in GenAI.
We show how sensitivity analysis and loss landscape analysis can be used to detect malicious prompts with jailbreak attempts.
We discuss key open research challenges, opportunities, and the essential role of signal processing in computational AI safety.
arXiv Detail & Related papers (2025-02-18T02:26:50Z)
- Toward Neurosymbolic Program Comprehension [46.874490406174644]
We advocate for a Neurosymbolic research direction that combines the strengths of existing DL techniques with traditional symbolic methods.
We present preliminary results for our envisioned approach, aimed at establishing the first Neurosymbolic Program framework.
arXiv Detail & Related papers (2025-02-03T20:38:58Z)
- Next-Gen Software Engineering. Big Models for AI-Augmented Model-Driven Software Engineering [0.0]
The paper provides an overview of the current state of AI-augmented software engineering and develops a corresponding taxonomy, AI4SE.
A vision of AI-assisted Big Models in SE is put forth, with the aim of capitalising on the advantages inherent to both approaches in the context of software development.
arXiv Detail & Related papers (2024-09-26T16:49:57Z)
- A comprehensible analysis of the efficacy of Ensemble Models for Bug Prediction [0.0]
We present a comparison and analysis of the efficacy of two AI-based approaches, namely single AI models and ensemble AI models, for predicting the probability of a Java class being buggy.
Our experimental findings indicate that the ensemble of AI models can outperform the results of applying individual AI models.
arXiv Detail & Related papers (2023-10-18T17:43:54Z)
- AlerTiger: Deep Learning for AI Model Health Monitoring at LinkedIn [4.020770981811131]
AlerTiger helps AI teams across the company monitor their AI models' health.
The system consists of four major steps: model statistics generation, deep-learning-based anomaly detection, anomaly post-processing, and user alerting (a skeleton of this flow is sketched after this list).
arXiv Detail & Related papers (2023-06-03T01:21:58Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data can instead stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- AI Model Utilization Measurements For Finding Class Encoding Patterns [2.702380921892937]
This work addresses the problems of designing utilization measurements of trained artificial intelligence (AI) models.
The problems are motivated by the lack of explainability of AI models in security and safety critical applications.
arXiv Detail & Related papers (2022-12-12T02:18:10Z)
- CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extend the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z)
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
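For the AlerTiger entry above, its four steps map onto a small pipeline skeleton. This is a minimal sketch, assuming hypothetical names throughout and a placeholder z-score test where the paper uses a deep-learning anomaly detector:

```python
from dataclasses import dataclass
from statistics import mean, stdev

@dataclass
class Alert:
    metric: str
    value: float
    score: float

def generate_stats(history: dict[str, list[float]]) -> dict[str, tuple[float, float]]:
    """Step 1: summarize each model-health metric's recent history."""
    return {m: (mean(v), stdev(v)) for m, v in history.items() if len(v) > 1}

def detect_anomalies(stats, current: dict[str, float], threshold: float = 3.0) -> list[Alert]:
    """Step 2: flag deviating metrics (z-score placeholder for the DL detector)."""
    alerts = []
    for metric, value in current.items():
        mu, sigma = stats.get(metric, (value, 0.0))
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            alerts.append(Alert(metric, value, abs(value - mu) / sigma))
    return alerts

def post_process(alerts: list[Alert], min_score: float = 4.0) -> list[Alert]:
    """Step 3: suppress weak alerts before anyone is paged."""
    return [a for a in alerts if a.score >= min_score]

def alert_users(alerts: list[Alert]) -> None:
    """Step 4: notify the owning team (stdout stands in for a real channel)."""
    for a in alerts:
        print(f"[ALERT] {a.metric}={a.value:.3f} (z={a.score:.1f})")

if __name__ == "__main__":
    stats = generate_stats({"auc": [0.91, 0.90, 0.92, 0.91]})
    alert_users(post_process(detect_anomalies(stats, {"auc": 0.55})))
```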