Multi-Intent Detection in User Provided Annotations for Programming by
Examples Systems
- URL: http://arxiv.org/abs/2307.03966v1
- Date: Sat, 8 Jul 2023 12:35:10 GMT
- Title: Multi-Intent Detection in User Provided Annotations for Programming by
Examples Systems
- Authors: Nischal Ashok Kumar, Nitin Gupta, Shanmukha Guttula, Hima Patel
- Abstract summary: Programming by Example (PBE) is a technique that targets automatic inference of a computer program to accomplish a format or string conversion task from user-provided input and output samples.
In this paper, we propose a deep neural network based ambiguity prediction model, which analyzes the input-output strings and maps them to a different set of properties responsible for multiple intents.
- Score: 3.265146857386153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In mapping enterprise applications, data mapping remains a fundamental part
of integration development, but it is time-consuming. An increasing number of
applications lack naming standards, and nested field structures further add
complexity for the integration developers. Once the mapping is done, data
transformation is the next challenge for the users since each application
expects data to be in a certain format. Also, while building an integration flow,
developers need to understand the format of the source and target data fields
and come up with a transformation program that can change data from the source to
the target format. The problem of automatically generating a transformation program
through the program synthesis paradigm from some specifications has been studied
since the early days of Artificial Intelligence (AI). Programming by Example
(PBE) is one such technique that targets automatic inference of a
computer program to accomplish a format or string conversion task from
user-provided input and output samples. To learn the correct intent, a diverse
set of samples from the user is required. However, there is a possibility that
the user fails to provide a diverse set of samples. This can lead to multiple
intents or ambiguity in the input and output samples. Hence, PBE systems can
get confused when generating the program for the correct intent. In this paper, we
propose a deep neural network based ambiguity prediction model, which analyzes
the input-output strings and maps them to a different set of properties
responsible for multiple intents. Users can analyze these properties and
accordingly provide new samples or modify existing samples, which can help
in building a better PBE system for mapping enterprise applications.
Related papers
- Quantitative Assurance and Synthesis of Controllers from Activity
Diagrams [4.419843514606336]
Probabilistic model checking is a widely used formal verification technique to automatically verify qualitative and quantitative properties.
This makes it not accessible for researchers and engineers who may not have the required knowledge.
We propose a comprehensive verification framework for ADs, including a new profile for probability time, quality annotations, a semantics interpretation of ADs in three Markov models, and a set of transformation rules from activity diagrams to the PRISM language.
Most importantly, we developed algorithms for transformation and implemented them in a tool, called QASCAD, using model-based techniques, for fully automated verification.
arXiv Detail & Related papers (2024-02-29T22:40:39Z)
- Modelling Concurrency Bugs Using Machine Learning [0.0]
This project aims to compare both common and recent machine learning approaches.
We define a synthetic dataset that we generate with the scope of simulating real-life (concurrent) programs.
We formulate hypotheses about fundamental limits of various machine learning model types.
arXiv Detail & Related papers (2023-05-08T17:30:24Z)
- PEOPL: Characterizing Privately Encoded Open Datasets with Public Labels [59.66777287810985]
We introduce information-theoretic scores for privacy and utility, which quantify the average performance of an unfaithful user.
We then theoretically characterize primitives in building families of encoding schemes that motivate the use of random deep neural networks.
arXiv Detail & Related papers (2023-03-31T18:03:53Z)
- Dataset Interfaces: Diagnosing Model Failures Using Controllable
Counterfactual Generation [85.13934713535527]
Distribution shift is a major source of failure for machine learning models.
We introduce the notion of a dataset interface: a framework that, given an input dataset and a user-specified shift, returns instances that exhibit the desired shift.
We demonstrate how applying this dataset interface to the ImageNet dataset enables studying model behavior across a diverse array of distribution shifts.
arXiv Detail & Related papers (2023-02-15T18:56:26Z)
- EGG-GAE: scalable graph neural networks for tabular data imputation [8.775728170359024]
We propose a novel EdGe Generation Graph AutoEncoder (EGG-GAE) for missing data imputation.
EGG-GAE works on randomly sampled mini-batches of the input data, and it automatically infers the best connectivity across the mini-batch for each architecture layer.
arXiv Detail & Related papers (2022-10-19T10:26:17Z)
- Conditional Generation with a Question-Answering Blueprint [84.95981645040281]
We advocate planning as a useful intermediate representation for rendering conditional generation less opaque and more grounded.
We obtain blueprints automatically by exploiting state-of-the-art question generation technology.
We develop Transformer-based models, each varying in how they incorporate the blueprint in the generated output.
arXiv Detail & Related papers (2022-07-01T13:10:19Z)
- BatchFormer: Learning to Explore Sample Relationships for Robust
Representation Learning [93.38239238988719]
We propose to enable deep neural networks with the ability to learn the sample relationships from each mini-batch.
BatchFormer is applied into the batch dimension of each mini-batch to implicitly explore sample relationships during training.
We perform extensive experiments on over ten datasets and the proposed method achieves significant improvements on different data scarcity applications.
arXiv Detail & Related papers (2022-03-03T05:31:33Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person
Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- Visual Neural Decomposition to Explain Multivariate Data Sets [13.117139248511783]
Investigating relationships between variables in multi-dimensional data sets is a common task for data analysts and engineers.
We propose a novel approach to visualize correlations between input variables and a target output variable that scales to hundreds of variables.
arXiv Detail & Related papers (2020-09-11T15:53:37Z)
- Information-theoretic User Interaction: Significant Inputs for Program
Synthesis [11.473616777800318]
We introduce the significant questions problem, and show that it is hard in general.
We develop an information-theoretic greedy approach for solving the problem.
In the context of interactive program synthesis, we use the above result to develop an active program learner.
Our active learner is able to trade off false negatives for false positives and converge in a small number of iterations on a real-world dataset.
arXiv Detail & Related papers (2020-06-22T21:46:40Z)
- Synthetic Datasets for Neural Program Synthesis [66.20924952964117]
We propose a new methodology for controlling and evaluating the bias of synthetic data distributions over both programs and specifications.
We demonstrate, using the Karel DSL and a small Calculator DSL, that training deep networks on these distributions leads to improved cross-distribution generalization performance.
arXiv Detail & Related papers (2019-12-27T21:28:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.