OptTyper: Probabilistic Type Inference by Optimising Logical and Natural
Constraints
- URL: http://arxiv.org/abs/2004.00348v3
- Date: Fri, 26 Mar 2021 19:17:53 GMT
- Authors: Irene Vlassi Pandi, Earl T. Barr, Andrew D. Gordon, and Charles Sutton
- Abstract summary: We introduce a framework for probabilistic type inference that combines logic and learning.
We build a tool called OptTyper to predict missing types for TypeScript files.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new approach to the type inference problem for dynamic
languages. Our goal is to combine logical constraints, that is,
deterministic information from a type system, with natural constraints,
that is, uncertain statistical information about types learnt from sources like
identifier names. To this end, we introduce a framework for probabilistic type
inference that combines logic and learning: logical constraints on the types
are extracted from the program, and deep learning is applied to predict types
from surface-level code properties that are statistically associated. The
foremost insight of our method is to constrain the predictions from the
learning procedure to respect the logical constraints, which we achieve by
relaxing the logical inference problem of type prediction into a continuous
optimisation problem. We build a tool called OptTyper to predict missing types
for TypeScript files. OptTyper combines a continuous interpretation of logical
constraints derived by classical static analysis of TypeScript code, with
natural constraints obtained from a deep learning model, which learns naming
conventions for types from a large codebase. By evaluating OptTyper, we show
that the combination of logical and natural constraints yields a large
improvement in performance over either kind of information individually and
achieves a 4% improvement over the state-of-the-art.
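The core idea in the abstract, relaxing a logical type constraint into a continuous objective and optimising it together with learned predictions, can be illustrated with a toy sketch. This is not OptTyper's actual formulation: the variable names `count` and `total`, the probabilities in `NATURAL`, the equality constraint, the squared-distance natural term, the `weight` value, and the finite-difference gradient descent are all assumptions chosen for illustration.

```python
import math

TYPES = ["number", "string", "boolean"]

# Hypothetical "natural" predictions, standing in for the output of a model
# that reads identifier names. The numbers are made up for this sketch.
NATURAL = {
    "count": [0.80, 0.15, 0.05],
    "total": [0.40, 0.45, 0.15],
}


def softmax(logits):
    """Map unconstrained logits to a probability distribution over TYPES."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def agreement(q):
    """Relaxed truth value of the assumed logical constraint
    type(count) == type(total): the probability that both variables
    receive the same type if drawn independently from q."""
    return sum(a * b for a, b in zip(q["count"], q["total"]))


def loss(logits, weight=5.0):
    q = {v: softmax(z) for v, z in logits.items()}
    # Natural term: stay close to the learned predictions.
    natural = sum((a - b) ** 2 for v in q for a, b in zip(q[v], NATURAL[v]))
    # Logical term: penalise violation of the relaxed constraint.
    logical = 1.0 - agreement(q)
    return natural + weight * logical


def optimise(steps=500, lr=0.5, eps=1e-5):
    """Plain gradient descent on the logits, using central finite
    differences so the sketch needs no autodiff library."""
    logits = {v: [math.log(p) for p in probs] for v, probs in NATURAL.items()}
    for _ in range(steps):
        grads = {v: [0.0] * len(TYPES) for v in logits}
        for v in logits:
            for i in range(len(TYPES)):
                old = logits[v][i]
                logits[v][i] = old + eps
                up = loss(logits)
                logits[v][i] = old - eps
                down = loss(logits)
                logits[v][i] = old
                grads[v][i] = (up - down) / (2 * eps)
        for v in logits:
            for i in range(len(TYPES)):
                logits[v][i] -= lr * grads[v][i]
    return {v: softmax(z) for v, z in logits.items()}


def best_type(dist):
    """Return the most probable type name under dist."""
    return TYPES[max(range(len(TYPES)), key=lambda i: dist[i])]


if __name__ == "__main__":
    joint = optimise()
    for v in NATURAL:
        print(v, "independent:", best_type(NATURAL[v]), "joint:", best_type(joint[v]))
```

Taken independently, the made-up natural predictions pick `number` for `count` and `string` for `total`, which violates the assumed equality constraint. Optimising the combined objective moves both distributions toward a single shared type, which is the behaviour the abstract describes as constraining learned predictions to respect logical constraints.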
Related papers
- Learning Optimal Signal Temporal Logic Decision Trees for Classification: A Max-Flow MILP Formulation [5.924780594614676]
This paper presents a novel framework for inferring timed temporal logic properties from data.
We formulate the inference process as a mixed integer linear programming optimization problem.
Applying a max-flow algorithm on the resultant tree transforms the problem into a global optimization challenge.
We conduct three case studies involving two-class, multi-class, and complex formula classification scenarios.
arXiv Detail & Related papers (2024-07-30T16:56:21Z)
- Learning Type Inference for Enhanced Dataflow Analysis [6.999203506253375]
We propose CodeTIDAL5, a Transformer-based model trained to reliably predict type annotations.
Our model outperforms the current state-of-the-art by 7.85% on the ManyTypes4TypeScript benchmark.
We present JoernTI, an integration of our approach into Joern, an open source static analysis tool.
arXiv Detail & Related papers (2023-10-01T13:52:28Z)
- Generative Type Inference for Python [62.01560866916557]
This paper introduces TypeGen, a few-shot generative type inference approach that incorporates static domain knowledge from static analysis.
TypeGen creates chain-of-thought (COT) prompts by translating the type inference steps of static analysis into prompts based on the type dependency graphs (TDGs).
Experiments show that TypeGen outperforms the best baseline Type4Py by 10.0% for argument type prediction and 22.5% in return value type prediction in terms of top-1 Exact Match.
arXiv Detail & Related papers (2023-07-18T11:40:31Z)
- TypeT5: Seq2seq Type Inference using Static Analysis [51.153089609654174]
We present a new type inference method that treats type prediction as a code infilling task.
Our method uses static analysis to construct dynamic contexts for each code element whose type signature is to be predicted by the model.
We also propose an iterative decoding scheme that incorporates previous type predictions in the model's input context.
arXiv Detail & Related papers (2023-03-16T23:48:00Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Query and Extract: Refining Event Extraction as Type-oriented Binary Decoding [51.57864297948228]
We propose a novel event extraction framework that takes event types and argument roles as natural language queries.
Our framework benefits from the attention mechanisms to better capture the semantic correlation between the event types or argument roles and the input text.
arXiv Detail & Related papers (2021-10-14T15:49:40Z)
- LambdaNet: Probabilistic Type Inference using Graph Neural Networks [46.66093127573704]
This paper proposes a probabilistic type inference scheme for TypeScript based on a graph neural network.
Our approach can predict both standard types, like number or string, as well as user-defined types that have not been encountered during training.
arXiv Detail & Related papers (2020-04-29T17:48:40Z)
- Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label.
Most existing methods are elaborately designed as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data.
This paper proposes a novel classifier framework that is flexible in the choice of model and optimization algorithm.
arXiv Detail & Related papers (2020-02-19T08:35:15Z)