Type4Py: Deep Similarity Learning-Based Type Inference for Python
- URL: http://arxiv.org/abs/2101.04470v1
- Date: Tue, 12 Jan 2021 13:32:53 GMT
- Title: Type4Py: Deep Similarity Learning-Based Type Inference for Python
- Authors: Amir M. Mir, Evaldas Latoskinas, Sebastian Proksch, Georgios Gousios
- Abstract summary: We present Type4Py, a deep similarity learning-based type inference model for Python.
We design a hierarchical neural network model that learns to discriminate between types of the same kind and dissimilar types in a high-dimensional space.
Considering the Top-1 prediction, Type4Py obtains 19.33% and 13.49% higher precision than Typilus and TypeWriter, respectively.
- Score: 9.956021565144662
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Dynamic languages, such as Python and JavaScript, trade static typing for
developer flexibility. While this allegedly enables greater productivity, the lack
of static typing can cause runtime exceptions and type inconsistencies, and is a
major factor in weak IDE support. To alleviate these issues, PEP 484
introduced optional type annotations for Python. As retrofitting types to
existing codebases is error-prone and laborious, learning-based approaches have
been proposed to enable automatic type annotations based on existing, partially
annotated codebases. However, the prediction of rare and user-defined types is
still challenging. In this paper, we present Type4Py, a deep similarity
learning-based type inference model for Python. We design a hierarchical neural
network model that learns to discriminate between types of the same kind and
dissimilar types in a high-dimensional space, which results in clusters of
types. Nearest neighbor search suggests likely type signatures of given Python
functions. The types visible to analyzed modules are surfaced using lightweight
dependency analysis. The results of quantitative and qualitative evaluation
indicate that Type4Py significantly outperforms state-of-the-art approaches at
the type prediction task. Considering the Top-1 prediction, Type4Py obtains
19.33% and 13.49% higher precision than Typilus and TypeWriter, respectively,
while utilizing a much bigger vocabulary.
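To make the similarity-learning idea concrete, here is a minimal sketch of triplet-based type embedding followed by nearest-neighbor lookup. The bag-of-tokens features, encoder architecture, and hyperparameters are illustrative assumptions, not the authors' implementation (the paper's hierarchical network and feature extraction differ):
```python
# Hypothetical sketch of similarity-based type inference, not Type4Py's code.
import torch
import torch.nn as nn

class TypeEncoder(nn.Module):
    """Maps a bag-of-tokens vector for a function (names, docstring,
    usage features) into a high-dimensional type-cluster space."""
    def __init__(self, vocab_size=10000, dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, 512), nn.ReLU(), nn.Linear(512, dim))

    def forward(self, x):
        return nn.functional.normalize(self.net(x), dim=-1)

encoder = TypeEncoder()
loss_fn = nn.TripletMarginLoss(margin=1.0)

# anchor/positive share a type; negative has a different type (random stand-ins).
anchor, positive, negative = (torch.rand(32, 10000) for _ in range(3))
loss = loss_fn(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()  # training pulls same-type examples together, pushes others apart

# At inference time, the type of a query is suggested by its nearest
# neighbors among embeddings of already-annotated functions.
with torch.no_grad():
    bank = encoder(torch.rand(1000, 10000))           # embeddings of known examples
    bank_types = ["int"] * 500 + ["List[str]"] * 500  # their annotated types
    query = encoder(torch.rand(1, 10000))
    nearest = torch.cdist(query, bank).argmin(dim=-1)
print(bank_types[nearest.item()])                     # Top-1 type suggestion
```
Because prediction is a lookup in the embedding space rather than a softmax over a fixed class list, the vocabulary of predictable types grows with the bank of annotated examples, which is how rare and user-defined types remain reachable.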
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is their ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- Generative Type Inference for Python [62.01560866916557]
This paper introduces TypeGen, a few-shot generative type inference approach that incorporates static domain knowledge from static analysis.
TypeGen creates chain-of-thought (COT) prompts by translating the type inference steps of static analysis into prompts based on type dependency graphs (TDGs).
Experiments show that TypeGen outperforms the best baseline, Type4Py, by 10.0% for argument type prediction and 22.5% for return value type prediction in terms of top-1 Exact Match.
arXiv Detail & Related papers (2023-07-18T11:40:31Z)
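As a rough illustration of prompts built from type dependency graphs, the toy helper below assembles a chain-of-thought prompt from a hand-written TDG; the template and the `cot_prompt` helper are hypothetical, not TypeGen's actual format:
```python
# Illustrative only: a toy chain-of-thought prompt assembled from a
# type dependency graph (TDG); TypeGen's real templates may differ.
def cot_prompt(target, tdg):
    """tdg maps a variable to the expressions its type depends on."""
    steps = [f"- `{var}` depends on {', '.join(deps)}"
             for var, deps in tdg.items()]
    return (
        "Infer the type of the target step by step.\n"
        "Type dependencies:\n" + "\n".join(steps) +
        f"\nTherefore, the type of `{target}` is:"
    )

tdg = {
    "n": ["len(items)"],      # n = len(items)   -> int
    "ratio": ["n", "total"],  # ratio = n / total -> float
}
print(cot_prompt("ratio", tdg))
```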
- Type Prediction With Program Decomposition and Fill-in-the-Type Training [2.7998963147546143]
We build OpenTau, a search-based approach for type prediction that leverages large language models.
We evaluate our work with a new dataset for TypeScript type prediction, and show that 47.4% of files type check (14.5% absolute improvement) with an overall rate of 3.3 type errors per file.
arXiv Detail & Related papers (2023-05-25T21:16:09Z)
- TypeT5: Seq2seq Type Inference using Static Analysis [51.153089609654174]
We present a new type inference method that treats type prediction as a code infilling task.
Our method uses static analysis to construct dynamic contexts for each code element whose type signature is to be predicted by the model.
We also propose an iterative decoding scheme that incorporates previous type predictions in the model's input context.
arXiv Detail & Related papers (2023-03-16T23:48:00Z)
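A minimal sketch of the iterative decoding scheme described above, with a stand-in `predict_type` in place of the seq2seq model; the loop structure is illustrative, not TypeT5's exact algorithm:
```python
# Sketch of iterative decoding with feedback of earlier predictions
# (the model call below is a stand-in, not TypeT5's API).
def predict_type(element, context):
    """Stand-in for a seq2seq infilling model call."""
    return "int"  # placeholder prediction

def iterative_decode(elements, rounds=2):
    predictions = {}
    for _ in range(rounds):
        for elem in elements:
            # The context includes types predicted so far, so later
            # rounds can revise early guesses given their neighbors.
            context = {e: t for e, t in predictions.items() if e != elem}
            predictions[elem] = predict_type(elem, context)
    return predictions

print(iterative_decode(["x", "y", "parse"]))
```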
- Modeling Label Correlations for Ultra-Fine Entity Typing with Neural Pairwise Conditional Random Field [47.22366788848256]
We use an undirected graphical model called pairwise conditional random field (PCRF) to formulate the UFET problem.
We use various modern backbones for entity typing to compute unary potentials and derive pairwise potentials from type phrase representations.
We use mean-field variational inference for efficient type inference on very large type sets and unfold it as a neural network module to enable end-to-end training.
arXiv Detail & Related papers (2022-12-03T09:49:15Z)
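The unfolded mean-field inference can be pictured as a small fixed-point loop over per-type marginals; the sketch below uses made-up unary and pairwise potentials and is a generic mean-field update, not the paper's exact module:
```python
# Toy mean-field update for a pairwise CRF over a multi-label type set
# (shapes and potentials are invented for illustration).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

T = 5                                   # number of candidate types
unary = np.random.randn(T)              # per-type scores from the backbone
pairwise = np.random.randn(T, T) * 0.1
pairwise = (pairwise + pairwise.T) / 2  # symmetric type-type compatibility
np.fill_diagonal(pairwise, 0.0)

q = sigmoid(unary)                      # marginal of each type being assigned
for _ in range(10):                     # unrolled fixed-point iterations; the
    q = sigmoid(unary + pairwise @ q)   # paper unfolds this loop as a NN module
print(np.round(q, 3))
```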
- SIGTYP 2020 Shared Task: Prediction of Typological Features [78.95376120154083]
A major drawback hampering broader adoption of typological KBs is that they are sparsely populated.
As typological features often correlate with one another, it is possible to predict them and thus automatically populate typological KBs.
Overall, the task attracted 8 submissions from 5 teams, out of which the most successful methods make use of such feature correlations.
arXiv Detail & Related papers (2020-10-16T08:47:24Z)
- LambdaNet: Probabilistic Type Inference using Graph Neural Networks [46.66093127573704]
This paper proposes a probabilistic type inference scheme for TypeScript based on a graph neural network.
Our approach can predict both standard types, like number or string, and user-defined types that have not been encountered during training.
arXiv Detail & Related papers (2020-04-29T17:48:40Z)
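For intuition, a single generic message-passing step over a type dependency graph looks roughly like the sketch below; the node features, edge list, and update rule are assumptions, not LambdaNet's architecture:
```python
# Sketch of one round of message passing on a type dependency graph
# (a generic GNN step, not LambdaNet's exact model).
import numpy as np

nodes = np.random.randn(4, 16)            # embeddings for 4 program variables
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # constraints linking the variables
W = np.random.randn(16, 16) * 0.1         # shared message transform

msgs = np.zeros_like(nodes)
for src, dst in edges:
    msgs[dst] += nodes[src] @ W           # aggregate neighbor messages
nodes = np.tanh(nodes + msgs)             # update node states

# A prediction head would compare the final node states against embeddings
# of both library types and user-defined type nodes in the same graph.
print(nodes.shape)
```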
- Typilus: Neural Type Hints [17.332608142043004]
We present a graph neural network model that predicts types by probabilistically reasoning over a program's structure, names, and patterns.
Our model can employ one-shot learning to predict an open vocabulary of types, including rare and user-defined ones.
We show that Typilus confidently predicts types for 70% of all annotatable symbols.
arXiv Detail & Related papers (2020-04-06T11:14:03Z)