RENAS: Prioritizing Co-Renaming Opportunities of Identifiers
- URL: http://arxiv.org/abs/2408.09716v2
- Date: Tue, 20 Aug 2024 15:08:10 GMT
- Title: RENAS: Prioritizing Co-Renaming Opportunities of Identifiers
- Authors: Naoki Doi, Yuki Osumi, Shinpei Hayashi
- Abstract summary: This study introduces a technique called RENAS, which identifies and recommends related identifiers that should be renamed simultaneously in Java applications.
RENAS determines priority scores for renaming candidates based on the relationships and similarities among identifiers.
RENAS demonstrated an improvement in the F1-measure by more than 0.11 compared with existing renaming recommendation approaches.
- Score: 1.1688548469063846
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Renaming identifiers in source code is a common refactoring task in software development. When renaming an identifier, other identifiers containing words with the same naming intention related to the renaming should be renamed simultaneously. However, identifying these related identifiers can be challenging. This study introduces a technique called RENAS, which identifies and recommends related identifiers that should be renamed simultaneously in Java applications. RENAS determines priority scores for renaming candidates based on the relationships and similarities among identifiers. Since identifiers that have a relationship and/or have similar vocabulary in the source code are often renamed together, their priority scores are determined based on these factors. Identifiers with higher priority are recommended to be renamed together. Through an evaluation involving real renaming instances extracted from change histories and validated manually, RENAS demonstrated an improvement in the F1-measure by more than 0.11 compared with existing renaming recommendation approaches.
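The scoring idea in the abstract can be illustrated with a minimal sketch. This is not the RENAS implementation: the word splitter, the Jaccard similarity, the `hops` relationship distance, the `1 / (1 + hops)` closeness, and the `alpha` weight are all assumptions chosen for illustration. It only shows how a relationship signal and a vocabulary-similarity signal could be combined into one priority score for candidates that share the renamed word.

```python
import re


def split_words(identifier):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def jaccard(words_a, words_b):
    """Vocabulary similarity: shared words divided by all distinct words."""
    a, b = set(words_a), set(words_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def priority(renamed, candidate, hops, alpha=0.5):
    """Hypothetical priority: weighted mix of similarity and closeness.

    `hops` is an assumed relationship distance between the two identifiers
    (for example, 1 for a field and its accessor); it is not taken from
    the paper.
    """
    similarity = jaccard(split_words(renamed), split_words(candidate))
    closeness = 1.0 / (1 + hops)
    return alpha * similarity + (1 - alpha) * closeness


def rank_candidates(renamed, old_word, candidates, alpha=0.5):
    """Rank identifiers that contain `old_word`, highest priority first.

    `candidates` maps each identifier name to its assumed hop distance.
    """
    scored = [
        (priority(renamed, name, hops, alpha), name)
        for name, hops in candidates.items()
        if old_word in split_words(name)
    ]
    # Sort by descending score, breaking ties alphabetically.
    return [name for _, name in sorted(scored, key=lambda t: (-t[0], t[1]))]


if __name__ == "__main__":
    # Suppose the field `userName` is renamed so that "user" is replaced.
    candidates = {
        "getUserName": 1,
        "userId": 2,
        "orderTotal": 1,
        "userNameLabel": 3,
    }
    print(rank_candidates("userName", "user", candidates))
    # prints ['getUserName', 'userNameLabel', 'userId']
```

In this sketch, `orderTotal` is dropped because it does not contain the renamed word, and `getUserName` ranks first because it is both closely related and lexically similar to the renamed identifier.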
Related papers
- Identifier Name Similarities: An Exploratory Study [3.7420775485568294]
We present our preliminary findings on the occurrence of identifier name similarity in software projects. We envision our initial taxonomy providing researchers with a platform to analyze and evaluate the impact of identifier name similarity on code comprehension, maintainability, and collaboration among developers.
arXiv Detail & Related papers (2025-07-24T04:13:26Z)
- On the Structure and Semantics of Identifier Names Containing Closed Syntactic Category Words [19.94735883254009]
This paper investigates the linguistic structure of identifier names by extending the concept of grammar patterns. The specific focus is on closed syntactic categories, which are rarely studied in software engineering. The relationship between closed-category grammar patterns and program behavior is then analyzed using grounded-theory-inspired coding, statistical, and pattern analysis.
arXiv Detail & Related papers (2025-05-24T00:58:50Z)
- Reproducing, Extending, and Analyzing Naming Experiments [0.23456696459191312]
A recent study on how developers choose names collected the names given by different developers for the same objects.
This enabled a study of these names' diversity and structure, and the construction of a model of how names are created.
We reproduce different parts of this study in three independent experiments.
arXiv Detail & Related papers (2024-02-15T15:39:54Z)
- RefBERT: A Two-Stage Pre-trained Framework for Automatic Rename Refactoring [57.8069006460087]
We study automatic rename on variable names, which is considered more challenging than other rename activities.
We propose RefBERT, a two-stage pre-trained framework for rename on variable names.
We show that the generated variable names of RefBERT are more accurate and meaningful than those produced by the existing method.
arXiv Detail & Related papers (2023-05-28T12:29:39Z)
- Multiview Identifiers Enhanced Generative Retrieval [78.38443356800848]
Generative retrieval generates identifier strings of passages as the retrieval target.
We propose a new type of identifier, synthetic identifiers, that are generated based on the content of a passage.
Our proposed approach performs the best in generative retrieval, demonstrating its effectiveness and robustness.
arXiv Detail & Related papers (2023-05-26T06:50:21Z)
- Disambiguation of Company names via Deep Recurrent Networks [101.90357454833845]
We propose a Siamese LSTM Network approach to extract -- via supervised learning -- an embedding of company name strings.
We analyse how an Active Learning approach to prioritise the samples to be labelled leads to a more efficient overall learning pipeline.
arXiv Detail & Related papers (2023-03-07T15:07:57Z)
- Refining Pseudo Labels with Clustering Consensus over Generations for Unsupervised Object Re-identification [84.72303377833732]
Unsupervised object re-identification aims to learn discriminative representations for object retrieval without any annotations.
We propose to estimate pseudo label similarities between consecutive training generations with clustering consensus and refine pseudo labels with temporally propagated and ensembled pseudo labels.
The proposed pseudo label refinery strategy is simple yet effective and can be seamlessly integrated into existing clustering-based unsupervised re-identification methods.
arXiv Detail & Related papers (2021-06-11T02:42:42Z)
- Re-identification = Retrieval + Verification: Back to Essence and Forward with a New Metric [88.96593495602923]
We propose Genuine Open-set re-ID Metric (GOM) as a new re-identification metric.
GOM balances the effect of performing retrieval and verification into a single unified metric.
GOM aligns closely with human visual evaluation of re-ID performance.
arXiv Detail & Related papers (2020-11-23T16:11:19Z)
- OCoR: An Overlapping-Aware Code Retriever [15.531119719750807]
Given a natural language description, code retrieval aims to search for the most relevant code among a set of code.
Existing state-of-the-art approaches apply neural networks to code retrieval.
We propose a novel neural architecture named OCoR, where we introduce two specifically-designed components to capture overlaps.
arXiv Detail & Related papers (2020-08-12T09:43:35Z)
- Interpretability Analysis for Named Entity Recognition to Understand System Predictions and How They Can Improve [49.878051587667244]
We examine the performance of several variants of LSTM-CRF architectures for named entity recognition.
We find that context representations do contribute to system performance, but that the main factor driving high performance is learning the name tokens themselves.
We enlist human annotators to evaluate the feasibility of inferring entity types from the context alone and find that, while people are not able to infer the entity type either for the majority of the errors made by the context-only system, there is some room for improvement.
arXiv Detail & Related papers (2020-04-09T14:37:12Z)