Method Names in Jupyter Notebooks: An Exploratory Study
- URL: http://arxiv.org/abs/2504.20330v1
- Date: Tue, 29 Apr 2025 00:38:56 GMT
- Title: Method Names in Jupyter Notebooks: An Exploratory Study
- Authors: Carol Wong, Gunnar Larsen, Rocky Huang, Bonita Sharif, Anthony Peruma
- Abstract summary: We analyze the naming practices found in 691 methods across 384 Jupyter Notebooks. Our findings reveal distinct characteristics of notebook method names, including a preference for conciseness. We envision our findings contributing to developing specialized tools and techniques for evaluating and recommending high-quality names in scientific code.
- Score: 5.8097100720874355
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Method names play an important role in communicating the purpose and behavior of their functionality. Research has shown that high-quality names significantly improve code comprehension and the overall maintainability of software. However, these studies primarily focus on naming practices in traditional software development. There is limited research on naming patterns in Jupyter Notebooks, a popular environment for scientific computing and data analysis. In this exploratory study, we analyze the naming practices found in 691 methods across 384 Jupyter Notebooks, focusing on three key aspects: naming style conventions, grammatical composition, and the use of abbreviations and acronyms. Our findings reveal distinct characteristics of notebook method names, including a preference for conciseness and deviations from traditional naming patterns. We identified 68 unique grammatical patterns, with only 55.57% of methods beginning with a verb. Further analysis revealed that half of the methods with return statements do not start with a verb. We also found that 30.39% of method names contain abbreviations or acronyms, representing mathematical or statistical terms and image processing concepts, among others. We envision our findings contributing to developing specialized tools and techniques for evaluating and recommending high-quality names in scientific code and creating educational resources tailored to the notebook development community.
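The analysis described in the abstract (extracting method names from notebook code cells and inspecting their style and grammatical composition) can be illustrated with a minimal sketch. This is not the authors' tooling: the notebook filename is hypothetical, and the verb list is a simplification of the part-of-speech analysis the study performs.

```python
import ast
import json
import re

def extract_method_names(notebook_path):
    """Collect names of functions/methods defined in a notebook's code cells."""
    with open(notebook_path, encoding="utf-8") as fh:
        nb = json.load(fh)  # .ipynb files are plain JSON
    names = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = "".join(cell.get("source", []))
        try:
            tree = ast.parse(source)
        except SyntaxError:
            continue  # skip cells containing magics or otherwise invalid Python
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                names.append(node.name)
    return names

def split_words(name):
    """Split a method name into words (handles snake_case and camelCase)."""
    words = []
    for part in name.split("_"):
        words.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", part))
    return [w.lower() for w in words if w]

# Illustrative verb list only; an actual study would use a POS tagger.
COMMON_VERBS = {"get", "set", "load", "plot", "compute", "run", "make",
                "build", "train", "fit", "calculate", "convert"}

if __name__ == "__main__":
    for name in extract_method_names("example_notebook.ipynb"):  # hypothetical file
        words = split_words(name)
        starts_with_verb = bool(words) and words[0] in COMMON_VERBS
        print(f"{name}: words={words}, starts_with_verb={starts_with_verb}")
```

A similar pipeline could be extended to flag abbreviations and acronyms (e.g., tokens not found in a dictionary), which is the third aspect the study examines.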
Related papers
- Exploring Large Language Models for Analyzing and Improving Method Names in Scientific Code [4.385741575933952]
The recent advances in Large Language Models (LLMs) present new opportunities for automating code analysis tasks.
Our study evaluates four popular LLMs on their ability to analyze grammatical patterns and suggest improvements for 496 method names extracted from Python-based Jupyter Notebooks.
arXiv Detail & Related papers (2025-07-22T10:33:49Z)
- Recognition of Geometrical Shapes by Dictionary Learning [49.30082271910632]
We present a first approach to make dictionary learning work for shape recognition.
The choice of the underlying optimization method has a significant impact on recognition quality.
Experimental results confirm that dictionary learning may be an interesting method for shape recognition tasks.
arXiv Detail & Related papers (2025-04-15T08:05:16Z)
- A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus [71.77214818319054]
Natural language inference is a proxy for natural language understanding.
There is no publicly available NLI corpus for the Romanian language.
We introduce the first Romanian NLI corpus (RoNLI) comprising 58K training sentence pairs.
arXiv Detail & Related papers (2024-05-20T08:41:15Z)
- Reproducing, Extending, and Analyzing Naming Experiments [0.23456696459191312]
A recent study on how developers choose names collected the names given by different developers for the same objects.
This enabled a study of these names' diversity and structure, and the construction of a model of how names are created.
We reproduce different parts of this study in three independent experiments.
arXiv Detail & Related papers (2024-02-15T15:39:54Z)
- How are We Detecting Inconsistent Method Names? An Empirical Study from Code Review Perspective [13.585460827586926]
Proper naming of methods can make program code easier to understand, and thus enhance software maintainability.
Much research effort has been invested into building automatic tools that can check for method name inconsistency.
We present an empirical study on how state-of-the-art techniques perform in detecting or recommending consistent and inconsistent method names.
arXiv Detail & Related papers (2023-08-24T10:39:18Z)
- Towards Open Vocabulary Learning: A Survey [146.90188069113213]
Deep neural networks have made impressive advancements in various core tasks like segmentation, tracking, and detection.
Recently, open vocabulary settings were proposed due to the rapid progress of vision language pre-training.
This paper provides a thorough review of open vocabulary learning, summarizing and analyzing recent developments in the field.
arXiv Detail & Related papers (2023-06-28T02:33:06Z)
- Disambiguation of Company names via Deep Recurrent Networks [101.90357454833845]
We propose a Siamese LSTM Network approach to extract -- via supervised learning -- an embedding of company name strings.
We analyse how an Active Learning approach to prioritise the samples to be labelled leads to a more efficient overall learning pipeline.
arXiv Detail & Related papers (2023-03-07T15:07:57Z)
- Author Name Disambiguation via Heterogeneous Network Embedding from Structural and Semantic Perspectives [13.266320447769564]
Name ambiguity is common in academic digital libraries, such as multiple authors having the same name.
The proposed method is mainly based on representation learning for heterogeneous networks and clustering.
The semantic representation is generated using NLP tools.
arXiv Detail & Related papers (2022-12-24T11:22:34Z)
- UCPhrase: Unsupervised Context-aware Quality Phrase Tagging [63.86606855524567]
UCPhrase is a novel unsupervised context-aware quality phrase tagger.
We induce high-quality phrase spans as silver labels from consistently co-occurring word sequences.
We show that our design is superior to state-of-the-art pre-trained, unsupervised, and distantly supervised methods.
arXiv Detail & Related papers (2021-05-28T19:44:24Z)
- Exploiting Method Names to Improve Code Summarization: A Deliberation Multi-Task Learning Approach [5.577102440028882]
We design a novel multi-task learning (MTL) approach for code summarization.
We first introduce the tasks of generation and informativeness prediction of method names.
A novel two-pass deliberation mechanism is then incorporated into our MTL architecture to generate more consistent intermediate states.
arXiv Detail & Related papers (2021-03-21T17:52:21Z)
- Accelerating Text Mining Using Domain-Specific Stop Word Lists [57.76576681191192]
We present a novel approach for the automatic extraction of domain-specific words called the hyperplane-based approach.
The hyperplane-based approach can significantly reduce text dimensionality by eliminating irrelevant features.
Results indicate that the hyperplane-based approach can reduce the dimensionality of the corpus by 90% and outperforms mutual information.
arXiv Detail & Related papers (2020-11-18T17:42:32Z)
- How Does That Sound? Multi-Language SpokenName2Vec Algorithm Using Speech Generation and Deep Learning [4.769747792846004]
SpokenName2Vec is a novel and generic approach which addresses the similar name suggestion problem.
The proposed approach was demonstrated on a large-scale dataset consisting of 250,000 forenames.
The performance of the proposed approach was found to be superior to 10 other algorithms evaluated in this study.
arXiv Detail & Related papers (2020-05-24T20:39:00Z)