Exploring Large Language Models for Analyzing and Improving Method Names in Scientific Code
- URL: http://arxiv.org/abs/2507.16439v1
- Date: Tue, 22 Jul 2025 10:33:49 GMT
- Title: Exploring Large Language Models for Analyzing and Improving Method Names in Scientific Code
- Authors: Gunnar Larsen, Carol Wong, Anthony Peruma
- Abstract summary: The recent advances in Large Language Models (LLMs) present new opportunities for automating code analysis tasks. Our study evaluates four popular LLMs on their ability to analyze grammatical patterns and suggest improvements for 496 method names extracted from Python-based Jupyter Notebooks.
- Score: 4.385741575933952
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Research scientists increasingly rely on implementing software to support their research. While previous research has examined the impact of identifier names on program comprehension in traditional programming environments, limited work has explored this area in scientific software, especially regarding the quality of method names in the code. The recent advances in Large Language Models (LLMs) present new opportunities for automating code analysis tasks, such as identifier name appraisals and recommendations. Our study evaluates four popular LLMs on their ability to analyze grammatical patterns and suggest improvements for 496 method names extracted from Python-based Jupyter Notebooks. Our findings show that the LLMs are somewhat effective in analyzing these method names and generally follow good naming practices, like starting method names with verbs. However, their inconsistent handling of domain-specific terminology and only moderate agreement with human annotations indicate that automated suggestions require human evaluation. This work provides foundational insights for improving the quality of scientific code through AI automation.
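As a rough, non-LLM illustration of the grammatical-pattern task the study describes, the sketch below splits a Python method name into words and part-of-speech tags them with NLTK. The paper itself prompts LLMs for this analysis, so treat this purely as a classical baseline sketch.

```python
import re

import nltk

# One-time tagger model download (newer NLTK versions name this resource
# "averaged_perceptron_tagger_eng"). This classical tagger is a stand-in
# for the LLM-based analysis the paper actually performs.
nltk.download("averaged_perceptron_tagger", quiet=True)

def split_identifier(name: str) -> list[str]:
    """Split a snake_case or camelCase method name into lowercase words."""
    words = []
    for part in re.split(r"[_\s]+", name):
        words.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", part))
    return [w.lower() for w in words]

def grammar_pattern(name: str) -> list[tuple[str, str]]:
    """Tag each word of a method name with a part-of-speech label."""
    return nltk.pos_tag(split_identifier(name))

# Good method names tend to start with a verb; the first word's tag is the
# signal of interest (tags on isolated identifiers are approximate).
print(grammar_pattern("compute_mean_error"))
```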
Related papers
- MERA Code: A Unified Framework for Evaluating Code Generation Across Tasks [56.34018316319873]
We propose MERA Code, a benchmark for evaluating code generation by the latest LLMs in Russian. This benchmark includes 11 evaluation tasks that span 8 programming languages. We evaluate open LLMs and frontier API models, analyzing their limitations on practical coding tasks in non-English languages.
arXiv Detail & Related papers (2025-07-16T14:31:33Z) - Method Names in Jupyter Notebooks: An Exploratory Study [5.8097100720874355]
We analyze the naming practices found in 691 methods across 384 Jupyter Notebooks. Our findings reveal distinct characteristics of notebook method names, including a preference for conciseness. We envision our findings contributing to developing specialized tools and techniques for evaluating and recommending high-quality names in scientific code.
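A minimal sketch of how such method names might be mined from notebooks, assuming the nbformat library and Python's ast module (the abstract does not publish this exact script):

```python
import ast

import nbformat  # pip install nbformat

def method_names(notebook_path: str) -> list[str]:
    """Collect every function/method name defined in a notebook's code cells."""
    nb = nbformat.read(notebook_path, as_version=4)
    names = []
    for cell in nb.cells:
        if cell.cell_type != "code":
            continue
        try:
            tree = ast.parse(cell.source)
        except SyntaxError:
            continue  # skip cells containing IPython magics or broken code
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                names.append(node.name)
    return names
```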
arXiv Detail & Related papers (2025-04-29T00:38:56Z) - Exploring a Large Language Model for Transforming Taxonomic Data into OWL: Lessons Learned and Implications for Ontology Development [63.74965026095835]
This paper investigates the use of ChatGPT-4 to automate the development of the :Organism module in the Agricultural Product Types Ontology (APTO) for species classification. Our methodology involved leveraging ChatGPT-4 to extract data from the GBIF Backbone API and generate files for further integration in APTO. Two alternative approaches were explored: (1) issuing a series of prompts for ChatGPT-4 to execute tasks via the BrowserOP plugin and (2) directing ChatGPT-4 to design a Python algorithm to perform tasks.
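The abstract mentions pulling records from the GBIF Backbone API; a minimal sketch of one such request, assuming the public v1 species-match endpoint, could look like:

```python
import requests

# Public GBIF Backbone name-matching endpoint (an assumption based on the
# abstract; the paper's actual queries may differ).
GBIF_MATCH = "https://api.gbif.org/v1/species/match"

def match_species(name: str) -> dict:
    """Resolve a scientific name against the GBIF Backbone taxonomy."""
    resp = requests.get(GBIF_MATCH, params={"name": name}, timeout=10)
    resp.raise_for_status()
    return resp.json()

info = match_species("Puma concolor")
print(info.get("scientificName"), info.get("rank"), info.get("status"))
```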
arXiv Detail & Related papers (2025-04-25T19:05:52Z) - Evaluation of the Automated Labeling Method for Taxonomic Nomenclature Through Prompt-Optimized Large Language Model [0.0]
This study evaluates the feasibility of automatic species name labeling using a large language model (LLM). The results indicate that LLM-based classification achieved high accuracy in the Morphology, Geography, and People categories. Future research will focus on improving accuracy through optimized few-shot learning and retrieval-augmented generation techniques.
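A toy illustration of the few-shot setup the abstract alludes to; the category names come from the abstract, while the example epithets and prompt wording are invented:

```python
# Hypothetical few-shot prompt for etymology labeling of species epithets.
FEW_SHOT = """Classify the etymology of each species epithet into one of:
Morphology, Geography, People, Other.

Epithet: longifolia -> Morphology
Epithet: japonica -> Geography
Epithet: darwinii -> People
Epithet: {epithet} ->"""

def build_prompt(epithet: str) -> str:
    """Fill the few-shot template for one epithet to be labeled."""
    return FEW_SHOT.format(epithet=epithet)

print(build_prompt("smithii"))
```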
arXiv Detail & Related papers (2025-03-08T23:11:43Z) - LLM Program Optimization via Retrieval Augmented Search [71.40092732256252]
We propose a blackbox adaptation method called Retrieval Augmented Search (RAS) that performs beam search over candidate optimizations. We show that RAS performs 1.8$\times$ better than prior state-of-the-art blackbox adaptation strategies. We also propose a method called AEGIS for improving interpretability by decomposing training examples into "atomic edits".
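The core loop described here is beam search over LLM-proposed rewrites. A generic skeleton of that loop (not the paper's exact algorithm; `propose` and `score` are placeholder callables that would wrap an LLM and a benchmark run) might be:

```python
import heapq
from typing import Callable

def beam_search(seed: str,
                propose: Callable[[str], list[str]],
                score: Callable[[str], float],
                beam_width: int = 4,
                depth: int = 3) -> str:
    """Beam search over candidate program rewrites.

    `propose` suggests edited programs (in RAS, conditioned on retrieved
    examples); `score` measures the speedup of a candidate. Both are
    assumptions here, not the paper's published interface.
    """
    beam = [seed]
    for _ in range(depth):
        # Expand every program in the beam, keep the originals as fallbacks.
        candidates = [c for prog in beam for c in propose(prog)] + beam
        beam = heapq.nlargest(beam_width, candidates, key=score)
    return max(beam, key=score)
```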
arXiv Detail & Related papers (2025-01-31T06:34:47Z) - Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph [83.90988015005934]
Uncertainty quantification is a key element of machine learning applications. We introduce a novel benchmark that implements a collection of state-of-the-art UQ baselines. We conduct a large-scale empirical investigation of UQ and normalization techniques across eleven tasks, identifying the most effective approaches.
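For orientation, two standard information-based UQ baselines of the kind such benchmarks include can be computed directly from per-token log-probabilities. This sketch is generic and does not show LM-Polygraph's actual API:

```python
import math

def sequence_uncertainty(token_logprobs: list[float]) -> dict[str, float]:
    """Two common sequence-level UQ baselines from per-token log-probs."""
    mean_lp = sum(token_logprobs) / len(token_logprobs)
    return {
        "mean_logprob": mean_lp,           # higher -> more confident
        "perplexity": math.exp(-mean_lp),  # lower  -> more confident
    }

print(sequence_uncertainty([-0.1, -0.5, -0.2]))
```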
arXiv Detail & Related papers (2024-06-21T20:06:31Z) - A Self-enhancement Approach for Domain-specific Chatbot Training via Knowledge Mining and Digest [62.63606958140248]
Large Language Models (LLMs) often encounter challenges when dealing with intricate and knowledge-demanding queries in specific domains.
This paper introduces a novel approach to enhance LLMs by effectively extracting the relevant knowledge from domain-specific textual sources.
We train a knowledge miner, namely LLMiner, which autonomously extracts Question-Answer pairs from relevant documents.
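A minimal sketch of the extraction step, with an invented prompt format and a parser for a "Q:/A:"-formatted output (LLMiner's real interface is not described in the abstract):

```python
import re

def qa_extraction_prompt(passage: str) -> str:
    """Build a prompt asking a miner model to emit Q/A pairs for a passage."""
    return (
        "Read the passage and write question-answer pairs it supports.\n"
        "Format each pair as 'Q: ...' on one line and 'A: ...' on the next.\n\n"
        f"Passage:\n{passage}\n"
    )

def parse_qa_pairs(model_output: str) -> list[tuple[str, str]]:
    """Recover (question, answer) pairs from the miner's formatted output."""
    return re.findall(r"Q:\s*(.+)\nA:\s*(.+)", model_output)

print(parse_qa_pairs("Q: What does LLMiner extract?\nA: Question-Answer pairs."))
```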
arXiv Detail & Related papers (2023-11-17T16:09:10Z) - How are We Detecting Inconsistent Method Names? An Empirical Study from Code Review Perspective [13.585460827586926]
Proper naming of methods can make program code easier to understand, and thus enhance software maintainability.
Much research effort has been invested into building automatic tools that can check for method name inconsistency.
We present an empirical study on how state-of-the-art techniques perform in detecting or recommending consistent and inconsistent method names.
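As a feel for the problem, one crude lexical signal is how many words of a method's name recur in its body; the sketch below is only an illustrative baseline, not any of the studied tools:

```python
import ast

def name_body_overlap(func_src: str) -> float:
    """Fraction of method-name words that also appear in the body's identifiers.

    Assumes `func_src` is a single snake_case function definition.
    """
    func = ast.parse(func_src).body[0]
    name_words = set(func.name.lower().split("_"))
    body_idents = {n.id.lower() for n in ast.walk(func) if isinstance(n, ast.Name)}
    body_words = {w for ident in body_idents for w in ident.split("_")}
    hits = sum(1 for w in name_words if w and w in body_words)
    return hits / max(len(name_words), 1)

# 'data' recurs in the body but 'load' does not -> 0.5
print(name_body_overlap("def load_data(path):\n    data = open(path)\n    return data\n"))
```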
arXiv Detail & Related papers (2023-08-24T10:39:18Z) - How Does Naming Affect LLMs on Code Analysis Tasks? [8.150719423943109]
Large Language Models (LLMs) were proposed for natural language processing (NLP) and have shown promising results as general-purpose language models.
This paper investigates how naming affects LLMs on code analysis tasks.
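A common way to probe naming effects is to strip the meaning out of identifiers and re-run the analysis task. This sketch renames user-defined identifiers to v0, v1, ... with an AST transform; it is an illustrative probe, not necessarily the paper's exact setup:

```python
import ast
import builtins

class RenameIdentifiers(ast.NodeTransformer):
    """Replace user-defined identifiers with v0, v1, ... (function names and
    builtins like sum/len are left intact so the code still runs)."""

    def __init__(self):
        self.mapping: dict[str, str] = {}

    def _rename(self, old: str) -> str:
        if hasattr(builtins, old):
            return old
        return self.mapping.setdefault(old, f"v{len(self.mapping)}")

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = self._rename(node.id)
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self._rename(node.arg)
        return node

src = "def mean(values):\n    total = sum(values)\n    return total / len(values)\n"
print(ast.unparse(RenameIdentifiers().visit(ast.parse(src))))
```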
arXiv Detail & Related papers (2023-07-24T02:38:24Z) - Disambiguation of Company names via Deep Recurrent Networks [101.90357454833845]
We propose a Siamese LSTM network approach to extract, via supervised learning, an embedding of company name strings.
We analyse how an Active Learning approach to prioritising the samples to be labelled leads to a more efficient overall learning pipeline.
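A compact sketch of the Siamese idea, assuming PyTorch and illustrative layer sizes (not the paper's architecture): one shared character-level LSTM encodes both strings, and cosine similarity scores the pair.

```python
import torch
import torch.nn as nn

class SiameseNameEncoder(nn.Module):
    """Shared character-level LSTM that embeds two name strings for matching."""

    def __init__(self, vocab_size: int = 128, embed_dim: int = 32, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)

    def encode(self, char_ids: torch.Tensor) -> torch.Tensor:
        _, (h, _) = self.lstm(self.embed(char_ids))
        return h[-1]  # final hidden state as the string embedding

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # The same weights encode both names; cosine similarity scores the pair.
        return nn.functional.cosine_similarity(self.encode(a), self.encode(b))

def to_ids(name: str) -> torch.Tensor:
    """Map a string to a batch of clamped ASCII character ids."""
    return torch.tensor([[min(ord(c), 127) for c in name.lower()]])

model = SiameseNameEncoder()
print(model(to_ids("Acme Corp."), to_ids("ACME Corporation")).item())
```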
arXiv Detail & Related papers (2023-03-07T15:07:57Z) - Exploiting Method Names to Improve Code Summarization: A Deliberation Multi-Task Learning Approach [5.577102440028882]
We design a novel multi-task learning (MTL) approach for code summarization.
We first introduce the tasks of method name generation and informativeness prediction.
A novel two-pass deliberation mechanism is then incorporated into our MTL architecture to generate more consistent intermediate states.
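A minimal sketch of what a two-pass deliberation step could look like, with `model` as a placeholder for any text-to-text LLM call (the paper's mechanism operates on intermediate decoder states, not raw prompts as shown here):

```python
from typing import Callable

def deliberate_summarize(model: Callable[[str], str], code: str) -> str:
    """Two-pass deliberation: draft a summary, then refine it with the draft in view."""
    draft = model(f"Summarize this method:\n{code}")
    return model(
        f"Summarize this method:\n{code}\n"
        f"First-pass draft: {draft}\n"
        "Polish the draft into a more consistent summary:"
    )
```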
arXiv Detail & Related papers (2021-03-21T17:52:21Z)