Related papers: Integrating Code Metrics into Automated Documentation Generation for Computational Notebooks

URL: http://arxiv.org/abs/2602.08133v1
Date: Sun, 08 Feb 2026 21:40:57 GMT
Title: Integrating Code Metrics into Automated Documentation Generation for Computational Notebooks
Authors: Mojtaba Mostafavi Ghahfarokhi, Hamed Jahantigh, Alireza Asadi, Abbas Heydarnoori
Abstract summary: This paper investigates the role of source code metrics as auxiliary signals for automated documentation generation. It focuses on computational notebooks, a popular medium among data scientists that integrates code, narrative, and results but suffers from inconsistent documentation. Results show that incorporating code metrics improves the accuracy and contextual relevance of generated documentation.
Score: 0.18665975431697424
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Effective code documentation is essential for collaboration, comprehension, and long-term software maintainability, yet developers often neglect it due to its repetitive nature. Automated documentation generation has evolved from heuristic and rule-based methods to neural network-based and large language model (LLM)-based approaches. However, existing methods often overlook structural and quantitative characteristics of code that influence readability and comprehension. Prior research suggests that code metrics capture information relevant to program understanding. Building on these insights, this paper investigates the role of source code metrics as auxiliary signals for automated documentation generation, focusing on computational notebooks, a popular medium among data scientists that integrates code, narrative, and results but suffers from inconsistent documentation. We propose a two-stage approach. First, the CodeSearchNet dataset construction process was refined to create a specialized dataset from over 17 million code and markdown cells. After structural and semantic filtering, approximately 36,734 high-quality (code, markdown) pairs were extracted. Second, two modeling paradigms, a lightweight CNN-RNN architecture and a few-shot GPT-3.5 architecture, were evaluated with and without metric information. Results show that incorporating code metrics improves the accuracy and contextual relevance of generated documentation, yielding gains of 6% in BLEU-1 and 3% in ROUGE-L F1 for CNN-RNN-based architecture, and 9% in BERTScore F1 for LLM-based architecture. These findings demonstrate that integrating code metrics provides valuable structural context, enhancing automated documentation generation across diverse model families.
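
As a rough illustration of the idea in the abstract, the sketch below computes a few simple metrics for a notebook code cell and prepends them to a generation prompt. This is not the authors' pipeline: the abstract does not say which metrics were used or how they were fed to the models, so the metric set, the `cell_metrics` and `build_prompt` helpers, and the prompt layout are all assumptions made for illustration.

```python
import ast


def cell_metrics(source: str) -> dict:
    """Compute a few illustrative metrics for one code cell (hypothetical metric set)."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    # Node types counted as decision points for a rough branching estimate.
    branch_types = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)
    return {
        # Non-blank source lines.
        "loc": sum(1 for line in source.splitlines() if line.strip()),
        # Function definitions in the cell.
        "functions": sum(isinstance(n, ast.FunctionDef) for n in nodes),
        # Call expressions, a crude proxy for API usage.
        "calls": sum(isinstance(n, ast.Call) for n in nodes),
        # Decision points plus one, similar in spirit to cyclomatic complexity.
        "branching": sum(isinstance(n, branch_types) for n in nodes) + 1,
    }


def build_prompt(source: str, metrics: dict) -> str:
    """Prepend the metrics to the code as an auxiliary signal (assumed prompt layout)."""
    header = ", ".join(f"{name}={value}" for name, value in metrics.items())
    return (
        f"# metrics: {header}\n"
        f"{source}\n"
        "# Write a markdown description of this cell:"
    )
```

A prompt built this way could be passed to a few-shot LLM, or the metric values could be concatenated with learned code features in a smaller model; either choice is a design assumption here, not something the abstract specifies.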

Related papers

SpecMap: Hierarchical LLM Agent for Datasheet-to-Code Traceability Link Recovery in Systems Engineering [8.235446273226277]
Traceability between embedded systems and their corresponding code implementations is a fundamental challenge in systems engineering. Existing Traceability Link Recovery approaches rely on lexical similarity and information retrieval techniques. We present a hierarchical-to-code mapping methodology that employs large language models for semantic analysis.
arXiv Detail & Related papers (2026-01-16T11:50:18Z)
UniRec-0.1B: Unified Text and Formula Recognition with 0.1B Parameters [55.34921520578968]
Vision-language models (VLMs) have achieved impressive unified recognition of text and formulas. We propose UniRec-0.1B, a unified recognition model with only 0.1B parameters. It is capable of performing text and formula recognition at multiple levels, including characters, words, lines, paragraphs, and documents.
arXiv Detail & Related papers (2025-12-24T10:35:21Z)
Cross-modal Retrieval Models for Stripped Binary Analysis [62.89251403093734]
BinSeek is the first two-stage cross-modal retrieval framework for stripped binary code analysis. It consists of two models: BinSeek-Embedding is trained on a large-scale dataset to learn the semantic relevance of the binary code. BinSeek-Reranker learns to carefully judge the relevance of the candidate code to the description with context augmentation.
arXiv Detail & Related papers (2025-12-11T07:58:10Z)
CodeWiki: Evaluating AI's Ability to Generate Holistic Documentation for Large-Scale Codebases [7.75137961900221]
We present CodeWiki, a unified framework for automated repository-level documentation across seven programming languages. CodeWiki introduces three key innovations: (i) hierarchical decomposition that preserves architectural context across multiple levels of granularity, (ii) recursive multi-agent processing with dynamic task delegation for scalable generation, and (iii) multi-modal synthesis that integrates textual descriptions with visual artifacts such as architecture diagrams and data-flow representations. CodeWiki achieves a 68.79% quality score with proprietary models, outperforming the closed-source DeepWiki baseline (64.06%) by 4.73%.
arXiv Detail & Related papers (2025-10-28T13:52:46Z)
Evaluating the Use of LLMs for Documentation to Code Traceability [3.076436880934678]
Large Language Models can establish trace links between various software documentation and source code. We create two novel datasets from two open-source projects (Unity Catalog and Crawl4AI). Results show that the best-performing LLM achieves F1-scores of 79.4% and 80.4% across the two datasets.
arXiv Detail & Related papers (2025-06-19T16:18:53Z)
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [70.04746094652653]
We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories. PaperCoder operates in three stages; in the planning stage, it designs the system architecture with diagrams, identifies file dependencies, and generates configuration files. We then evaluate PaperCoder on generating code implementations from machine learning papers based on both model-based and human evaluations.
arXiv Detail & Related papers (2025-04-24T01:57:01Z)
Contextualized Data-Wrangling Code Generation in Computational Notebooks [131.26365849822932]
We propose an automated approach, CoCoMine, to mine data-wrangling code generation examples with clear multi-modal contextual dependency. We construct CoCoNote, a dataset containing 58,221 examples for Contextualized Data-wrangling Code generation in Notebooks. Experiment results demonstrate the significance of incorporating data context in data-wrangling code generation.
arXiv Detail & Related papers (2024-09-20T14:49:51Z)
Using Large Language Models to Enrich the Documentation of Datasets for Machine Learning [1.8270184406083445]
We explore using large language models (LLM) and prompting strategies to automatically extract dimensions from documents. Our approach could aid data publishers and practitioners in creating machine-readable documentation. We have released an open-source tool implementing our approach and a replication package, including the experiments' code and results.
arXiv Detail & Related papers (2024-04-04T10:09:28Z)
Leveraging Generative AI: Improving Software Metadata Classification with Generated Code-Comment Pairs [0.0]
In software development, code comments play a crucial role in enhancing code comprehension and collaboration. This research paper addresses the challenge of objectively classifying code comments as "Useful" or "Not Useful". We propose a novel solution that harnesses contextualized embeddings, particularly BERT, to automate this classification process.
arXiv Detail & Related papers (2023-10-14T12:09:43Z)
Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de-facto standard for generating answers. Previous work has explored ways to partition the search space into hierarchical structures. In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers.
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
Improved Code Summarization via a Graph Neural Network [96.03715569092523]
In general, source code summarization techniques use the source code as input and output a natural language description. We present an approach that uses a graph-based neural architecture that better matches the default structure of the AST to generate these summaries.
arXiv Detail & Related papers (2020-04-06T17:36:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.