Fugu-MT paper translation (summary): Build It Clean: Large-Scale Detection of Code Smells in Build Scripts


arxiv url: http://arxiv.org/abs/2506.17948v1
Date: Sun, 22 Jun 2025 09:01:42 GMT
Title: Build It Clean: Large-Scale Detection of Code Smells in Build Scripts
Authors: Mahzabin Tamanna, Yash Chandrani, Matthew Burrows, Brandon Wroblewski, Laurie Williams, Dominik Wermke
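Abstract summary: Insecure URLs were the most prevalent code smell in Maven build scripts, while Hardcoded Paths/URLs were commonly observed in both Gradle and CMake scripts. Based on our findings, we recommend strategies to mitigate code smells in build scripts to improve the efficiency, reliability, and maintainability of software projects.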
Reference score (site-computed attention score): 7.553917924788079
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Build scripts are files that automate the process of compiling source code, managing dependencies, running tests, and packaging software into deployable artifacts. These scripts are ubiquitous in modern software development pipelines for streamlining testing and delivery. While developing build scripts, practitioners may inadvertently introduce code smells. Code smells are recurring patterns of poor coding practices that may lead to build failures or increase risk and technical debt. The goal of this study is to aid practitioners in avoiding code smells in build scripts through an empirical study of build scripts and issues on GitHub. We employed a mixed-methods approach, combining qualitative and quantitative analysis. We conducted a qualitative analysis of 2000 build-script-related GitHub issues. Next, we developed a static analysis tool, Sniffer, to identify code smells in 5882 build scripts of Maven, Gradle, CMake, and Make files, collected from 4877 open-source GitHub repositories. We identified 13 code smell categories, with a total of 10,895 smell occurrences, where 3184 were in Maven, 1214 in Gradle, 337 in CMake, and 6160 in Makefiles. Our analysis revealed that Insecure URLs were the most prevalent code smell in Maven build scripts, while Hardcoded Paths/URLs were commonly observed in both Gradle and CMake scripts. Wildcard Usage emerged as the most frequent smell in Makefiles. The co-occurrence analysis revealed strong associations between specific smell pairs of Hardcoded Paths/URLs with Duplicates, and Inconsistent Dependency Management with Empty or Incomplete Tags, indicating potential underlying issues in the build script structure and maintenance practices. Based on our findings, we recommend strategies to mitigate the existence of code smells in build scripts to improve the efficiency, reliability, and maintainability of software projects.
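The abstract names several smell categories that lend themselves to pattern-based static checks. The following is a minimal Python sketch of that general idea, covering three of the named categories (Insecure URLs, Hardcoded Paths, Wildcard Usage). It is not the Sniffer tool: the `RULES` table, the regular expressions, the function names, and the sample Makefile are all hypothetical and chosen only for illustration; the detection rules actually used in the paper are not given in the abstract.

```python
import re
from collections import Counter

# Hypothetical detection rules for illustration only.
# The rules used by the paper's tool (Sniffer) are not described in the abstract.
RULES = {
    # Plain-HTTP URLs (https:// does not match).
    "Insecure URL": re.compile(r'http://[^\s"<>)]+'),
    # Absolute paths under a few common Unix roots, not preceded by
    # characters that would make them part of a URL or a variable.
    "Hardcoded Path": re.compile(r"(?<![\w:/.$}])/(?:usr|home|opt|etc|tmp)/[\w./-]+"),
    # GNU Make's $(wildcard ...) function.
    "Wildcard Usage": re.compile(r"\$\(wildcard\s+[^)]*\)"),
}


def find_smells(script_text):
    """Return (line_number, smell_name, matched_text) tuples for one script."""
    findings = []
    for line_number, line in enumerate(script_text.splitlines(), start=1):
        for smell_name, pattern in RULES.items():
            for match in pattern.finditer(line):
                findings.append((line_number, smell_name, match.group(0)))
    return findings


def count_smells(script_text):
    """Count occurrences per smell category for one script."""
    return Counter(name for _, name, _ in find_smells(script_text))


# Hypothetical Makefile fragment containing one instance of each smell.
SAMPLE_MAKEFILE = """\
SRC := $(wildcard src/*.c)
INCLUDE := /usr/local/include/mylib
fetch:
\tcurl -O http://example.com/dep.tar.gz
"""

if __name__ == "__main__":
    for finding in find_smells(SAMPLE_MAKEFILE):
        print(finding)
    print(count_smells(SAMPLE_MAKEFILE))
```

Line-by-line regular expressions are easy to audit but ignore file structure; a checker for Maven's XML or Gradle's DSL would likely need a parser to detect categories such as Empty or Incomplete Tags.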

Related papers

SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving [90.32201622392137]
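We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs). Unlike traditional static benchmarks, SwingArena models the collaborative process of software development by pairing LLMs as submitters, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines.
Reference translation (metadata) (2025-05-29T18:28:02Z)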
Attestable builds: compiling verifiable binaries on untrusted systems using trusted execution environments [3.207381224848367]
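Attestable builds provide strong source-to-binary correspondence for software artifacts. We tackle the challenge of opaque build pipelines that break the trust between source code and the final binary artifact.
Reference translation (metadata) (2025-05-05T10:00:04Z)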
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [57.09163579304332]
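We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories. PaperCoder operates in three stages, including planning, in which it designs the system architecture with diagrams, identifies file dependencies, and generates configuration files. We then evaluate PaperCoder on generating code implementations from machine learning papers, using both model-based and human evaluations.
Reference translation (metadata) (2025-04-24T01:57:01Z)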
CLOVER: A Test Case Generation Benchmark with Coverage, Long-Context, and Verification [71.34070740261072]
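We propose CLOVER, a benchmark for evaluating models' capabilities in generating and completing test cases. The benchmark is containerized for code execution across tasks.
Reference translation (metadata) (2025-02-12T21:42:56Z)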
An Empirical Study of Dotfiles Repositories Containing User-Specific Configuration Files [1.7556600627464058]
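Hundreds of thousands of developers publicly host their dotfiles repositories on GitHub. We collected and analyzed dotfiles repositories publicly hosted on GitHub. We found that 25.8% of the top 500 GitHub users maintain a publicly accessible dotfiles repository in some form.
Reference translation (metadata) (2025-01-30T18:32:46Z)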
Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
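We analyzed 2,401 code review comments from Java open-source projects on GitHub. Of the improvement suggestions, 83.9% were accepted and integrated, and fewer than 1% were later reverted.
Reference translation (metadata) (2024-10-29T12:21:23Z)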
MoCo: Fuzzing Deep Learning Libraries via Assembling Code [13.937180393991616]
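Deep learning (DL) techniques have been applied in software systems across a variety of application scenarios. DL libraries serve as the foundation of DL systems, and bugs in them can have unpredictable impacts. We propose MoCo, a fuzz-testing method for DL libraries that works by assembling code.
Reference translation (metadata) (2024-05-13T13:40:55Z)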
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
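RepoCoder is a framework that streamlines the repository-level code completion process. It incorporates a similarity-based retriever and a pre-trained code language model. It consistently outperforms the vanilla retrieval-augmented code completion approach.
Reference translation (metadata) (2023-03-22T13:54:46Z)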
CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context [82.88371379927112]
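We propose a framework that incorporates cross-file context to jointly learn in-file and cross-file context on top of pretrained code LMs. CoCoMIC improves an existing code LM by 33.94% on exact match and by 28.69% on identifier matching for code completion when cross-file context is provided.
Reference translation (metadata) (2022-12-20T05:48:09Z)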
Empirical Analysis on Effectiveness of NLP Methods for Predicting Code Smell [3.2973778921083357]
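Code smells are surface-level indicators of deeper problems inherent in a system. We identify eight code smells using three Extreme Learning Machine kernels over 629 packages. Our results suggest that the radial basis function kernel performs best among the three kernel methods, with a mean accuracy of 98.52.
Reference translation (metadata) (2021-08-08T12:10:20Z)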
The Prevalence of Code Smells in Machine Learning projects [9.722159563454436]
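Static code analysis can be used to find potential defects, refactoring opportunities, and violations of common coding standards in source code. We gathered a dataset of 74 open-source projects, installed their dependencies, and ran Pylint on them. This resulted in a top 20 of all detected code smells.
Reference translation (metadata) (2021-03-06T16:01:54Z)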
