Fugu-MT Paper Translation (Summary): Forecasting the Maintained Score from the OpenSSF Scorecard for GitHub Repositories linked to PyPI libraries


arxiv url: http://arxiv.org/abs/2601.18344v1
Date: Mon, 26 Jan 2026 10:32:54 GMT
Status: Translation complete
Last updated in system: 2026-01-27 15:23:08.779947
Title: Forecasting the Maintained Score from the OpenSSF Scorecard for GitHub Repositories linked to PyPI libraries
Authors: Alexandros Tsakpinis, Efe Berk Ergülec, Emil Schwenger, Alexander Pretschner
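Abstract summary: We study to what extent the OpenSSF Maintained score can be forecast. We analyze 3,220 GitHub repositories associated with the top 1% most central PyPI libraries by PageRank. The results suggest that future maintenance activity can be predicted with meaningful accuracy.
Reference score (independently computed attention metric): 78.48200143057376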
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The OpenSSF Scorecard is widely used to assess the security posture of open-source software repositories, with the Maintained metric indicating recent development activity and helping identify potentially abandoned dependencies. However, this metric is inherently retrospective, reflecting only the past 90 days of activity and providing no insight into future maintenance, which limits its usefulness for proactive risk assessment. In this paper, we study to what extent future maintenance activity, as captured by the OpenSSF Maintained score, can be forecasted. We analyze 3,220 GitHub repositories associated with the top 1% most central PyPI libraries by PageRank and reconstruct historical Maintained scores over a three-year period. We formulate the task as multivariate time series forecasting and consider four target representations: raw scores, bucketed maintenance levels, numerical trend slopes, and categorical trend types. We compare a statistical model (VARMA), a machine learning model (Random Forest), and a deep learning model (LSTM) across training windows of 3-12 months and forecasting horizons of 1-6 months. Our results show that future maintenance activity can be predicted with meaningful accuracy, particularly for aggregated representations such as bucketed scores and trend types, achieving accuracies above 0.95 and 0.80, respectively. Simpler statistical and machine learning models perform on par with deep learning approaches, indicating that complex architectures are not required. These findings suggest that predictive modeling can effectively complement existing Scorecard metrics, enabling more proactive assessment of open-source maintenance risks.
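The four target representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bucket edges, the slope definition (a least-squares fit over the last observed value plus the forecast horizon), the trend tolerance, and all function names are assumptions chosen for illustration, and the sketch uses the score alone rather than the multivariate inputs the paper describes.

```python
import numpy as np


def bucket(score, edges=(3.0, 7.0)):
    """Map a 0-10 score to a coarse level 0, 1 or 2.

    The cut points are illustrative assumptions, not values from the paper.
    """
    return int(np.digitize(score, edges))


def trend_slope(values):
    """Least-squares slope of the values against their time index."""
    x = np.arange(len(values), dtype=float)
    return float(np.polyfit(x, values, 1)[0])


def trend_type(slope, tol=0.1):
    """Turn a numerical slope into a categorical trend (tolerance is assumed)."""
    if slope > tol:
        return "increasing"
    if slope < -tol:
        return "decreasing"
    return "stable"


def make_example(series, start, window, horizon):
    """Cut one (training window, forecasting horizon) pair out of a monthly series."""
    series = np.asarray(series, dtype=float)
    end = start + window + horizon
    if start < 0 or window < 1 or horizon < 1 or end > len(series):
        raise ValueError("window and horizon do not fit inside the series")
    history = series[start:start + window]
    future = series[start + window:end]
    # Anchor the slope at the last observed value so a 1-month horizon
    # still yields two points for the fit.
    slope = trend_slope(np.concatenate([history[-1:], future]))
    return {
        "history": history,            # model input
        "raw": future,                 # target 1: raw scores
        "bucket": bucket(future[-1]),  # target 2: bucketed maintenance level
        "slope": slope,                # target 3: numerical trend slope
        "trend": trend_type(slope),    # target 4: categorical trend type
    }


# Hypothetical monthly Maintained scores for a repository that is winding down.
monthly_scores = [10, 10, 9, 8, 6, 4, 2, 0, 0]
ex = make_example(monthly_scores, start=0, window=3, horizon=3)
```

Sliding `start` across each repository's history, and varying `window` over 3-12 and `horizon` over 1-6, would produce the kind of training grid the abstract describes. The aggregated targets (`bucket`, `trend`) discard small month-to-month fluctuations, which is one plausible reason they are reported as easier to predict than raw scores.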

