Fugu-MT Paper Translation (Summary): Pitfalls in Experiments with DNN4SE: An Analysis of the State of the Practice

Paper summary: Pitfalls in Experiments with DNN4SE: An Analysis of the State of the Practice

arxiv url: http://arxiv.org/abs/2305.11556v1
Date: Fri, 19 May 2023 09:55:48 GMT
Status: Translation complete
Last updated in system: 2023-10-24 08:14:34.187327
Title: Pitfalls in Experiments with DNN4SE: An Analysis of the State of the Practice
Title (reference translation): Pitfalls in DNN4SE Experiments: An Analysis of the State of the Practice
Authors: Sira Vegas, Sebastian Elbaum
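Abstract summary: We conduct a mapping study examining 194 experiments with techniques that rely on deep neural networks, appearing in 55 papers published in premier software engineering venues. The study reveals that most of the experiments, including those that received ACM artifact badges, have fundamental limitations that raise doubts about the reliability of their findings.
Reference score (independently computed attention score): 0.7614628596146599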
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Software engineering techniques are increasingly relying on deep learning approaches to support many software engineering tasks, from bug triaging to code generation. To assess the efficacy of such techniques, researchers typically perform controlled experiments. Conducting these experiments, however, is particularly challenging given the complexity of the space of variables involved, from specialized and intricate architectures and algorithms to a large number of training hyper-parameters and choices of evolving datasets, all compounded by how rapidly machine learning technology is advancing and by the inherent sources of randomness in the training process. In this work we conduct a mapping study, examining 194 experiments with techniques that rely on deep neural networks appearing in 55 papers published in premier software engineering venues, to provide a characterization of the state of the practice, pinpointing the experiments' common trends and pitfalls. Our study reveals that most of the experiments, including those that have received ACM artifact badges, have fundamental limitations that raise doubts about the reliability of their findings. More specifically, we find: weak analyses to determine that there is a true relationship between independent and dependent variables (87% of the experiments); limited control over the space of DNN-relevant variables, which can render a relationship between dependent variables and treatments that may not be causal but rather correlational (100% of the experiments); and a lack of specificity about which DNN variables and values are used in the experiments to define the treatments being applied (86% of the experiments), which makes it unclear whether the techniques designed are the ones being assessed, or how the sources of extraneous variation are controlled. We provide some practical recommendations to address these limitations.
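Abstract (reference translation): Software engineering techniques increasingly rely on deep learning approaches to support many software engineering tasks, from bug triaging to code generation. To assess the efficacy of such techniques, researchers typically perform controlled experiments. Conducting these experiments, however, is particularly challenging given the complexity of the space of variables involved, from specialized and intricate architectures and algorithms to a large number of training hyper-parameters and choices of evolving datasets. In this work we conduct a mapping study, examining 194 experiments with techniques that rely on deep neural networks appearing in 55 papers published in premier software engineering venues, and pinpoint common trends and pitfalls. The results show that most of the experiments, including those that received ACM artifact badges, have fundamental limitations that raise doubts about the reliability of their findings. More specifically, we find: weak analyses to determine that there is a true relationship between independent and dependent variables (87% of the experiments); limited control over the space of DNN-relevant variables, which can render a relationship between dependent variables and treatments that may not be causal but rather correlational (100% of the experiments); and a lack of specificity about which DNN variables and values are used in the experiments to define the treatments being applied (86% of the experiments), which makes it unclear whether the techniques designed are the ones being assessed, or how the sources of extraneous variation are controlled. We provide some practical recommendations to address these limitations.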
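The abstract's first finding concerns weak analyses of whether a treatment truly affects the dependent variable, given the randomness inherent in training. As a minimal sketch of one way to strengthen such an analysis (not a method taken from the paper), the following compares two techniques over several training seeds with a paired sign-flip permutation test. The accuracy values, the function name `paired_permutation_test`, and its parameters are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def paired_permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided paired sign-flip permutation test on the mean difference.

    a[i] and b[i] are assumed to be scores from runs that share the same
    seed and data split, so each pair differs only in the treatment.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    rng = random.Random(seed)  # fixed seed so the test itself is repeatable
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(statistics.fmean(diffs))
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(statistics.fmean(flipped)) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical accuracies, one entry per training seed (illustrative only).
baseline = [0.712, 0.698, 0.705, 0.721, 0.709, 0.715, 0.701, 0.718]
proposed = [0.731, 0.722, 0.719, 0.740, 0.728, 0.735, 0.717, 0.733]

p_value = paired_permutation_test(proposed, baseline)
print(f"mean difference = {statistics.fmean(proposed) - statistics.fmean(baseline):.4f}")
print(f"estimated p-value = {p_value:.4f}")
```

Reporting the per-seed scores together with a test of this kind, rather than a single run per technique, is one way to separate a treatment effect from seed-to-seed variation. The appropriate test and number of runs depend on the experiment's design.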

Related papers

MLXP: A Framework for Conducting Replicable Experiments in Python [63.37350735954699]
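MLXP is an open-source, simple, and lightweight experiment management tool based on Python. It streamlines the experimental process with minimal overhead while ensuring a high level of reproducibility.
Paper reference translation (metadata) (2024-02-21T14:22:20Z)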
Adaptive Instrument Design for Indirect Experiments [48.815194906471405]
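Unlike RCTs, indirect experiments estimate treatment effects by leveraging conditional instrumental variables. This paper takes initial steps toward improving sample efficiency in indirect experiments by adaptively designing the data collection policy. The main contribution is a practical computational procedure that uses influence functions to search for an optimal data collection policy.
Paper reference translation (metadata) (2023-12-05T02:38:04Z)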
Machine learning enabled experimental design and parameter estimation for ultrafast spin dynamics [54.172707311728885]
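We propose a methodology that combines machine learning with Bayesian optimal experimental design (BOED). The method employs a neural network model for large-scale spin dynamics simulations to perform precise distribution and utility calculations in BOED. Numerical benchmarks demonstrate the superior performance of the method in guiding XPFS experiments, predicting model parameters, and yielding more informative measurements within the experimental time.
Paper reference translation (metadata) (2023-06-03T06:19:20Z)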
Experts in the Loop: Conditional Variable Selection for Accelerating Post-Silicon Analysis Based on Deep Learning [6.6357750579293935]
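Post-silicon validation is one of the most critical processes in semiconductor manufacturing. This work aims to design a novel conditional variable selection approach while keeping experts in the loop.
Paper reference translation (metadata) (2022-09-30T06:12:12Z)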
Lessons Learned from Data-Driven Building Control Experiments: Contrasting Gaussian Process-based MPC, Bilevel DeePC, and Deep Reinforcement Learning [0.0]
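This manuscript offers an experimentalist's perspective on a number of modern data-driven techniques. They are compared in terms of data requirements, ease of use, computational burden, and robustness in the context of real-world applications.
Paper reference translation (metadata) (2022-05-31T11:40:22Z)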
Do Deep Neural Networks Always Perform Better When Eating More Data? [82.6459747000664]
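We design experiments under independent and identically distributed (IID) and out-of-distribution (OOD) conditions. Under the IID condition, the amount of information determines the effectiveness of each sample, while the contribution of samples and the differences between classes determine the amount of class information. Under the OOD condition, the cross-domain degree of samples determines their contribution, and bias-fitting caused by irrelevant elements is a significant factor in cross-domain settings.
Paper reference translation (metadata) (2022-05-30T15:40:33Z)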
Reinforcement Learning based Sequential Batch-sampling for Bayesian Optimal Experimental Design [1.6249267147413522]
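Sequential design of experiments (SDOE) has recently become popular as an approach that yields promising results. This work extends the SDOE strategy to query the experiment or computer code at a batch of inputs. A unique capability of the proposed methodology is that it can be applied to multiple tasks.
Paper reference translation (metadata) (2021-12-21T02:25:23Z)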
Constrained multi-objective optimization of process design parameters in settings with scarce data: an application to adhesive bonding [48.7576911714538]
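Finding the optimal process parameters for an adhesive bonding process is challenging. Traditional evolutionary approaches such as genetic algorithms are ill-suited to this problem. In this work, specific machine learning techniques were successfully applied to emulate the objective and constraint functions.
Paper reference translation (metadata) (2021-12-16T10:14:39Z)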
SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data [83.50281440043241]
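We study the problem of estimating heterogeneous treatment effects from time-to-event data. We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
Paper reference translation (metadata) (2021-10-26T20:13:17Z)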
Autonomous Materials Discovery Driven by Gaussian Process Regression with Inhomogeneous Measurement Noise and Anisotropic Kernels [1.976226676686868]
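Most experimental disciplines face the challenge of exploring large, high-dimensional parameter spaces in search of new scientific discoveries. Recent advances have increasingly automated the exploration process and improved the efficiency of materials discovery. Gaussian process regression (GPR) techniques have emerged as a way to steer many classes of experiments.
Paper reference translation (metadata) (2020-06-03T19:18:47Z)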

The related papers list is generated automatically from the titles and abstracts of papers on this site.

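The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from the use of this site (including all information and translations).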