Fugu-MT paper translation (abstract): On the utility of feature selection in building two-tier decision trees

Paper summary: On the utility of feature selection in building two-tier decision trees

arxiv url: http://arxiv.org/abs/2212.14448v1
Date: Thu, 29 Dec 2022 20:10:45 GMT
Status: translation complete
Last updated in system: 2023-01-02 15:38:50.289387
Title: On the utility of feature selection in building two-tier decision trees
Title (reference translation): On the utility of feature selection in building two-tier decision trees
Authors: Sergey A. Saltykov
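Abstract summary: The paper shows that, in building two-tier decision trees, the synergistic effect of complementary features can be disrupted by another feature. Removing the interfering feature can improve performance by up to 24 times. The paper concludes that this broadens the scope of feature selection methods to cases where data and computational resources are sufficient.
Reference score (independently computed attention score): 0.0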
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Nowadays, feature selection is frequently used in machine learning when there is a risk of performance degradation due to overfitting or when computational resources are limited. During the feature selection process, the subset of features that are most relevant and least redundant is chosen. In recent years, it has become clear that, in addition to relevance and redundancy, features' complementarity must be considered. Informally, if the features are weak predictors of the target variable separately and strong predictors when combined, then they are complementary. It is demonstrated in this paper that the synergistic effect of complementary features mutually amplifying each other in the construction of two-tier decision trees can be interfered with by another feature, resulting in a decrease in performance. It is demonstrated using cross-validation on both synthetic and real datasets, regression and classification, that removing or eliminating the interfering feature can improve performance by up to 24 times. It has also been discovered that the lesser the domain is learned, the greater the increase in performance. More formally, it is demonstrated that there is a statistically significant negative rank correlation between performance on the dataset prior to the elimination of the interfering feature and performance growth after the elimination of the interfering feature. It is concluded that this broadens the scope of feature selection methods for cases where data and computational resources are sufficient.
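Abstract (reference translation): Feature selection is now widely used in machine learning when overfitting threatens to degrade performance or when computational resources are limited. The feature selection process chooses the subset of features that is most relevant and least redundant. In recent years it has become clear that the complementarity of features must be considered in addition to relevance and redundancy. Informally, features are complementary if each is a weak predictor of the target variable on its own and they are strong predictors when combined. This paper shows that, in building two-tier decision trees, the synergistic effect of complementary features amplifying one another can be disrupted by another feature, which lowers performance. Cross-validation on synthetic and real datasets, for both regression and classification, demonstrates that removing the interfering feature can improve performance by up to 24 times. It is also found that the less well the domain is learned, the greater the performance gain. More formally, there is a statistically significant negative rank correlation between performance on a dataset before the interfering feature is removed and the growth in performance after it is removed. The paper concludes that this broadens the scope of feature selection methods to cases where data and computational resources are sufficient.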
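
The interference effect described in the abstract can be illustrated with a small toy example. The sketch below is not the paper's experiment. It assumes a greedy, Gini-based tree with exactly two tiers of splits (the abstract does not specify the induction algorithm), a hand-built dataset, and accuracy measured on the training data rather than by cross-validation. The features `a` and `b` are complementary (the target is `a XOR b`), and the hypothetical feature `c` agrees with the target in 70% of the rows. All function and feature names are illustrative.

```python
from collections import Counter
from itertools import product


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gain(rows, labels, feature):
    """Decrease in Gini impurity from splitting on a binary feature."""
    n = len(labels)
    gain = gini(labels)
    for value in (0, 1):
        subset = [y for x, y in zip(rows, labels) if x[feature] == value]
        gain -= len(subset) / n * gini(subset)
    return gain


def best_feature(rows, labels, features):
    # max() returns the first maximal element, so ties go to the earliest feature.
    return max(features, key=lambda f: split_gain(rows, labels, f))


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_two_tier(rows, labels, features):
    """Greedy tree with one root split and one split in each child."""
    root = best_feature(rows, labels, features)
    tree = {"feature": root, "children": {}}
    for value in (0, 1):
        sub = [(x, y) for x, y in zip(rows, labels) if x[root] == value]
        sub_rows = [x for x, _ in sub]
        sub_labels = [y for _, y in sub]
        second = best_feature(sub_rows, sub_labels, features)
        leaves = {}
        for v2 in (0, 1):
            leaf = [y for x, y in sub if x[second] == v2]
            # An empty leaf falls back to the majority label of its parent.
            leaves[v2] = majority(leaf) if leaf else majority(sub_labels)
        tree["children"][value] = {"feature": second, "leaves": leaves}
    return tree


def predict(tree, x):
    node = tree["children"][x[tree["feature"]]]
    return node["leaves"][x[node["feature"]]]


def accuracy(tree, rows, labels):
    hits = sum(predict(tree, x) == y for x, y in zip(rows, labels))
    return hits / len(labels)


def make_data():
    """40 rows: y = a XOR b; c equals y in 7 of every 10 rows."""
    rows, labels = [], []
    for a, b in product((0, 1), repeat=2):
        y = a ^ b
        for k in range(10):
            c = y if k < 7 else 1 - y
            rows.append({"a": a, "b": b, "c": c})
            labels.append(y)
    return rows, labels


rows, labels = make_data()
acc_with = accuracy(fit_two_tier(rows, labels, ["a", "b", "c"]), rows, labels)
acc_without = accuracy(fit_two_tier(rows, labels, ["a", "b"]), rows, labels)
print(acc_with, acc_without)  # prints: 0.7 1.0
```

In this toy setting `a` and `b` each have zero impurity gain at the root, so the greedy learner takes `c` first. The one remaining tier then has room for only one of the two complementary features, which is uninformative on its own, and training accuracy stays at 0.7. With `c` removed, the root split on `a` has zero gain but lets `b` separate the classes perfectly at the second tier.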

Related papers

Less is More: Efficient Black-box Attribution via Minimal Interpretable Subset Selection [52.716143424856185]
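This paper proposes LiMA (Less input is more faithful for Attribution), which reformulates the attribution of important regions as a submodular subset selection optimization problem. LiMA identifies both the most and the least important samples while ensuring an optimal attribution boundary that minimizes errors. Attribution efficiency also improves by a factor of 1.6.
Paper reference translation (metadata) (2025-04-01T06:58:15Z)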
Fairness-Aware Streaming Feature Selection with Causal Graphs [10.644488289941021]
Streaming Feature Selection with Causal Fairness (SFCF) builds causal graphs egocentric to the prediction label and the protected feature. SFCF is benchmarked on five datasets widely used in streaming feature research.
Paper reference translation (metadata) (2024-08-17T00:41:02Z)
Causal Feature Selection via Transfer Entropy [59.999594949050596]
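Causal discovery aims to identify causal relationships between features from observational data. This paper proposes a new causal feature selection method that relies on forward and backward feature selection. It provides theoretical guarantees on the regression and classification errors in the exact and finite-sample cases.
Paper reference translation (metadata) (2023-10-17T08:04:45Z)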
Copula for Instance-wise Feature Selection and Ranking [24.09326839818306]
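This paper proposes incorporating the Gaussian copula, a powerful mathematical technique for capturing correlations between variables, into the current feature selection framework. To show that the proposed method can capture meaningful correlations, experiments are conducted on both synthetic and real datasets in terms of performance comparison and interpretability.
Paper reference translation (metadata) (2023-08-01T13:45:04Z)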
Improving Out-of-Distribution Generalization of Neural Rerankers with Contextualized Late Interaction [52.63663547523033]
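Late interaction, the simplest form of multi-vector, also helps neural rerankers that use only the [] vector to compute the similarity score. The result is shown to be consistent across different model sizes and first-stage retrievers of diverse natures.
Paper reference translation (metadata) (2023-02-13T18:42:17Z)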
Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
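This paper proposes a fast unsupervised feature selection method called Compactness Score (CSUFS). The proposed algorithm is more accurate and more efficient than existing algorithms.
Paper reference translation (metadata) (2022-01-31T13:01:37Z)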
Deep Unsupervised Feature Selection by Discarding Nuisance and Correlated Features [7.288137686773523]
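Modern datasets contain large subsets of correlated features and nuisance features. When many nuisance features are present, the Laplacian must be computed on the subset of selected features. To handle correlated features, an autoencoder architecture is used and trained to reconstruct the data from the subset of selected features.
Paper reference translation (metadata) (2021-10-11T14:26:13Z)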
Out-of-distribution Generalization via Partial Feature Decorrelation [72.96261704851683]
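This paper proposes the Partial Feature Decorrelation Learning (PFDL) algorithm, which jointly optimizes a feature decomposition network and the target image classification model. Experiments on real-world datasets show that it improves the accuracy of backbone models on OOD image classification datasets.
Paper reference translation (metadata) (2020-07-30T05:48:48Z)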
Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
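This paper proposes aggregate interaction modules that integrate features from adjacent levels. To obtain more efficient multi-scale features, a self-interaction module is embedded in each decoder unit. Experimental results on five benchmark datasets show that the proposed method, without any post-processing, performs favorably against 23 state-of-the-art methods.
Paper reference translation (metadata) (2020-07-17T15:41:37Z)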
Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
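The paper studies fairness in the integration component of data management. It proposes a method for identifying a sub-collection of features that guarantees the fairness of the dataset.
Paper reference translation (metadata) (2020-06-10T20:20:10Z)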
Multi-Objective Evolutionary approach for the Performance Improvement of Learners using Ensembling Feature selection and Discretization Technique on Medical data [8.121462458089143]
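This paper proposes a novel multi-objective dimensionality reduction framework. It incorporates both discretization and feature reduction as an ensemble model that performs feature selection and discretization.
Paper reference translation (metadata) (2020-04-16T06:32:15Z)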

The related papers list is generated automatically from the titles and abstracts of papers on this site.
