Fugu-MT Paper Translation (Abstract): Understanding and Enhancing Robustness of Concept-based Models

Paper summary: Understanding and Enhancing Robustness of Concept-based Models

arxiv url: http://arxiv.org/abs/2211.16080v1
Date: Tue, 29 Nov 2022 10:43:51 GMT
Status: Translation complete
System update date: 2022-11-30 14:59:51.865666
Title: Understanding and Enhancing Robustness of Concept-based Models
Authors: Sanchit Sinha, Mengdi Huai, Jianhui Sun, Aidong Zhang
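Abstract summary: We study the robustness of concept-based models to adversarial perturbations. We first propose and analyze different malicious attacks to evaluate the security vulnerabilities of concept-based models. We then propose a general adversarial training-based defense mechanism to increase the robustness of these systems.
Reference score (independently computed attention score): 41.20004311158688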
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rising use of deep neural networks for decision making in critical applications such as medical diagnosis and financial analysis has raised concerns regarding their reliability and trustworthiness. As automated systems become more mainstream, it is important that their decisions be transparent, reliable, and understandable by humans for better trust and confidence. To this end, concept-based models such as Concept Bottleneck Models (CBMs) and Self-Explaining Neural Networks (SENN) have been proposed, which constrain the latent space of a model to represent high-level concepts easily understood by domain experts. Although concept-based models promise a good approach to increasing both explainability and reliability, it is yet to be shown whether they are robust and output consistent concepts under systematic perturbations to their inputs. To better understand the performance of concept-based models on curated malicious samples, in this paper we study their robustness to adversarial perturbations, that is, imperceptible changes to the input data crafted by an attacker to fool a well-learned concept-based model. Specifically, we first propose and analyze different malicious attacks to evaluate the security vulnerability of concept-based models. Subsequently, we propose a potential general adversarial training-based defense mechanism to increase the robustness of these systems to the proposed malicious attacks. Extensive experiments on one synthetic and two real-world datasets demonstrate the effectiveness of the proposed attacks and the defense approach.
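To make the two ideas in the abstract concrete (an imperceptible input perturbation that changes a model's concept predictions, and adversarial training as a defense), here is a minimal sketch. It is not the paper's attack or defense: it assumes a toy concept layer that is a single linear map followed by a sigmoid, uses the standard fast gradient sign method (FGSM) as a stand-in attack under an L-infinity budget `eps`, and trains on synthetic data. All function names, shapes, and hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def concept_probs(W, b, x):
    """Toy concept layer: maps inputs to per-concept probabilities."""
    return sigmoid(x @ W + b)

def concept_loss(W, b, x, c):
    """Mean binary cross-entropy between predicted and true concepts."""
    p = np.clip(concept_probs(W, b, x), 1e-7, 1 - 1e-7)
    return float(-np.mean(c * np.log(p) + (1 - c) * np.log(1 - p)))

def fgsm_concept_attack(W, b, x, c, eps):
    """One-step FGSM: move each input by eps in the sign of the loss gradient."""
    p = concept_probs(W, b, x)
    grad_x = (p - c) @ W.T  # gradient of the summed BCE with respect to x
    return x + eps * np.sign(grad_x)

def adversarial_training(x, c, eps, lr=0.5, steps=200, seed=0):
    """Fit the concept layer on FGSM inputs regenerated at every step."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(x.shape[1], c.shape[1]))
    b = np.zeros(c.shape[1])
    for _ in range(steps):
        x_adv = fgsm_concept_attack(W, b, x, c, eps)  # attack the current model
        err = concept_probs(W, b, x_adv) - c          # BCE gradient w.r.t. logits
        W -= lr * x_adv.T @ err / len(x)
        b -= lr * err.mean(axis=0)
    return W, b

# Synthetic data (an assumption): 8 input features, 3 binary concepts.
rng = np.random.default_rng(1)
x = rng.normal(size=(256, 8))
c = (x @ rng.normal(size=(8, 3)) > 0).astype(float)

eps = 0.1
W, b = adversarial_training(x, c, eps)
x_adv = fgsm_concept_attack(W, b, x, c, eps)
clean = concept_loss(W, b, x, c)
attacked = concept_loss(W, b, x_adv, c)
print("concept loss on clean inputs:   ", clean)
print("concept loss on attacked inputs:", attacked)
```

Because the toy concept layer is linear, the gradient can be written in closed form. A real concept bottleneck model would need automatic differentiation, and a stronger multi-step attack would normally be used when evaluating robustness.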

Related papers

A Comprehensive Survey on the Risks and Limitations of Concept-based Models [33.641361996627175]
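Concept-based models are essentially explainable networks designed to improve on standard deep neural networks. These models have been highly successful in critical applications such as medical diagnosis and financial risk prediction. However, recent studies have found significant limitations in the structure of such networks.
Paper reference translation (metadata) (2025-05-25T03:53:26Z)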
What's Pulling the Strings? Evaluating Integrity and Attribution in AI Training and Inference through Concept Shift [33.83306492023009]
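ConceptLens is a general-purpose framework that uses pre-trained multimodal models to identify integrity threats. It reveals vulnerabilities to bias injection, such as the generation of covert advertising through malicious concept shifts. It also uncovers sociological biases in generative content and disparities across sociological contexts.
Paper reference translation (metadata) (2025-04-28T13:30:48Z)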
A biologically Inspired Trust Model for Open Multi-Agent Systems that is Resilient to Rapid Performance Fluctuations [0.0]
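Existing trust models face challenges related to agent mobility, changing behavior, and the cold-start problem. We introduce a biologically inspired trust model in which trustees assess their own capabilities and store trust data locally. This design improves mobility support, reduces communication overhead, resists disinformation, and preserves privacy.
Paper reference translation (metadata) (2025-04-17T08:21:54Z)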
Causally Reliable Concept Bottleneck Models [4.411356026951205]
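We propose Causally reliable Concept Bottleneck Models (C$^2$BMs). C$^2$BMs enforce reasoning through a bottleneck of concepts structured according to a model of real-world causal mechanisms. C$^2$BMs are more interpretable and causally reliable, and they improve responsiveness to interventions compared with standard opaque models and concept-based models.
Paper reference translation (metadata) (2025-03-06T12:06:54Z)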
Towards Robust and Reliable Concept Representations: Reliability-Enhanced Concept Embedding Model [22.865870813626316]
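Concept Bottleneck Models (CBMs) aim to enhance interpretability by predicting human-understandable concepts as intermediates for decision-making. Two problems are noted: sensitivity to concept-irrelevant features, and a lack of semantic consistency for the same concept across different samples. This paper proposes the Reliability-Enhanced Concept Embedding Model (RECEM), which introduces two strategies.
Paper reference translation (metadata) (2025-02-03T09:29:39Z)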
On the Fairness, Diversity and Reliability of Text-to-Image Generative Models [49.60774626839712]
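Multimodal generative models have sparked critical discussions about their fairness, reliability, and potential for misuse. We propose an evaluation framework that assesses model reliability through responses to perturbations in the embedding space. The method provides a basis for detecting unreliable, bias-injected models and for retrieving the sources of the bias.
Paper reference translation (metadata) (2024-11-21T09:46:55Z)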
A Framework for Strategic Discovery of Credible Neural Network Surrogate Models under Uncertainty [0.0]
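This work proposes the Occam Plausibility Algorithm for surrogate models (OPAL-surrogate). OPAL-surrogate provides a systematic framework for uncovering predictive neural-network-based surrogate models. It balances the trade-off between model complexity, accuracy, and prediction uncertainty.
Paper reference translation (metadata) (2024-03-13T18:45:51Z)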
NeuralSentinel: Safeguarding Neural Network Reliability and Trustworthiness [0.0]
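We propose NeuralSentinel (NS), a tool for validating the reliability and trustworthiness of AI models. By helping non-expert staff understand model decisions, NS increases their confidence in this new kind of system. The tool was deployed at a hackathon event, where it was used to evaluate the reliability of a skin cancer image detector.
Paper reference translation (metadata) (2024-02-12T09:24:34Z)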
Boosting Adversarial Robustness using Feature Level Stochastic Smoothing [46.86097477465267]
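Adversarial defenses have significantly improved the robustness of deep neural networks. This work proposes a general method for introducing stochasticity into network predictions. The method is also used to smooth decision-making by rejecting low-confidence predictions.
Paper reference translation (metadata) (2023-06-10T15:11:24Z)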
Concept Embedding Models [27.968589555078328]
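Concept bottleneck models promote trustworthiness by conditioning classification tasks on an intermediate level of human-like concepts. Existing concept bottleneck models fail to find an optimal compromise between high task accuracy, robust concept-based explanations, and effective interventions on concepts. This paper proposes Concept Embedding Models, a new family of concept bottleneck models that goes beyond the current accuracy-vs-interpretability trade-off by learning interpretable high-dimensional concept representations.
Paper reference translation (metadata) (2022-09-19T14:49:36Z)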
Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
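Automatic counterfactual generation should take several aspects of the generated counterfactual instances into account. This paper proposes a novel framework for generating counterfactual examples.
Paper reference translation (metadata) (2022-05-20T15:02:53Z)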
A Unified Contrastive Energy-based Model for Understanding the Generative Ability of Adversarial Training [64.71254710803368]
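Adversarial Training (AT) is an effective approach to enhancing the robustness of deep neural networks. We demystify the generative ability of AT by developing a unified probabilistic framework called Contrastive Energy-based Models (CEM). We propose a principled way to develop adversarial learning and sampling methods.
Paper reference translation (metadata) (2022-03-25T05:33:34Z)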
Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
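Prediction credibility measures are fundamental in statistics and machine learning. These measures should account for the wide variety of models used in practice. The framework developed in this work expresses credibility as a risk-fit trade-off.
Paper reference translation (metadata) (2020-11-24T19:52:38Z)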
A general framework for defining and optimizing robustness [74.67016173858497]
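We propose a rigorous and flexible framework for defining various types of robustness of a classifier. Our concept is based on the postulate that a classifier's robustness should be considered a property independent of accuracy. We develop a very general robustness framework that is applicable to any type of classification model.
Paper reference translation (metadata) (2020-06-19T13:24:20Z)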

The related papers list is generated automatically from the titles and abstracts of papers on this site.
