Fugu-MT paper translation (abstract): Channel-wise Knowledge Distillation for Dense Prediction

Paper abstract: Channel-wise Knowledge Distillation for Dense Prediction

arxiv url: http://arxiv.org/abs/2011.13256v4
Date: Fri, 27 Aug 2021 03:05:25 GMT
Status: Translation complete
Last updated in system: 2022-09-20 09:05:52.140987
Title: Channel-wise Knowledge Distillation for Dense Prediction
Title (reference translation): Channel-wise Knowledge Distillation for Dense Prediction
Authors: Changyong Shu, Yifan Liu, Jianfei Gao, Zheng Yan, Chunhua Shen
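Abstract summary: This paper proposes aligning features channel-wise between the student and teacher networks. The method consistently achieves superior performance on three benchmarks with various network structures.
Reference score (independently calculated attention score): 73.99057249472735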
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Knowledge distillation (KD) has been proven to be a simple and effective tool for training compact models. Almost all KD variants for dense prediction tasks align the student and teacher networks' feature maps in the spatial domain, typically by minimizing point-wise and/or pair-wise discrepancy. Observing that in semantic segmentation, some layers' feature activations of each channel tend to encode saliency of scene categories (analogous to class activation mapping), we propose to align features channel-wise between the student and teacher networks. To this end, we first transform the feature map of each channel into a probability map using softmax normalization, and then minimize the Kullback-Leibler (KL) divergence of the corresponding channels of the two networks. By doing so, our method focuses on mimicking the soft distributions of channels between networks. In particular, the KL divergence enables learning to pay more attention to the most salient regions of the channel-wise maps, presumably corresponding to the most useful signals for semantic segmentation. Experiments demonstrate that our channel-wise distillation outperforms almost all existing spatial distillation methods for semantic segmentation considerably, and requires less computational cost during training. We consistently achieve superior performance on three benchmarks with various network structures. Code is available at: https://git.io/Distiller
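Abstract (reference translation): Knowledge distillation (KD) has proven to be a simple and effective tool for training compact models. Almost all KD variants for dense prediction tasks align the feature maps of the student and teacher networks in the spatial domain, usually by minimizing point-wise and/or pair-wise differences. Because, in semantic segmentation, the feature activations of each channel in some layers tend to encode the saliency of scene categories (analogous to class activation mapping), we propose aligning features channel by channel between the student and teacher networks. To this end, we first convert the feature map of each channel into a probability map using softmax normalization, and then minimize the Kullback-Leibler (KL) divergence between the corresponding channels of the two networks. The method thus focuses on mimicking the soft distributions of channels between networks. In particular, the KL divergence lets learning pay more attention to the most salient regions of the channel-wise maps, which presumably correspond to the most useful signals for semantic segmentation. Experiments show that channel-wise distillation considerably outperforms almost all existing spatial distillation methods for semantic segmentation while requiring less computational cost during training. It consistently achieves superior performance on three benchmarks with various network structures. Code is available at: https://git.io/Distiller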
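
The channel-wise alignment described in the abstract (softmax over each channel's spatial locations, then KL divergence between corresponding teacher and student channels) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation, which is available from the repository linked above. The function names, the list-of-lists data layout, the temperature parameter, the KL direction (teacher relative to student), and the averaging over channels are all assumptions made for this example.

```python
import math


def log_softmax(values, temperature=1.0):
    """Log-probabilities of a softmax over one channel's spatial locations.

    `temperature` is an assumed knob for softening the distribution;
    the abstract does not mention it.
    """
    scaled = [v / temperature for v in values]
    peak = max(scaled)  # subtract the max for numerical stability
    log_z = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_z for s in scaled]


def channel_wise_kl(teacher, student, temperature=1.0):
    """Mean over channels of KL(teacher || student).

    teacher, student: lists of C channels, each a flat list of H*W
    activations (hypothetical layout chosen for this sketch).
    """
    if len(teacher) != len(student):
        raise ValueError("teacher and student need the same number of channels")
    total = 0.0
    for t_channel, s_channel in zip(teacher, student):
        if len(t_channel) != len(s_channel):
            raise ValueError("corresponding channels need the same spatial size")
        log_p = log_softmax(t_channel, temperature)  # teacher probability map
        log_q = log_softmax(s_channel, temperature)  # student probability map
        # Each location is weighted by the teacher probability, so locations
        # where the teacher is most salient dominate the sum.
        total += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return total / len(teacher)


if __name__ == "__main__":
    teacher_maps = [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0]]
    student_maps = [[3.0, 2.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
    print(channel_wise_kl(teacher_maps, student_maps))
```

Because each channel is normalized on its own, the loss in this sketch is insensitive to a constant offset added to a channel and depends only on where the activation mass sits within that channel.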

Related papers

DADU: Dual Attention-based Deep Supervised UNet for Automated Semantic Segmentation of Cardiac Images [0.0]
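We propose a deep learning model for segmenting the left ventricle and myocardial scar tissue from cardiac magnetic resonance (CMR) images. The proposed method integrates UNet, channel and spatial attention, edge-detection-based skip connections, and deep supervised learning to improve accuracy on CMR images.
Reference translation of paper (metadata) (2025-04-18T02:22:45Z)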
Distilling Channels for Efficient Deep Tracking [68.13422829310835]
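This paper proposes a novel channel distillation method to facilitate deep trackers. We show that an integrated formulation can turn feature compression, response map generation, and model update into a unified energy minimization problem. As a result, the deep tracker is accurate, fast, and has low memory requirements.
Reference translation of paper (metadata) (2024-09-18T08:09:20Z)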
Group channel pruning and spatial attention distilling for object detection [2.8675002818821542]
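We propose a three-stage model compression method: dynamic sparse training, group channel pruning, and spatial attention distillation. The method reduces the model's parameters by 64.7% and its computation by 34.9%.
Reference translation of paper (metadata) (2023-06-02T13:26:23Z)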
Fully Attentional Network for Semantic Segmentation [17.24768249911501]
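We propose the Fully Attentional Network (FLANet), which encodes both spatial and channel attention in a single similarity map. Our new method achieves state-of-the-art performance on three challenging semantic segmentation datasets.
Reference translation of paper (metadata) (2021-12-08T04:34:55Z)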
Group Fisher Pruning for Practical Network Compression [58.25776612812883]
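This paper proposes a general channel pruning method applicable to various complex structures. We derive a unified metric based on Fisher information to evaluate the importance of single channels and coupled channels. The proposed method can be used to prune any structure, including those with coupled channels.
Reference translation of paper (metadata) (2021-08-02T08:21:44Z)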
Operation-Aware Soft Channel Pruning using Differentiable Masks [51.04085547997066]
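This paper proposes a data-driven algorithm that exploits the characteristics of operations to compress deep neural networks in a differentiable way. We conduct extensive experiments and achieve superior performance in terms of the accuracy of the output networks.
Reference translation of paper (metadata) (2020-07-08T07:44:00Z)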
DMCP: Differentiable Markov Channel Pruning for Neural Networks [67.51334229530273]
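We propose a novel channel pruning method named Differentiable Markov Channel Pruning (DMCP). The method is differentiable and can be optimized directly by gradient descent with respect to the standard task loss and budget regularization. To validate its effectiveness, we conduct ImageNet experiments with ResNet and MobileNetV2.
Reference translation of paper (metadata) (2020-05-07T09:39:55Z)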
Channel Interaction Networks for Fine-Grained Image Categorization [61.095320862647476]
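Fine-grained image classification is challenging because of subtle inter-class differences. This paper proposes the Channel Interaction Network (CIN). Our model can be trained efficiently end to end, without multi-stage training or testing.
Reference translation of paper (metadata) (2020-03-11T11:51:51Z)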

The related papers list is generated automatically from the titles and abstracts of papers on this site.
