Fugu-MT 論文翻訳(概要): Multimodal Feature Extraction and Fusion for Emotional Reaction Intensity Estimation and Expression Classification in Videos with Transformers

論文の概要: Multimodal Feature Extraction and Fusion for Emotional Reaction Intensity Estimation and Expression Classification in Videos with Transformers

arxiv url: http://arxiv.org/abs/2303.09164v1
Date: Thu, 16 Mar 2023 09:03:17 GMT
ステータス: 翻訳完了
システム内更新日: 2023-03-17 16:17:48.353839
Title: Multimodal Feature Extraction and Fusion for Emotional Reaction Intensity Estimation and Expression Classification in Videos with Transformers
Title（参考訳）: トランスフォーマー付きビデオにおける感情反応強度推定と表現分類のためのマルチモーダル特徴抽出と融合
Authors: Jia Li, Yin Chen, Xuesong Zhang, Jiantao Nie, Yangchen Yu, Ziqiang Li, Meng Wang, Richang Hong
Abstract要約: 我々は,野生(ABAW)2023における2つの影響行動分析のサブチャレンジに対して,その解決策を提示する。 The Emotional Reaction Intensity (ERI) Estimation Challenge, our method showed excellent results with a Pearson coefficient on the validation dataset, compare the baseline method by 84%。
参考スコア（独自算出の注目度）: 46.96090775164395
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we present our solutions to the two sub-challenges of Affective Behavior Analysis in the wild (ABAW) 2023: the Emotional Reaction Intensity (ERI) Estimation Challenge and Expression (Expr) Classification Challenge. ABAW 2023 focuses on the problem of affective behavior analysis in the wild, with the goal of creating machines and robots that have the ability to understand human feelings, emotions and behaviors, which can effectively contribute to the advent of a more intelligent future. In our work, we use different models and tools for the Hume-Reaction dataset to extract features of various aspects, such as audio features, video features, etc. By analyzing, combining, and studying these multimodal features, we effectively improve the accuracy of the model for multimodal sentiment prediction. For the Emotional Reaction Intensity (ERI) Estimation Challenge, our method shows excellent results with a Pearson coefficient on the validation dataset, exceeding the baseline method by 84 percent.
Abstract（参考訳）: 本稿では,野生(abaw)2023年における情動行動分析の2つの下位課題である,感情反応強度(eri)推定チャレンジと表現(expr)分類チャレンジの解決法を提案する。 abaw 2023は、人間の感情、感情、行動を理解する能力を持つ機械やロボットを作ることを目標とし、よりインテリジェントな未来の実現に効果的に寄与する、野生の情動行動分析の問題に焦点を当てている。本研究では,hume-reactionデータセットのための異なるモデルとツールを使用して,オーディオ機能やビデオ機能など,さまざまな側面の機能を抽出する。これらのマルチモーダル特徴を分析し,結合し,検討することにより,マルチモーダル感情予測のためのモデルの精度を効果的に向上させる。感情反応強度 (eri) 推定チャレンジでは, 検証データセット上でピアソン係数を84%上回り, 良好な結果を示した。

関連論文リスト

MEMO-Bench: A Multiple Benchmark for Text-to-Image and Multimodal Large Language Models on Human Emotion Analysis [53.012111671763776]
そこで本研究では、7,145枚の肖像画からなる総合的なベンチマークであるMEMO-Benchを紹介した。以上の結果から,既存のT2Iモデルは負のモデルよりも肯定的な感情を生成するのに効果的であることが示唆された。 MLLMは人間の感情の識別と認識に一定の効果を示すが、人間のレベルの正確さには欠ける。
論文参考訳（メタデータ） (2024-11-18T02:09:48Z)
Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning [55.127202990679976]
28,618粒の粗粒と4,487粒の細粒のアノテートサンプルを含むMERRデータセットを導入した。このデータセットは、さまざまなシナリオから学習し、現実のアプリケーションに一般化することを可能にする。本研究では,感情特異的エンコーダによる音声,視覚,テキスト入力をシームレスに統合するモデルであるEmotion-LLaMAを提案する。
論文参考訳（メタデータ） (2024-06-17T03:01:22Z)
Self-supervised Gait-based Emotion Representation Learning from Selective Strongly Augmented Skeleton Sequences [4.740624855896404]
自己教師型歩行に基づく感情表現のための選択的強強化を利用したコントラスト学習フレームワークを提案する。提案手法はEmotion-Gait (E-Gait) と Emilya のデータセットで検証され, 異なる評価プロトコル下での最先端手法よりも優れている。
論文参考訳（メタデータ） (2024-05-08T09:13:10Z)
Deep Imbalanced Learning for Multimodal Emotion Recognition in Conversations [15.705757672984662]
会話におけるマルチモーダル感情認識(MERC)は、マシンインテリジェンスにとって重要な開発方向である。 MERCのデータの多くは自然に感情カテゴリーの不均衡な分布を示しており、研究者は感情認識に対する不均衡なデータの負の影響を無視している。生データにおける感情カテゴリーの不均衡分布に対処するクラス境界拡張表現学習(CBERL)モデルを提案する。我々は,IEMOCAPおよびMELDベンチマークデータセットの広範な実験を行い,CBERLが感情認識の有効性において一定の性能向上を達成したことを示す。
論文参考訳（メタデータ） (2023-12-11T12:35:17Z)
A Dual Branch Network for Emotional Reaction Intensity Estimation [12.677143408225167]
両分岐型マルチアウトプット回帰モデルであるABAW(Affective Behavior Analysis in-wild)のERI問題に対する解法を提案する。空間的注意は視覚的特徴をよりよく抽出するために使用され、Mel-Frequency Cepstral Coefficients技術は音響的特徴を抽出する。本手法は,公式な検証セットにおいて優れた結果が得られる。
論文参考訳（メタデータ） (2023-03-16T10:31:40Z)
Leveraging TCN and Transformer for effective visual-audio fusion in continuous emotion recognition [0.5370906227996627]
本稿では,Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge, Action Unit (AU) Detection Challengeを提案する。本稿では、時間的畳み込みネットワーク(TCN)とトランスフォーマーを利用して、連続的な感情認識の性能を向上させる新しいマルチモーダル融合モデルを提案する。
論文参考訳（メタデータ） (2023-03-15T04:15:57Z)
A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition [72.36055502078193]
本稿では,声帯からの感情認識のための連鎖回帰モデルに基づく階層的枠組みを提案する。データスパシティの課題に対処するため、レイヤワイドおよび時間アグリゲーションモジュールを備えた自己教師付き学習(SSL)表現も使用しています。提案されたシステムは、ACII Affective Vocal Burst (A-VB) Challenge 2022に参加し、「TWO」および「CULTURE」タスクで第1位となった。
論文参考訳（メタデータ） (2023-03-14T16:08:45Z)
Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
本稿では,音声とテキストのモダリティから,伝達学習モデルと微調整モデルとを融合したニューラルネットワークによる感情認識フレームワークを提案する。本稿では,対話型感情的モーションキャプチャー・データセットにおけるマルチモーダル・アプローチの有効性を評価する。
論文参考訳（メタデータ） (2022-02-16T00:23:42Z)
MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition [118.73025093045652]
マルチモーダル感情認識のための事前学習モデル textbfMEmoBERT を提案する。従来の「訓練前、微妙な」パラダイムとは異なり、下流の感情分類タスクをマスク付きテキスト予測として再構成するプロンプトベースの手法を提案する。提案するMEMOBERTは感情認識性能を大幅に向上させる。
論文参考訳（メタデータ） (2021-10-27T09:57:00Z)
Affective Expression Analysis in-the-wild using Multi-Task Temporal Statistical Deep Learning Model [6.024865915538501]
上記の課題に対処する感情表現分析モデルを提案する。 ABAW Challengeのための大規模データセットであるAff-Wild2データセットを実験した。
論文参考訳（メタデータ） (2020-02-21T04:06:03Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。