Fugu-MT Paper Translation (Abstract): Leveraging TCN and Transformer for effective visual-audio fusion in continuous emotion recognition

Paper overview: Leveraging TCN and Transformer for effective visual-audio fusion in continuous emotion recognition

arxiv url: http://arxiv.org/abs/2303.08356v2
Date: Mon, 17 Apr 2023 11:30:07 GMT
Status: Translation complete
Last updated in system: 2023-04-18 20:45:07.100640
Title: Leveraging TCN and Transformer for effective visual-audio fusion in continuous emotion recognition
Title (reference translation): Leveraging TCN and Transformer for visual-audio fusion in continuous emotion recognition
Authors: Weiwei Zhou, Jiada Lu, Zhaolong Xiong, Weifeng Wang
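Abstract summary: This paper presents an approach to the Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge, and Action Unit (AU) Detection Challenge. It proposes a novel multi-modal fusion model that leverages Temporal Convolutional Networks (TCN) and Transformer to improve the performance of continuous emotion recognition.
Reference score (site-computed attention score): 1.064167691614925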
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Human emotion recognition plays an important role in human-computer interaction. In this paper, we present our approach to the Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge, and Action Unit (AU) Detection Challenge of the 5th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW). Specifically, we propose a novel multi-modal fusion model that leverages Temporal Convolutional Networks (TCN) and Transformer to enhance the performance of continuous emotion recognition. Our model aims to effectively integrate visual and audio information for improved accuracy in recognizing emotions. Our model outperforms the baseline and ranks 3 in the Expression Classification challenge.
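Abstract (reference translation): Human emotion recognition plays an important role in human-computer interaction. This paper presents our approach to the Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge, and Action Unit (AU) Detection Challenge of the 5th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW). Specifically, we propose a multi-modal fusion model that uses Temporal Convolutional Networks (TCN) and Transformer to improve the performance of continuous emotion recognition. The model aims to integrate visual and audio information effectively in order to improve the accuracy of emotion recognition. Our model outperforms the baseline and ranks third in the Expression Classification Challenge.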
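The abstract names Temporal Convolutional Networks and visual-audio fusion but gives no architectural detail, so the following is only a minimal sketch of two generic building blocks, not the authors' model. It assumes a single-channel causal dilated convolution (the operation a TCN layer is typically built from) and frame-wise concatenation as the fusion step. The function names `causal_dilated_conv1d` and `fuse_by_concatenation` and all numbers are hypothetical, and the Transformer stage is omitted.

```python
from typing import List


def causal_dilated_conv1d(x: List[float], kernel: List[float], dilation: int = 1) -> List[float]:
    """Causal dilated 1-D convolution over a single channel.

    Computes y[t] = sum_k kernel[k] * x[t - k * dilation], treating
    positions before the start of the sequence as zero, so the output
    at time t never depends on future frames.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit zero padding on the left
                acc += w * x[idx]
        out.append(acc)
    return out


def fuse_by_concatenation(visual: List[List[float]], audio: List[List[float]]) -> List[List[float]]:
    """Frame-wise fusion: concatenate the visual and audio feature vectors.

    Assumption for illustration only; the abstract does not state
    which fusion operator the paper uses.
    """
    if len(visual) != len(audio):
        raise ValueError("both modalities must have the same number of frames")
    return [v + a for v, a in zip(visual, audio)]


if __name__ == "__main__":
    # Toy per-frame features (made-up values): 1 visual dim, 2 audio dims.
    visual = [[1.0], [2.0], [3.0], [4.0]]
    audio = [[0.5, 0.5], [0.1, 0.2], [0.0, 0.3], [0.4, 0.4]]
    fused = fuse_by_concatenation(visual, audio)

    # Apply the temporal convolution to the first fused channel.
    channel0 = [frame[0] for frame in fused]
    print(causal_dilated_conv1d(channel0, kernel=[1.0, 1.0], dilation=2))
```

With kernel size K and dilation d, one such layer covers (K - 1) * d + 1 time steps, which is why stacking layers with growing dilation lets a TCN cover long temporal context with few layers.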

Related papers

Enhancing Speech Emotion Recognition with Graph-Based Multimodal Fusion and Prosodic Features for the Speech Emotion Recognition in Naturalistic Conditions Challenge at Interspeech 2025 [64.59170359368699]
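We propose a robust system for the Interspeech 2025 Speech Emotion Recognition in Naturalistic Conditions Challenge. The approach combines state-of-the-art audio models with text features enriched by prosodic and spectral cues.
Paper reference translation (metadata) (2025-06-02T13:46:02Z)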
Emotion Recognition with CLIP and Sequential Learning [5.66758879852618]
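This paper describes our approach to the Valence-Arousal (VA) Estimation Challenge, the Expression Recognition Challenge, and the Action Unit (AU) Detection Challenge. The method introduces a new framework aimed at advancing continuous emotion recognition.
Paper reference translation (metadata) (2025-03-13T01:02:06Z)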
Boosting Continuous Emotion Recognition with Self-Pretraining using Masked Autoencoders, Temporal Convolutional Networks, and Transformers [3.951847822557829]
This study tackles the Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge, and Action Unit (AU) Detection Challenge. It puts forward a new approach to improving continuous emotion recognition. We achieve this by pre-training Masked Autoencoders (MAE) on facial datasets and then fine-tuning on the Aff-Wild2 dataset annotated with expression (Expr) labels.
Paper reference translation (metadata) (2024-03-18T03:28:01Z)
Affective Behaviour Analysis via Integrating Multi-Modal Knowledge [24.74463315135503]
The 6th Affective Behavior Analysis in-the-wild (ABAW) competition uses the Aff-Wild2, Hume-Vidmimic2, and C-EXPR-DB datasets. This paper presents our submission to five competition tracks: Valence-Arousal (VA) Estimation, Expression (EXPR) Recognition, Action Unit (AU) Detection, Compound Expression (CE) Recognition, and Emotional Mimicry Intensity (EMI) Estimation.
Paper reference translation (metadata) (2024-03-16T06:26:43Z)
The 6th Affective Behavior Analysis in-the-wild (ABAW) Competition [53.718777420180395]
This paper describes the 6th ABAW Competition, which addresses contemporary challenges in understanding human emotions and behaviors.
Paper reference translation (metadata) (2024-02-29T16:49:38Z)
Watch the Speakers: A Hybrid Continuous Attribution Network for Emotion Recognition in Conversation With Emotion Disentanglement [8.17164107060944]
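Emotion Recognition in Conversation (ERC) has attracted widespread attention in natural language processing. Existing ERC methods struggle to generalize across scenarios because they model context insufficiently. This paper introduces the Hybrid Continuous Attribution Network (HCAN), which addresses these issues from the perspectives of emotional continuation and emotional attribution.
Paper reference translation (metadata) (2023-09-18T14:18:16Z)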
EmotionIC: emotional inertia and contagion-driven dependency modeling for emotion recognition in conversation [34.24557248359872]
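This paper proposes an emotional inertia and contagion-driven dependency modeling approach (EmotionIC) for the ERC task. EmotionIC consists of three main components: Identity Masked Multi-Head Attention (IMMHA), Dialogue-based Gated Recurrent Unit (DiaGRU), and Skip-chain Conditional Random Field (SkipCRF). Experimental results show that the method significantly outperforms state-of-the-art models on four benchmark datasets.
Paper reference translation (metadata) (2023-03-20T13:58:35Z)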
Multimodal Feature Extraction and Fusion for Emotional Reaction Intensity Estimation and Expression Classification in Videos with Transformers [47.16005553291036]
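We present our solutions to two sub-challenges of Affective Behavior Analysis in-the-wild (ABAW) 2023. For the Expression Classification Challenge, we propose a streamlined approach that handles the classification task effectively. By studying, analyzing, and combining these features, we significantly improve the model's accuracy for emotion prediction in multimodal contexts.
Paper reference translation (metadata) (2023-03-16T09:03:17Z)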
Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
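This paper proposes a neural network-based emotion recognition framework that fuses transfer-learned and fine-tuned models from the speech and text modalities. We evaluate the effectiveness of the multimodal approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
Paper reference translation (metadata) (2022-02-16T00:23:42Z)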
MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition [118.73025093045652]
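We propose MEmoBERT, a pre-training model for multimodal emotion recognition. Unlike the conventional "pre-train, fine-tune" paradigm, we propose a prompt-based method that reformulates the downstream emotion classification task as masked text prediction. The proposed MEmoBERT significantly improves emotion recognition performance.
Paper reference translation (metadata) (2021-10-27T09:57:00Z)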
Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation [56.264157127549446]
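Speech emotion recognition (SER) is a challenging task that plays an important role in human-computer interaction. One of the main challenges in SER is data scarcity. This paper proposes a transfer learning strategy combined with spectrogram augmentation.
Paper reference translation (metadata) (2021-08-05T10:39:39Z)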
Continuous Emotion Recognition via Deep Convolutional Autoencoder and Support Vector Regressor [70.2226417364135]
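It is essential that machines can recognize users' emotional states with high accuracy. Deep neural networks have achieved great success in recognizing emotions. We propose a new model for continuous emotion recognition based on facial expression recognition.
Paper reference translation (metadata) (2020-01-31T17:47:16Z)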

The related-paper list is generated automatically from the titles and abstracts of papers on this site.

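This page shows information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from its use.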