Fugu-MT Paper Translation (Summary): BERT-VQA: Visual Question Answering on Plots

Paper summary: BERT-VQA: Visual Question Answering on Plots

arxiv url: http://arxiv.org/abs/2508.13184v1
Date: Thu, 14 Aug 2025 00:55:18 GMT
Status: Translation complete
Last updated in system: 2025-08-20 15:36:31.635913
Title: BERT-VQA: Visual Question Answering on Plots
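Title (reference translation): BERT-VQA: Visual Question Answering on Plots
Authors: Tai Vu, Robert Yang
Abstract summary: We developed BERT-VQA, a VisualBERT-based model architecture with a pretrained ResNet 101 image encoder. We trained and evaluated this model against a baseline consisting of an LSTM, a CNN, and a shallow classifier. The final outcome disproved our core hypothesis that the cross-modality module in VisualBERT is essential in aligning plot components with question phrases.
Reference score (independently computed attention measure): 0.0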
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Visual question answering has been an exciting challenge in the field of natural language understanding, as it requires deep learning models to exchange information from both vision and language domains. In this project, we aim to tackle a subtask of this problem, namely visual question answering on plots. To achieve this, we developed BERT-VQA, a VisualBERT-based model architecture with a pretrained ResNet 101 image encoder, along with a potential addition of joint fusion. We trained and evaluated this model against a baseline that consisted of an LSTM, a CNN, and a shallow classifier. The final outcome disproved our core hypothesis that the cross-modality module in VisualBERT is essential in aligning plot components with question phrases. Therefore, our work provided valuable insights into the difficulty of the plot question answering challenge as well as the appropriateness of different model architectures in solving this problem.
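Abstract (reference translation): Visual question answering has become an exciting challenge in the field of natural language understanding, because it requires deep learning models to exchange information from both the vision and language domains. In this project, we aim to tackle a subtask of this problem, namely visual question answering on plots. To achieve this, we developed BERT-VQA, a VisualBERT-based model architecture with a pretrained ResNet 101 image encoder. We trained and evaluated this model against a baseline consisting of an LSTM, a CNN, and a shallow classifier. The final outcome disproved our core hypothesis that the cross-modality module in VisualBERT is essential in aligning plot components with question phrases. Our work therefore provided valuable insights into the difficulty of the plot question answering challenge and the appropriateness of different model architectures for solving this problem.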
