Fugu-MT Paper Translation (Abstract): Talking Slide Avatars: Open-Source Multimodal Communication Approach for Teaching

Paper overview: Talking Slide Avatars: Open-Source Multimodal Communication Approach for Teaching

arxiv url: http://arxiv.org/abs/2604.23703v1
Date: Sun, 26 Apr 2026 13:36:45 GMT
Status: Translation complete
System update date: 2026-04-28 17:12:07.509066
Title: Talking Slide Avatars: Open-Source Multimodal Communication Approach for Teaching
Authors: Xinxing Wu
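Abstract summary: This study presents a practice-based analysis of an open-source workflow for creating talking slide avatars for slide-based teaching. It frames talking slide avatars as multimodal communication artifacts at the intersection of digital pedagogy, aesthetic education, and art-technology practice.
Reference score (independently calculated attention level): 7.927674438432626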
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Slide-based teaching is widely used in higher education, yet in online, hybrid, and asynchronous contexts, slides often lose the instructor presence, narrative continuity, and expressive framing that help learners connect with content. Full lecture video can partly restore these qualities, but it is time-consuming to record, revise, and reuse. This study addresses that pedagogical and production challenge by presenting a practice-based analysis of an open-source workflow for creating talking slide avatars for slide-based teaching. The workflow integrates OpenVoice for text-to-speech generation and voice cloning with Ditto-TalkingHead for audio-driven talking-image synthesis, enabling instructors to transform a script and a static portrait into a short narrated video that can be embedded in slide decks or HTML-based lecture materials. Rather than treating this workflow merely as a technical solution, the study frames talking slide avatars as multimodal communication artifacts at the intersection of digital pedagogy, aesthetic education, and art-technology practice. Using a practice-based implementation and analytic reflection approach, the study documents the production pipeline, examines its communicative and aesthetic affordances, and proposes practical guidelines for script length, image selection, pacing, disclosure, accessibility, and ethical use. The study makes three primary contributions: it presents an educator-oriented open-source production model, reframes talking avatars as an educational communication design problem, and proposes a responsible pathway for incorporating generative synthetic media into teaching. It concludes that short, transparent, and carefully designed avatars can humanize slide-based instruction while providing a reusable communicative layer for introductions, transitions, reminders, and recaps across online, hybrid, and asynchronous learning environments.


Related papers