AI-Generated Lecture Slides for Improving Slide Element Detection and Retrieval
- URL: http://arxiv.org/abs/2506.23605v1
- Date: Mon, 30 Jun 2025 08:11:31 GMT
- Title: AI-Generated Lecture Slides for Improving Slide Element Detection and Retrieval
- Authors: Suyash Maniyar, Vishvesh Trivedi, Ajoy Mondal, Anand Mishra, C. V. Jawahar
- Abstract summary: We propose a large language model (LLM)-guided synthetic lecture slide generation pipeline, SynLecSlideGen. We also create an evaluation benchmark, namely RealSlide, by manually annotating 1,050 real lecture slides. Experimental results show that few-shot transfer learning with pretraining on synthetic slides significantly improves performance compared to training only on real data.
- Score: 25.517836483457803
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Lecture slide element detection and retrieval are key problems in slide understanding. Training effective models for these tasks often depends on extensive manual annotation. However, annotating large volumes of lecture slides for supervised training is labor-intensive and requires domain expertise. To address this, we propose a large language model (LLM)-guided synthetic lecture slide generation pipeline, SynLecSlideGen, which produces high-quality, coherent and realistic slides. We also create an evaluation benchmark, namely RealSlide, by manually annotating 1,050 real lecture slides. To assess the utility of our synthetic slides, we perform few-shot transfer learning on real data using models pre-trained on them. Experimental results show that few-shot transfer learning with pretraining on synthetic slides significantly improves performance compared to training only on real data. This demonstrates that synthetic data can effectively compensate for limited labeled lecture slides. The code and resources of our work are publicly available on our project website: https://synslidegen.github.io/.
Related papers
- SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design [33.47715901943206]
We introduce SlideCoder, a layout-aware, retrieval-augmented framework for generating editable slides from reference images. Experiments show that SlideCoder outperforms state-of-the-art baselines by up to 40.5 points, demonstrating strong performance across layout fidelity, execution accuracy, and visual consistency.
arXiv Detail & Related papers (2025-06-09T17:39:48Z) - Talk to Your Slides: Language-Driven Agents for Efficient Slide Editing [28.792459459465515]
We propose Talk-to-Your-Slides, an agent to edit slides in active PowerPoint sessions. Our system enables 34.02% faster processing, 34.76% better instruction fidelity, and 87.42% cheaper operation than baselines.
arXiv Detail & Related papers (2025-05-16T18:12:26Z) - Generating Narrated Lecture Videos from Slides with Synchronized Highlights [55.2480439325792]
We introduce an end-to-end system designed to automate the process of turning static slides into video lectures. This system synthesizes a video lecture featuring AI-generated narration precisely synchronized with dynamic visual highlights. We demonstrate the system's effectiveness through a technical evaluation using a manually annotated slide dataset with 1000 samples.
arXiv Detail & Related papers (2025-05-05T18:51:53Z) - PASS: Presentation Automation for Slide Generation and Speech [0.0]
PASS is a pipeline used to generate slides from general Word documents. It also automates the oral delivery of the generated slides. PASS analyzes user documents to create a dynamic, engaging presentation with an AI-generated voice.
arXiv Detail & Related papers (2025-01-11T10:22:04Z) - AutoPresent: Designing Structured Visuals from Scratch [99.766901203884]
We benchmark end-to-end image generation and program generation methods with a variety of models. We create AutoPresent, an 8B Llama-based model trained on 7k instructions paired with code for slide generation.
arXiv Detail & Related papers (2025-01-01T18:09:32Z) - Awaking the Slides: A Tuning-free and Knowledge-regulated AI Tutoring System via Language Model Coordination [52.20542825755132]
We develop Slide2Lecture, a tuning-free and knowledge-regulated intelligent tutoring system.
It can effectively convert an input lecture slide into a structured teaching agenda consisting of a set of heterogeneous teaching actions.
For teachers and developers, Slide2Lecture enables customization to cater to personalized demands.
arXiv Detail & Related papers (2024-09-11T16:03:09Z) - Any-point Trajectory Modeling for Policy Learning [64.23861308947852]
We introduce Any-point Trajectory Modeling (ATM) to predict future trajectories of arbitrary points within a video frame.
ATM outperforms strong video pre-training baselines by 80% on average.
We show effective transfer learning of manipulation skills from human videos and videos from a different robot morphology.
arXiv Detail & Related papers (2023-12-28T23:34:43Z) - Slideflow: Deep Learning for Digital Histopathology with Real-Time Whole-Slide Visualization [49.62449457005743]
We develop a flexible deep learning library for histopathology called Slideflow.
It supports a broad array of deep learning methods for digital pathology.
It includes a fast whole-slide interface for deploying trained models.
arXiv Detail & Related papers (2023-04-09T02:49:36Z) - A Scaling Law for Synthetic-to-Real Transfer: A Measure of Pre-Training [52.93808218720784]
Synthetic-to-real transfer learning is a framework in which we pre-train models with synthetically generated images and ground-truth annotations for real tasks.
Although synthetic images overcome the data scarcity issue, it remains unclear how the fine-tuning performance scales with pre-trained models.
We observe a simple and general scaling law that consistently describes learning curves in various tasks, models, and complexities of synthesized pre-training data.
arXiv Detail & Related papers (2021-08-25T02:29:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.