No More Manual Guides: Automatic and Scalable Generation of High-Quality Excel Tutorials
- URL: http://arxiv.org/abs/2509.21816v1
- Date: Fri, 26 Sep 2025 03:21:39 GMT
- Title: No More Manual Guides: Automatic and Scalable Generation of High-Quality Excel Tutorials
- Authors: Yuhang Xie, Jian Mu, Xiaojun Ma, Chaoyun Zhang, Lu Wang, Mengyu Zhou, Mugeng Liu, Si Qin, Qingwei Lin, Saravan Rajmohan, Shi Han, Dongmei Zhang
- Abstract summary: Existing tutorials are manually authored by experts, require frequent updates after each software release, and incur substantial labor costs. We present the first framework for automatically generating Excel tutorials directly from natural language task descriptions. Our framework improves task execution success rates by 8.5% over state-of-the-art baselines.
- Score: 63.10037761131196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Excel is one of the most widely used productivity tools across domains, offering rich functionality but also overwhelming users with its complexity. This creates a persistent demand for tutorials to support effective usage. However, existing tutorials are manually authored by experts, require frequent updates after each software release, and incur substantial labor costs. Prior work has not achieved fully automated tutorial generation, since existing methods still depend on handcrafted operation sequences or example materials. In this paper, we present the first framework for automatically generating Excel tutorials directly from natural language task descriptions. Our framework first instantiates the task. Then a central component of this framework, Execution Agent, plans and executes the solution in Excel, and collects the intermediate artifacts required for tutorial construction. These artifacts are then transformed into both structured Excel documents and video demonstrations. To build a comprehensive tutorial corpus, we collected 1,559 task descriptions from real-world scenarios. In addition, we designed a systematic evaluation framework that integrates assessments from both large language models (LLMs) and human reviewers. Experimental results show that our framework improves task execution success rates by 8.5% over state-of-the-art baselines. Moreover, the generated tutorials demonstrate superior readability and instructional effectiveness, often approaching or surpassing expert-authored materials. Importantly, the automated pipeline eliminates manual labor and reduces time costs to 1/20 of expert authoring, making scalable and high-quality tutorial generation practical for the first time.
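As a rough illustration of the three-stage pipeline the abstract describes (task instantiation, execution with artifact collection, and tutorial rendering), the control flow might look like the sketch below. All names here (`instantiate_task`, `Artifact`, `render_tutorial`) are hypothetical placeholders for illustration, not the paper's actual API:

```python
from dataclasses import dataclass

@dataclass
class Artifact:
    """An intermediate artifact captured during execution (hypothetical)."""
    step: int
    description: str
    screenshot: str  # placeholder path to a captured screenshot

def instantiate_task(description: str) -> dict:
    # Stage 1: turn a free-form task description into a concrete spec
    # bound to a workbook. The workbook name is a placeholder.
    return {"task": description, "workbook": "sample.xlsx"}

def execute(spec: dict) -> list:
    # Stage 2 (the role the paper assigns to its Execution Agent):
    # plan the solution steps, run them, and record an artifact per step.
    plan = [
        f"Open {spec['workbook']}",
        f"Perform: {spec['task']}",
        "Save workbook",
    ]
    return [Artifact(i, step, f"shot_{i}.png") for i, step in enumerate(plan, 1)]

def render_tutorial(artifacts: list) -> str:
    # Stage 3: transform the collected artifacts into a structured
    # step-by-step document (video rendering omitted in this sketch).
    lines = ["# Tutorial"]
    for a in artifacts:
        lines.append(f"{a.step}. {a.description} (see {a.screenshot})")
    return "\n".join(lines)

spec = instantiate_task("Create a pivot table summarizing sales by region")
doc = render_tutorial(execute(spec))
print(doc)
```

The sketch only shows the data flow between stages; the actual system would replace the hard-coded plan with LLM-driven planning and real Excel automation.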
Related papers
- Self-Challenging Language Model Agents [98.62637336505242]
We propose the Self-Challenging framework for training an agent on high-quality tasks that are generated by itself. The framework achieves over a two-fold improvement in Llama-3.1-8B-Instruct, despite using only self-generated training data.
arXiv Detail & Related papers (2025-06-02T14:23:33Z)
- A Guide to Bayesian Networks Software Packages for Structure and Parameter Learning -- 2025 Edition [0.94371657253557]
We review the most relevant tools and software for BN structure and parameter learning to date. We provide an extensive, easy-to-consult overview table summarizing all software packages and their main features.
arXiv Detail & Related papers (2025-03-21T10:36:11Z)
- AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials [53.376263056033046]
Existing approaches rely on expensive human annotation, making them unsustainable at scale. We propose AgentTrek, a scalable data synthesis pipeline that generates web agent trajectories by leveraging publicly available tutorials. Our fully automated approach significantly reduces data collection costs, achieving a cost of just $0.55 per high-quality trajectory without human annotators.
arXiv Detail & Related papers (2024-12-12T18:59:27Z)
- TutoAI: A Cross-domain Framework for AI-assisted Mixed-media Tutorial Creation on Physical Tasks [18.999028085376594]
TutoAI is a cross-domain framework for AI-assisted mixed-media tutorial creation on physical tasks.
We distill common tutorial components by surveying existing work.
We present an approach to identify, assemble, and evaluate AI models for component extraction.
arXiv Detail & Related papers (2024-03-12T19:46:59Z)
- SPROUT: an Interactive Authoring Tool for Generating Programming Tutorials with the Visualization of Large Language Models [19.885485760758783]
The rapid development of large language models (LLMs) has revolutionized the efficiency of creating programming tutorials.
We introduce a novel approach that breaks down the programming tutorial creation task into actionable steps.
We then present SPROUT, an authoring tool equipped with a series of interactive visualizations that empower users to have greater control and understanding of the programming tutorial creation process.
arXiv Detail & Related papers (2023-12-04T10:46:52Z)
- Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals [69.76245723797368]
Read and Reward speeds up RL algorithms on Atari games by reading manuals released by the Atari game developers.
Various RL algorithms obtain significant improvement in performance and training speed when assisted by our design.
arXiv Detail & Related papers (2023-02-09T05:47:03Z)
- Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer [80.50327229467993]
We show that a single model trained end-to-end can achieve both competitive retrieval and QA performance.
We show that end-to-end adaptation significantly boosts its performance on out-of-domain datasets in both supervised and unsupervised settings.
arXiv Detail & Related papers (2022-12-05T04:51:21Z)
- EditEval: An Instruction-Based Benchmark for Text Improvements [73.5918084416016]
This work presents EditEval: an instruction-based benchmark and evaluation suite for the automatic evaluation of editing capabilities.
We evaluate several pre-trained models, which shows that InstructGPT and PEER perform the best, but that most baselines fall below the supervised SOTA.
Our analysis shows that commonly used metrics for editing tasks do not always correlate well, and that optimization for prompts with the highest performance does not necessarily entail the strongest robustness to different models.
arXiv Detail & Related papers (2022-09-27T12:26:05Z)
- AxCell: Automatic Extraction of Results from Machine Learning Papers [44.15443359660737]
We present AxCell, an automatic machine learning pipeline for extracting results from papers.
When compared with existing methods, our approach significantly improves the state of the art for results extraction.
We show that our approach is viable for semi-automated results extraction in production.
arXiv Detail & Related papers (2020-04-29T17:33:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.