Learning to Emphasize: Dataset and Shared Task Models for Selecting
Emphasis in Presentation Slides
- URL: http://arxiv.org/abs/2101.03237v1
- Date: Sat, 2 Jan 2021 06:54:55 GMT
- Authors: Amirreza Shirani, Giai Tran, Hieu Trinh, Franck Dernoncourt, Nedim
Lipka, Paul Asente, Jose Echevarria, and Thamar Solorio
- Abstract summary: Emphasizing strong leading words in presentation slides allows the audience to direct their eyes to certain focal points instead of reading the entire slide.
Motivated by this demand, we study the problem of Emphasis Selection (ES) in presentation slides.
We introduce a new dataset containing presentation slides on a wide variety of topics, each annotated with emphasis words in a crowdsourced setting.
- Score: 31.540208729354354
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Presentation slides have become a common addition to the teaching material.
Emphasizing strong leading words in presentation slides allows the audience
to direct their eyes to certain focal points instead of reading the entire slide,
keeping their attention on the speaker during the presentation. Despite a large
volume of studies on automatic slide generation, few studies have addressed the
automation of design assistance during the creation process. Motivated by this
demand, we study the problem of Emphasis Selection (ES) in presentation slides,
i.e., choosing candidates for emphasis, by introducing a new dataset containing
presentation slides on a wide variety of topics, each annotated with
emphasis words in a crowdsourced setting. We evaluate a range of
state-of-the-art models on this novel dataset by organizing a shared task and
inviting multiple researchers to model emphasis in this new domain. We present
the main findings, compare the results of these models, and, by examining the
challenges of the dataset, provide several analysis components.
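The abstract frames Emphasis Selection as choosing candidate words to emphasize within a slide's text. One common way to approach such a task is to rank tokens by a salience score and take the top candidates. The sketch below is a hypothetical toy illustration only (it is not the paper's method or any shared-task model): it uses rarity against a background corpus as a crude stand-in for a learned salience model.

```python
from collections import Counter

def emphasis_candidates(slide_text, corpus, top_k=2):
    """Toy emphasis selection: rank slide tokens by rarity in a
    background corpus (a stand-in for a learned salience model)
    and return the top_k candidate words for emphasis."""
    # Word frequencies over the background corpus.
    background = Counter(w.lower() for doc in corpus for w in doc.split())
    total = sum(background.values())

    def score(word):
        # Rarer words get higher scores; add-one smoothing for unseen words.
        freq = background.get(word.lower(), 0) + 1
        return total / freq

    tokens = slide_text.split()
    # Stable sort: ties keep their original slide order.
    return sorted(tokens, key=score, reverse=True)[:top_k]
```

For example, against a background corpus of generic sentences, content-bearing words like "transformer" outrank function words like "the". A real ES model would instead assign per-token emphasis probabilities learned from the crowdsourced annotations described above.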
Related papers
- Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning [12.377363857246602]
We propose a novel model named MI-DELIGHT for short text classification.
It first performs multi-source information exploration to alleviate the sparsity issues.
Then, the graph learning approach is adopted to learn the representation of short texts.
arXiv Detail & Related papers (2025-01-16T00:26:15Z) - PASS: Presentation Automation for Slide Generation and Speech [0.0]
PASS is a pipeline used to generate slides from general Word documents.
It also automates the oral delivery of the generated slides.
PASS analyzes user documents to create a dynamic, engaging presentation with an AI-generated voice.
arXiv Detail & Related papers (2025-01-11T10:22:04Z) - Corpus Considerations for Annotator Modeling and Scaling [9.263562546969695]
We show that the commonly used user token model consistently outperforms more complex models.
Our findings shed light on the relationship between corpus statistics and annotator modeling performance.
arXiv Detail & Related papers (2024-04-02T22:27:24Z) - Prompting Large Language Models for Topic Modeling [10.31712610860913]
We propose PromptTopic, a novel topic modeling approach that harnesses the advanced language understanding of large language models (LLMs).
It involves extracting topics at the sentence level from individual documents, then aggregating and condensing these topics into a predefined quantity, ultimately providing coherent topics for texts of varying lengths.
We benchmark PromptTopic against the state-of-the-art baselines on three vastly diverse datasets, establishing its proficiency in discovering meaningful topics.
arXiv Detail & Related papers (2023-12-15T11:15:05Z) - SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z) - Multi-View Class Incremental Learning [57.14644913531313]
Multi-view learning (MVL) has gained great success in integrating information from multiple perspectives of a dataset to improve downstream task performance.
This paper investigates a novel paradigm called multi-view class incremental learning (MVCIL), where a single model incrementally classifies new classes from a continual stream of views.
arXiv Detail & Related papers (2023-06-16T08:13:41Z) - Exploring Effective Factors for Improving Visual In-Context Learning [56.14208975380607]
In-Context Learning (ICL) enables a model to understand a new task from a few demonstrations (a prompt) and predict on new inputs without tuning the model.
This paper shows that prompt selection and prompt fusion are two major factors that have a direct impact on the inference performance of visual context learning.
We propose a simple framework prompt-SelF for visual in-context learning.
arXiv Detail & Related papers (2023-04-10T17:59:04Z) - Multimodal Lecture Presentations Dataset: Understanding Multimodality in
Educational Slides [57.86931911522967]
We test the capabilities of machine learning models in multimodal understanding of educational content.
Our dataset contains aligned slides and spoken language, for 180+ hours of video and 9000+ slides, with 10 lecturers from various subjects.
We introduce PolyViLT, a multimodal transformer trained with a multi-instance learning loss that is more effective than current approaches.
arXiv Detail & Related papers (2022-08-17T05:30:18Z) - Topic Discovery via Latent Space Clustering of Pretrained Language Model
Representations [35.74225306947918]
We propose a joint latent space learning and clustering framework built upon PLM embeddings.
Our model effectively leverages the strong representation power and superb linguistic features brought by PLMs for topic discovery.
arXiv Detail & Related papers (2022-02-09T17:26:08Z) - ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive
Summarization with Argument Mining [61.82562838486632]
We crowdsource four new datasets on diverse online conversation forms of news comments, discussion forums, community question answering forums, and email threads.
We benchmark state-of-the-art models on our datasets and analyze characteristics associated with the data.
arXiv Detail & Related papers (2021-06-01T22:17:13Z) - Let Me Choose: From Verbal Context to Font Selection [50.293897197235296]
We aim to learn associations between visual attributes of fonts and the verbal context of the texts they are typically applied to.
We introduce a new dataset, containing examples of different topics in social media posts and ads, labeled through crowd-sourcing.
arXiv Detail & Related papers (2020-05-03T17:36:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.