Learning to Emphasize: Dataset and Shared Task Models for Selecting
Emphasis in Presentation Slides
- URL: http://arxiv.org/abs/2101.03237v1
- Date: Sat, 2 Jan 2021 06:54:55 GMT
- Title: Learning to Emphasize: Dataset and Shared Task Models for Selecting
Emphasis in Presentation Slides
- Authors: Amirreza Shirani, Giai Tran, Hieu Trinh, Franck Dernoncourt, Nedim
Lipka, Paul Asente, Jose Echevarria, and Thamar Solorio
- Abstract summary: Emphasizing strong leading words in presentation slides allows the audience to direct their eyes to certain focal points instead of reading the entire slide.
Motivated by this demand, we study the problem of Emphasis Selection (ES) in presentation slides.
We introduce a new dataset containing presentation slides on a wide variety of topics, each annotated with emphasis words in a crowdsourced setting.
- Score: 31.540208729354354
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Presentation slides have become a common addition to teaching materials.
Emphasizing strong leading words in presentation slides can direct the audience's
eyes to certain focal points instead of having them read the entire slide,
retaining their attention on the speaker during the presentation. Despite a large
volume of studies on automatic slide generation, few studies have addressed the
automation of design assistance during the creation process. Motivated by this
demand, we study the problem of Emphasis Selection (ES) in presentation slides,
i.e., choosing candidates for emphasis, by introducing a new dataset containing
presentation slides on a wide variety of topics, each annotated with emphasis
words in a crowdsourced setting. We evaluate a range of state-of-the-art models
on this novel dataset by organizing a shared task and inviting multiple
researchers to model emphasis in this new domain. We present the main findings,
compare the results of these models, and, by examining the challenges of the
dataset, provide several analysis components.
Related papers
- Corpus Considerations for Annotator Modeling and Scaling (2024-04-02) [9.263562546969695]
  We show that the commonly used user-token model consistently outperforms more complex models.
  Our findings shed light on the relationship between corpus statistics and annotator modeling performance.
- Prompting Large Language Models for Topic Modeling (2023-12-15) [10.31712610860913]
  We propose PromptTopic, a novel topic modeling approach that harnesses the advanced language understanding of large language models (LLMs).
  It extracts topics at the sentence level from individual documents, then aggregates and condenses these topics into a predefined quantity, ultimately providing coherent topics for texts of varying lengths.
  We benchmark PromptTopic against state-of-the-art baselines on three vastly diverse datasets, establishing its proficiency in discovering meaningful topics.
- Visual Analytics for Efficient Image Exploration and User-Guided Image Captioning (2023-11-02) [35.47078178526536]
  Recent advancements in pre-trained large-scale language-image models have ushered in a new era of visual comprehension.
  This paper tackles two well-known issues in visual analytics: (1) the efficient exploration of large-scale image datasets and identification of potential data biases within them; (2) the evaluation of image captions and steering of their generation process.
- SINC: Self-Supervised In-Context Learning for Vision-Language Tasks (2023-07-15) [64.44336003123102]
  We propose a framework to enable in-context learning in large language models.
  A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
  Experiments show that SINC outperforms gradient-based methods on various vision-language tasks.
- Multi-View Class Incremental Learning (2023-06-16) [57.14644913531313]
  Multi-view learning (MVL) has achieved great success in integrating information from multiple perspectives of a dataset to improve downstream task performance.
  This paper investigates a novel paradigm called multi-view class incremental learning (MVCIL), in which a single model incrementally classifies new classes from a continual stream of views.
- Exploring Effective Factors for Improving Visual In-Context Learning (2023-04-10) [56.14208975380607]
  In-context learning (ICL) aims to understand a new task from a few demonstrations (the prompt) and predict on new inputs without tuning the model.
  This paper shows that prompt selection and prompt fusion are two major factors that directly affect the inference performance of visual in-context learning.
  We propose prompt-SelF, a simple framework for visual in-context learning.
- Topic-Selective Graph Network for Topic-Focused Summarization (2023-02-25) [0.0]
  We propose a topic-arc recognition objective and a topic-selective graph network.
  First, the topic-arc recognition objective is used during training, endowing the model with the capability to discriminate topics.
  The topic-selective graph network then conducts topic-guided cross-interaction on sentences based on the results of topic-arc recognition.
- Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides (2022-08-17) [57.86931911522967]
  We test the capabilities of machine learning models in multimodal understanding of educational content.
  Our dataset contains aligned slides and spoken language for 180+ hours of video and 9000+ slides, with 10 lecturers from various subjects.
  We introduce PolyViLT, a multimodal transformer trained with a multi-instance learning loss that is more effective than current approaches.
- Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations (2022-02-09) [35.74225306947918]
  We propose a joint latent space learning and clustering framework built upon PLM embeddings.
  Our model effectively leverages the strong representation power and superb linguistic features of PLMs for topic discovery.
- ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining (2021-06-01) [61.82562838486632]
  We crowdsource four new datasets covering diverse online conversation forms: news comments, discussion forums, community question-answering forums, and email threads.
  We benchmark state-of-the-art models on our datasets and analyze characteristics associated with the data.
- Let Me Choose: From Verbal Context to Font Selection (2020-05-03) [50.293897197235296]
  We aim to learn associations between visual attributes of fonts and the verbal context of the texts they are typically applied to.
  We introduce a new dataset containing examples of different topics in social media posts and ads, labeled through crowd-sourcing.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.