Towards Personalized and Human-in-the-Loop Document Summarization
- URL: http://arxiv.org/abs/2108.09443v1
- Date: Sat, 21 Aug 2021 05:34:46 GMT
- Title: Towards Personalized and Human-in-the-Loop Document Summarization
- Authors: Samira Ghodratnama
- Abstract summary: This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques.
It covers (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ubiquitous availability of computing devices and the widespread use of
the internet have generated a large amount of data continuously. Therefore, the
amount of available information on any given topic is far beyond humans'
processing capacity to properly process, causing what is known as information
overload. To efficiently cope with large amounts of information and generate
content with significant value to users, we require identifying, merging and
summarising information. Data summaries can help gather related information and
collect it into a shorter format that enables answering complicated questions,
gaining new insight and discovering conceptual boundaries.
This thesis focuses on three main challenges to alleviate information
overload using novel summarisation techniques. It further intends to facilitate
the analysis of documents to support personalised information extraction. This
thesis separates the research issues into four areas, covering (i) feature
engineering in document summarisation, (ii) traditional static and inflexible
summaries, (iii) traditional generic summarisation approaches, and (iv) the
need for reference summaries. We propose novel approaches to tackle these
challenges, by: i)enabling automatic intelligent feature engineering, ii)
enabling flexible and interactive summarisation, iii) utilising intelligent and
personalised summarisation approaches. The experimental results prove the
efficiency of the proposed approaches compared to other state-of-the-art
models. We further propose solutions to the information overload problem in
different domains through summarisation, covering network traffic data, health
data and business process data.
Related papers
- Converging Dimensions: Information Extraction and Summarization through Multisource, Multimodal, and Multilingual Fusion [0.0]
The paper proposes a novel approach to summarization that tackles such challenges by utilizing the strength of multiple sources.
The research progresses beyond conventional, unimodal sources such as text documents and integrates a more diverse range of data, including YouTube playlists, pre-prints, and Wikipedia pages.
arXiv Detail & Related papers (2024-06-19T17:15:47Z) - Thread: A Logic-Based Data Organization Paradigm for How-To Question Answering with Retrieval Augmented Generation [49.36436704082436]
How-to questions are integral to decision-making processes and require dynamic, step-by-step answers.
We propose Thread, a novel data organization paradigm aimed at enabling current systems to handle how-to questions more effectively.
arXiv Detail & Related papers (2024-06-19T09:14:41Z) - QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs [63.98556480088152]
Table summarization is a crucial task aimed at condensing information into concise and comprehensible textual summaries.
We propose a novel method to address these limitations by introducing query-focused multi-table summarization.
Our approach, which comprises a table serialization module, a summarization controller, and a large language model, generates query-dependent table summaries tailored to users' information needs.
arXiv Detail & Related papers (2024-05-08T15:05:55Z) - Label-Free Topic-Focused Summarization Using Query Augmentation [2.127049691404299]
This study introduces a novel method, Augmented-Query Summarization (AQS), for topic-focused summarization without the need for extensive labelled datasets.
Our method demonstrates the ability to generate relevant and accurate summaries, showing its potential as a cost-effective solution in data-rich environments.
This innovation paves the way for broader application and accessibility in the field of topic-focused summarization technology.
arXiv Detail & Related papers (2024-04-25T08:39:10Z) - Iterative Zero-Shot LLM Prompting for Knowledge Graph Construction [104.29108668347727]
This paper proposes an innovative knowledge graph generation approach that leverages the potential of the latest generative large language models.
The approach is conveyed in a pipeline that comprises novel iterative zero-shot and external knowledge-agnostic strategies.
We claim that our proposal is a suitable solution for scalable and versatile knowledge graph construction and may be applied to different and novel contexts.
arXiv Detail & Related papers (2023-07-03T16:01:45Z) - A Topic-aware Summarization Framework with Different Modal Side
Information [40.11141446039445]
We propose a general summarization framework, which can flexibly incorporate various modalities of side information.
We first propose a unified topic encoder, which jointly discovers latent topics from the document and various kinds of side information.
Results show that our model significantly surpasses strong baselines on three public single-modal or multi-modal benchmark summarization datasets.
arXiv Detail & Related papers (2023-05-19T08:09:45Z) - Making Science Simple: Corpora for the Lay Summarisation of Scientific
Literature [21.440724685950443]
We present two novel lay summarisation datasets, PLOS (large-scale) and eLife (medium-scale)
We provide a thorough characterisation of our lay summaries, highlighting differing levels of readability and abstractiveness between datasets.
arXiv Detail & Related papers (2022-10-18T15:28:30Z) - Automatic Text Summarization Methods: A Comprehensive Review [1.6114012813668934]
This study provides a detailed analysis of text summarization concepts such as summarization approaches, techniques used, standard datasets, evaluation metrics and future scopes for research.
arXiv Detail & Related papers (2022-03-03T10:45:00Z) - Algorithmic Fairness Datasets: the Story so Far [68.45921483094705]
Data-driven algorithms are studied in diverse domains to support critical decisions, directly impacting people's well-being.
A growing community of researchers has been investigating the equity of existing algorithms and proposing novel ones, advancing the understanding of risks and opportunities of automated decision-making for historically disadvantaged populations.
Progress in fair Machine Learning hinges on data, which can be appropriately used only if adequately documented.
Unfortunately, the algorithmic fairness community suffers from a collective data documentation debt caused by a lack of information on specific resources (opacity) and scatteredness of available information (sparsity)
arXiv Detail & Related papers (2022-02-03T17:25:46Z) - Abstractive Query Focused Summarization with Query-Free Resources [60.468323530248945]
In this work, we consider the problem of leveraging only generic summarization resources to build an abstractive QFS system.
We propose Marge, a Masked ROUGE Regression framework composed of a novel unified representation for summaries and queries.
Despite learning from minimal supervision, our system achieves state-of-the-art results in the distantly supervised setting.
arXiv Detail & Related papers (2020-12-29T14:39:35Z) - ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge could be more than enough, since the output description may only cover the most significant knowledge.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.