Context-Based Quotation Recommendation
- URL: http://arxiv.org/abs/2005.08319v2
- Date: Wed, 19 Aug 2020 05:31:13 GMT
- Title: Context-Based Quotation Recommendation
- Authors: Ansel MacLaughlin, Tao Chen, Burcu Karagol Ayan, Dan Roth
- Abstract summary: We propose a novel context-aware quote recommendation system.
It generates a ranked list of quotable paragraphs and spans of tokens from a given source document.
We conduct experiments on a collection of speech transcripts and associated news articles.
- Score: 60.93257124507105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While composing a new document, anything from a news article to an email or
essay, authors often utilize direct quotes from a variety of sources. Although
an author may know what point they would like to make, selecting an appropriate
quote for the specific context may be time-consuming and difficult. We
therefore propose a novel context-aware quote recommendation system which
utilizes the content an author has already written to generate a ranked list of
quotable paragraphs and spans of tokens from a given source document.
We approach quote recommendation as a variant of open-domain question
answering and adapt the state-of-the-art BERT-based methods from open-QA to our
task. We conduct experiments on a collection of speech transcripts and
associated news articles, evaluating models' paragraph ranking and span
prediction performances. Our experiments confirm the strong performance of
BERT-based methods on this task, which outperform bag-of-words and neural
ranking baselines by more than 30% relative across all ranking metrics.
Qualitative analyses show the difficulty of the paragraph and span
recommendation tasks and confirm the quotability of the best BERT model's
predictions, even if they are not the true selected quotes from the original
news articles.
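To make the open-QA-style formulation concrete, below is a minimal sketch of the paragraph-ranking step, assuming a BERT cross-encoder that scores each (context, paragraph) pair. The checkpoint, function names, and scoring head are illustrative assumptions, not the authors' released system, and the regression head would need fine-tuning on (context, quoted-paragraph) pairs before its scores are meaningful.

```python
# Illustrative sketch (not the paper's code): score each candidate paragraph
# of a source document against the author's already-written context with a
# BERT cross-encoder, as in open-QA passage reranking. The single-logit
# head below is freshly initialized and untrained.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1  # one relevance score per pair
)
model.eval()

def rank_paragraphs(context, paragraphs):
    """Return (score, paragraph) pairs sorted by predicted quotability."""
    scored = []
    for paragraph in paragraphs:
        # Jointly encode the writing context and one candidate paragraph.
        inputs = tokenizer(context, paragraph, truncation=True,
                           max_length=512, return_tensors="pt")
        with torch.no_grad():
            score = model(**inputs).logits.item()
        scored.append((score, paragraph))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Span prediction could then be layered on top of the best-ranked paragraphs with a standard BERT question-answering head, mirroring the open-QA pipeline the paper adapts.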
Related papers
- ALiiCE: Evaluating Positional Fine-grained Citation Generation [54.19617927314975]
We propose ALiiCE, the first automatic evaluation framework for fine-grained citation generation.
Our framework first parses a sentence-level claim into atomic claims via dependency analysis and then calculates citation quality at the atomic-claim level.
We evaluate the positional fine-grained citation generation performance of several Large Language Models on two long-form QA datasets.
arXiv Detail & Related papers (2024-06-19T09:16:14Z)
- Multi-Layer Ranking with Large Language Models for News Source Recommendation [20.069181633869093]
We build a novel dataset, called NewsQuote, consisting of 23,571 quote-speaker pairs sourced from a collection of news articles.
We formulate the recommendation task as the retrieval of experts based on their likelihood of being associated with a given query.
Our results show that employing an in-context learning based LLM ranker and a multi-layer ranking-based filter significantly improve both the predictive quality and behavioural quality of the recommender system.
arXiv Detail & Related papers (2024-06-17T17:02:34Z)
- Verifiable Generation with Subsentence-Level Fine-Grained Citations [13.931548733211436]
Verifiable generation requires large language models to cite source documents supporting their outputs.
Previous work mainly targets the generation of sentence-level citations, lacking specificity about which parts of a sentence are backed by the cited sources.
This work studies verifiable generation with subsentence-level fine-grained citations that locate more precisely which parts of the generated content the cited sources support.
arXiv Detail & Related papers (2024-06-10T09:32:37Z)
- CiteBench: A Benchmark for Scientific Citation Text Generation [69.37571393032026]
CiteBench is a benchmark for citation text generation.
We make the code for CiteBench publicly available at https://github.com/UKPLab/citebench.
arXiv Detail & Related papers (2022-12-19T16:10:56Z)
- Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z)
- Recommending Multiple Positive Citations for Manuscript via Content-Dependent Modeling and Multi-Positive Triplet [6.7854900381386845]
We propose a novel scientific paper model for citation recommendation, namely the Multi-Positive BERT Model for Citation Recommendation (MP-BERT4CR).
The proposed approach has the following advantages: first, the multi-positive objectives are effective at recommending multiple positive candidates.
Second, MP-BERT4CR is also effective at retrieving the full list of co-citations, including historically low-frequency co-citation pairs, compared with prior work.
arXiv Detail & Related papers (2021-11-25T04:09:31Z)
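As a rough illustration of the multi-positive objective named in the MP-BERT4CR entry above, the sketch below averages a standard triplet hinge over several positive candidates per anchor. This is an assumption about the objective's general shape, not the paper's exact loss, and the random tensors stand in for BERT embeddings.

```python
# Hypothetical sketch of a multi-positive triplet objective: one anchor
# (the citing context) is pulled toward several positives (co-cited papers)
# and pushed away from a negative. Not MP-BERT4CR's exact formulation.
import torch
import torch.nn.functional as F

def multi_positive_triplet_loss(anchor, positives, negative, margin=1.0):
    """anchor: (d,), positives: (k, d), negative: (d,)."""
    pos_dist = F.pairwise_distance(anchor.expand_as(positives), positives)        # (k,)
    neg_dist = F.pairwise_distance(anchor.unsqueeze(0), negative.unsqueeze(0))    # (1,)
    # Standard triplet hinge, averaged over all k positive candidates.
    return torch.clamp(pos_dist - neg_dist + margin, min=0).mean()

# Toy usage with random vectors standing in for BERT embeddings.
anchor = torch.randn(768)
positives = torch.randn(3, 768)
negative = torch.randn(768)
loss = multi_positive_triplet_loss(anchor, positives, negative)
```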
- MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction [41.941098507759015]
Keyphrases are short phrases that concisely summarize a document's core content, helping readers quickly grasp what an article is about.
We propose a novel unsupervised keyphrase extraction method that leverages a BERT-based model to select and rank candidate keyphrases with a MASK strategy.
arXiv Detail & Related papers (2021-10-13T11:29:17Z)
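A minimal sketch of a MASK-strategy ranker in the spirit of the MDERank entry above: mask one candidate phrase at a time, re-embed the document, and treat a large drop in similarity as evidence the phrase carries core content. The checkpoint, pooling choice, and candidate list are assumptions of this sketch, not MDERank's released code.

```python
# Illustrative sketch (assumptions, not MDERank's code): mask each candidate
# keyphrase, re-embed the document with BERT, and rank candidates by how far
# the masked document drifts from the original embedding.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(text):
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        # Mean-pool the last hidden states into one document vector.
        return model(**inputs).last_hidden_state.mean(dim=1).squeeze(0)

def rank_keyphrases(document, candidates):
    """Lower similarity after masking => the phrase mattered more."""
    original = embed(document)
    scored = []
    for phrase in candidates:
        masked_doc = document.replace(phrase, tokenizer.mask_token)
        similarity = torch.cosine_similarity(original, embed(masked_doc), dim=0)
        scored.append((similarity.item(), phrase))
    return sorted(scored)  # most important (lowest similarity) first
```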
- Abstractive Summarization of Spoken and Written Instructions with BERT [66.14755043607776]
We present the first application of the BERTSum model to conversational language.
We generate abstractive summaries of narrated instructional videos across a wide variety of topics.
We envision this being integrated as a feature in intelligent virtual assistants, enabling them to summarize both written and spoken instructional content upon request.
arXiv Detail & Related papers (2020-08-21T20:59:34Z)
- Learning Neural Textual Representations for Citation Recommendation [7.227232362460348]
We propose a novel approach to citation recommendation using a deep sequential representation of the documents (Sentence-BERT) cascaded with Siamese and triplet networks in a submodular scoring function.
To the best of our knowledge, this is the first approach to combine deep representations and submodular selection for the task of citation recommendation.
arXiv Detail & Related papers (2020-07-08T12:38:50Z)
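The retrieval step in the entry above could look roughly like the following sentence-transformers sketch; the checkpoint and the plain cosine ranking are assumptions, and the paper's Siamese/triplet training and submodular scoring function are not reproduced here.

```python
# Rough sketch of Sentence-BERT-based candidate retrieval for citation
# recommendation. Checkpoint choice is an assumption; the paper's triplet
# training and submodular selection are omitted.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def recommend_citations(query_text, candidate_abstracts, top_k=5):
    """Return the top-k candidate abstracts by cosine similarity to the query."""
    query_emb = model.encode(query_text, convert_to_tensor=True)
    candidate_embs = model.encode(candidate_abstracts, convert_to_tensor=True)
    hits = util.semantic_search(query_emb, candidate_embs, top_k=top_k)[0]
    return [(hit["score"], candidate_abstracts[hit["corpus_id"]]) for hit in hits]
```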
- SPECTER: Document-level Representation Learning using Citation-informed Transformers [51.048515757909215]
SPECTER generates document-level embeddings of scientific documents by pretraining a Transformer language model on citation information.
We introduce SciDocs, a new evaluation benchmark consisting of seven document-level tasks ranging from citation prediction to document classification and recommendation.
arXiv Detail & Related papers (2020-04-15T16:05:51Z)
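As a usage illustration for the SPECTER entry above, document embeddings can be produced from a paper's title and abstract with the public allenai/specter checkpoint. The [CLS]-pooling shown follows the model card's recommended usage, but treat the details as a sketch rather than the paper's evaluation pipeline.

```python
# Sketch of producing a SPECTER document embedding from a paper's title and
# abstract using the public allenai/specter checkpoint.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("allenai/specter")
model = AutoModel.from_pretrained("allenai/specter")
model.eval()

def embed_paper(title, abstract):
    # SPECTER expects "title [SEP] abstract" as its input text.
    text = title + tokenizer.sep_token + abstract
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        output = model(**inputs)
    return output.last_hidden_state[:, 0, :].squeeze(0)  # [CLS] embedding

paper_vec = embed_paper("Context-Based Quotation Recommendation",
                        "We propose a novel context-aware quote recommendation system...")
```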
This list is automatically generated from the titles and abstracts of the papers on this site.