Literature Retrieval for Precision Medicine with Neural Matching and
Faceted Summarization
- URL: http://arxiv.org/abs/2012.09355v1
- Date: Thu, 17 Dec 2020 02:01:32 GMT
- Title: Literature Retrieval for Precision Medicine with Neural Matching and
Faceted Summarization
- Authors: Jiho Noh and Ramakanth Kavuluru
- Abstract summary: We present a document reranking approach that combines neural query-document matching and text summarization.
Evaluations using NIST's TREC-PM track datasets show that our model achieves state-of-the-art performance.
- Score: 2.978663539080876
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Information retrieval (IR) for precision medicine (PM) often involves looking
for multiple pieces of evidence that characterize a patient case. This
typically includes at least the name of a condition and a genetic variation
that applies to the patient. Other factors such as demographic attributes,
comorbidities, and social determinants may also be pertinent. As such, the
retrieval problem is often formulated as ad hoc search but with multiple facets
(e.g., disease, mutation) that may need to be incorporated. In this paper, we
present a document reranking approach that combines neural query-document
matching and text summarization toward such retrieval scenarios. Our
architecture builds on the basic BERT model with three specific components for
reranking: (a) document-query matching, (b) keyword extraction, and (c)
facet-conditioned abstractive summarization. The outcomes of (b) and (c) are
used to essentially transform a candidate document into a concise summary that
can be compared with the query at hand to compute a relevance score. Component
(a) directly generates a matching score of a candidate document for a query.
The full architecture benefits from the complementary potential of
document-query matching and the novel document transformation approach based on
summarization along PM facets. Evaluations using NIST's TREC-PM track datasets
(2017–2019) show that our model achieves state-of-the-art performance. To
foster reproducibility, our code is made available here:
https://github.com/bionlproc/text-summ-for-doc-retrieval.
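
As a concrete illustration of how the components described above could fit together, here is a minimal Python sketch of the reranking step. It is not the authors' implementation (that lives in the linked repository): the BERT matcher and the facet-conditioned summarizer are replaced by simple bag-of-words stand-ins so the sketch runs on its own, and the equal-weight score fusion is an assumption. Only the overall structure, a direct matching score combined with a query-versus-facet-summary score, follows the abstract.

```python
"""Minimal sketch of the reranking idea from the abstract (not the authors' code).

Component (a) is stood in for by a bag-of-words matcher, and components (b)+(c)
by a trivial facet-conditioned sentence filter. Real BERT-based models would
replace both stand-ins; the point here is how the two scores are combined.
"""
from collections import Counter
from math import sqrt


def bow_cosine(text_a: str, text_b: str) -> float:
    """Stand-in for the BERT document-query matching score (component a)."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def facet_summary(document: str, facet_terms: list[str]) -> str:
    """Stand-in for keyword extraction + facet-conditioned summarization (b, c):
    keep only sentences that mention a PM facet term (e.g. disease, gene)."""
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    kept = [s for s in sentences if any(t.lower() in s.lower() for t in facet_terms)]
    return ". ".join(kept) or document  # fall back to the whole document


def rerank(query: str, facet_terms: list[str], candidates: list[str],
           weight: float = 0.5) -> list[tuple[float, str]]:
    """Fuse (a) the direct matching score with the query-vs-summary score.
    The 0.5 weight is a hypothetical choice, not a value from the paper."""
    scored = []
    for doc in candidates:
        match_score = bow_cosine(query, doc)                                 # (a)
        summary_score = bow_cosine(query, facet_summary(doc, facet_terms))   # (b)+(c)
        scored.append((weight * match_score + (1 - weight) * summary_score, doc))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    query = "melanoma with BRAF V600E mutation treatment options"
    facets = ["melanoma", "BRAF"]
    docs = [
        "BRAF V600E mutations in melanoma respond to targeted inhibitors. "
        "Adverse effects are manageable.",
        "This study examines dietary patterns in a general population cohort.",
    ]
    for score, doc in rerank(query, facets, docs):
        print(f"{score:.3f}  {doc[:60]}...")
```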
Related papers
- DiVA-DocRE: A Discriminative and Voice-Aware Paradigm for Document-Level Relation Extraction [0.3208888890455612]
We introduce a Discriminative and Voice-Aware Paradigm (DiVA).
Our innovation lies in transforming DocRE into a discriminative task, where the model pays attention to each relation.
Our experiments on the Re-DocRED and DocRED datasets demonstrate state-of-the-art results for the DocRTE task.
arXiv Detail & Related papers (2024-09-07T18:47:38Z) - Plot Retrieval as an Assessment of Abstract Semantic Association [131.58819293115124]
Text pairs in Plot Retrieval have less word overlap and more abstract semantic association.
Plot Retrieval can be the benchmark for further research on the semantic association modeling ability of IR models.
arXiv Detail & Related papers (2023-11-03T02:02:43Z) - DAPR: A Benchmark on Document-Aware Passage Retrieval [57.45793782107218]
We propose and name this task Document-Aware Passage Retrieval (DAPR).
While analyzing the errors of state-of-the-art (SoTA) passage retrievers, we find that the majority of errors (53.5%) are due to missing document context.
Our created benchmark enables future research on developing and comparing retrieval systems for the new task.
arXiv Detail & Related papers (2023-05-23T10:39:57Z) - NapSS: Paragraph-level Medical Text Simplification via Narrative
Prompting and Sentence-matching Summarization [46.772517928718216]
We propose a summarize-then-simplify two-stage strategy, which we call NapSS.
NapSS identifies the relevant content to simplify while ensuring that the original narrative flow is preserved.
Our model performs significantly better than the seq2seq baseline on an English medical corpus.
arXiv Detail & Related papers (2023-02-11T02:20:25Z) - Document-Level Relation Extraction with Sentences Importance Estimation
and Focusing [52.069206266557266]
Document-level relation extraction (DocRE) aims to determine the relation between two entities from a document of multiple sentences.
We propose a Sentence Importance Estimation and Focusing (SIEF) framework for DocRE, where we design a sentence importance score and a sentence focusing loss.
Experimental results on two domains show that our SIEF not only improves overall performance, but also makes DocRE models more robust.
arXiv Detail & Related papers (2022-04-27T03:20:07Z) - Mirror Matching: Document Matching Approach in Seed-driven Document
Ranking for Medical Systematic Reviews [31.3220495275256]
Document ranking assists researchers by ranking relevant documents higher than irrelevant ones.
We propose a document matching measure named Mirror Matching, which calculates matching scores between medical abstract texts by incorporating common writing patterns.
arXiv Detail & Related papers (2021-12-28T22:27:52Z) - CODER: An efficient framework for improving retrieval through
COntextualized Document Embedding Reranking [11.635294568328625]
We present a framework for improving the performance of a wide class of retrieval models at minimal computational cost.
It utilizes precomputed document representations extracted by a base dense retrieval method.
It incurs a negligible computational overhead on top of any first-stage method at run time, allowing it to be easily combined with any state-of-the-art dense retrieval method.
arXiv Detail & Related papers (2021-12-16T10:25:26Z) - A Neural Model for Joint Document and Snippet Ranking in Question
Answering for Large Document Collections [9.503056487990959]
We present an architecture for joint document and snippet ranking.
The architecture is general and can be used with any neural text relevance ranker.
Experiments on biomedical data from BIOASQ show that our joint models vastly outperform the pipelines in snippet retrieval.
arXiv Detail & Related papers (2021-06-16T16:04:19Z) - Three Sentences Are All You Need: Local Path Enhanced Document Relation
Extraction [54.95848026576076]
We present an embarrassingly simple but effective method to select evidence sentences for document-level RE.
We have released our code at https://github.com/AndrewZhe/Three-Sentences-Are-All-You-Need.
arXiv Detail & Related papers (2021-06-03T12:29:40Z) - Text Summarization with Latent Queries [60.468323530248945]
We introduce LaQSum, the first unified text summarization system that learns Latent Queries from documents for abstractive summarization with any existing query forms.
Under a deep generative framework, our system jointly optimizes a latent query model and a conditional language model, allowing users to plug in queries of any type at test time.
Our system robustly outperforms strong comparison systems across summarization benchmarks with different query types, document settings, and target domains.
arXiv Detail & Related papers (2021-05-31T21:14:58Z) - Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)