DORIS-MAE: Scientific Document Retrieval using Multi-level Aspect-based
Queries
- URL: http://arxiv.org/abs/2310.04678v3
- Date: Sat, 28 Oct 2023 19:47:47 GMT
- Title: DORIS-MAE: Scientific Document Retrieval using Multi-level Aspect-based
Queries
- Authors: Jianyou Wang, Kaicheng Wang, Xiaoyue Wang, Prudhviraj Naidu, Leon
Bergen, Ramamohan Paturi
- Abstract summary: We propose a novel task, Scientific DOcument Retrieval using Multi-level Aspect-based quEries (DORIS-MAE).
For each complex query, we assembled a collection of 100 relevant documents and produced annotated relevance scores for ranking them.
Anno-GPT is a framework for validating the performance of Large Language Models (LLMs) on expert-level dataset annotation tasks.
- Score: 2.4816250611120547
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In scientific research, the ability to effectively retrieve relevant
documents based on complex, multifaceted queries is critical. Existing
evaluation datasets for this task are limited, primarily due to the high cost
and effort required to annotate resources that effectively represent complex
queries. To address this, we propose a novel task, Scientific DOcument
Retrieval using Multi-level Aspect-based quEries (DORIS-MAE), which is designed
to handle the complex nature of user queries in scientific research. We
developed a benchmark dataset within the field of computer science, consisting
of 100 human-authored complex query cases. For each complex query, we assembled
a collection of 100 relevant documents and produced annotated relevance scores
for ranking them. Recognizing the significant labor of expert annotation, we
also introduce Anno-GPT, a scalable framework for validating the performance of
Large Language Models (LLMs) on expert-level dataset annotation tasks. LLM
annotation of the DORIS-MAE dataset resulted in a 500x reduction in cost,
without compromising quality. Furthermore, due to the multi-tiered structure of
these complex queries, the DORIS-MAE dataset can be extended to over 4,000
sub-query test cases without requiring additional annotation. We evaluated 17
recent retrieval methods on DORIS-MAE, observing notable performance drops
compared to traditional datasets. This highlights the need for better
approaches to handle complex, multifaceted queries in scientific research. Our
dataset and codebase are available at
https://github.com/Real-Doris-Mae/Doris-Mae-Dataset.
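The abstract says that annotated relevance scores are used to rank the candidate documents and that sub-query test cases can be derived without new annotation. A minimal sketch of how that could work is given below. It is not the authors' evaluation code. It assumes (this is not stated in the abstract) that each document carries one graded score per query aspect and that relevance to a query or sub-query is the sum of the scores for the aspects it contains. The document IDs, aspect names, scores, and the choice of NDCG as the metric are all hypothetical.

```python
import math

# Hypothetical per-aspect annotations: aspect_scores[doc_id][aspect] -> graded score.
# The real DORIS-MAE annotation format may differ.
aspect_scores = {
    "doc_a": {"a1": 2, "a2": 0, "a3": 1},
    "doc_b": {"a1": 1, "a2": 2, "a3": 2},
    "doc_c": {"a1": 0, "a2": 0, "a3": 1},
}


def relevance(doc_id, aspects):
    """Sum the annotated scores of the selected aspects (assumed aggregation rule)."""
    return sum(aspect_scores[doc_id][a] for a in aspects)


def dcg(gains):
    """Discounted cumulative gain for gains listed in ranked order."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranking, aspects, k):
    """NDCG@k of a system ranking, judged against the aspect-derived relevance."""
    gains = [relevance(d, aspects) for d in ranking[:k]]
    ideal = sorted((relevance(d, aspects) for d in aspect_scores), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


system_ranking = ["doc_b", "doc_a", "doc_c"]
full_query = ["a1", "a2", "a3"]  # the complete multi-aspect query
sub_query = ["a1"]               # a sub-query that reuses the same annotations

print(ndcg_at_k(system_ranking, full_query, k=3))
print(ndcg_at_k(system_ranking, sub_query, k=3))
```

Under this assumption, any subset of aspects defines a new test case whose ground truth comes from the existing scores, which is one way a multi-tiered query could yield many sub-queries with no further annotation.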
Related papers
- SciER: An Entity and Relation Extraction Dataset for Datasets, Methods, and Tasks in Scientific Documents [49.54155332262579]
We release a new entity and relation extraction dataset for entities related to datasets, methods, and tasks in scientific articles.
Our dataset contains 106 manually annotated full-text scientific publications with over 24k entities and 12k relations.
(2024-10-28T15:56:49Z)
- RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs [12.846097618151951]
We develop a dataset for LLMs Complex Reasoning over Textual Knowledge Graphs (RiTeK) with a broad topological structure coverage.
We synthesize realistic user queries that integrate diverse topological structures, annotated information, and complex textual descriptions.
We introduce an enhanced Monte Carlo Tree Search (MCTS) method, which automatically extracts relational path information from textual graphs for specific queries.
(2024-10-17T19:33:37Z)
- CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation [51.2289822267563]
We propose Corpus Retrieval and Augmentation for Fine-Tuning (CRAFT), a method for generating synthetic datasets.
We use large-scale public web-crawled corpora and similarity-based document retrieval to find other relevant human-written documents.
We demonstrate that CRAFT can efficiently generate large-scale task-specific training datasets for four diverse tasks.
(2024-09-03T17:54:40Z)
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval [54.54576644403115]
Many complex real-world queries require in-depth reasoning to identify relevant documents.
We introduce BRIGHT, the first text retrieval benchmark that requires intensive reasoning to retrieve relevant documents.
Our dataset consists of 1,384 real-world queries spanning diverse domains, such as economics, psychology, mathematics, and coding.
(2024-07-16T17:58:27Z)
- STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases [93.96463520716759]
We develop STaRK, a large-scale semi-structured retrieval benchmark on Textual and Relational Knowledge Bases.
Our benchmark covers three domains: product search, academic paper search, and queries in precision medicine.
We design a novel pipeline to synthesize realistic user queries that integrate diverse relational information and complex textual properties.
(2024-04-19T22:54:54Z)
- Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [59.57065228857247]
Retrieval-augmented Large Language Models (LLMs) have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA).
We propose a novel adaptive QA framework that can dynamically select the most suitable strategy for (retrieval-augmented) LLMs based on the query complexity.
We validate our model on a set of open-domain QA datasets, covering multiple query complexities, and show that ours enhances the overall efficiency and accuracy of QA systems.
(2024-03-21T13:52:30Z)
- Data Augmentation for Abstractive Query-Focused Multi-Document Summarization [129.96147867496205]
We present two QMDS training datasets, which we construct using two data augmentation methods.
These two datasets have complementary properties, i.e., QMDSCNN has real summaries but queries are simulated, while QMDSIR has real queries but simulated summaries.
We build end-to-end neural network models on the combined datasets that yield new state-of-the-art transfer results on DUC datasets.
(2021-03-02T16:57:01Z)
- QBSUM: a Large-Scale Query-Based Document Summarization Dataset from Real-world Applications [20.507631900617817]
We present QBSUM, a high-quality large-scale dataset consisting of 49,000+ data samples for the task of Chinese query-based document summarization.
We also propose multiple unsupervised and supervised solutions to the task and demonstrate their high-speed inference and superior performance via both offline experiments and online A/B tests.
(2020-10-27T07:30:04Z)
- AQuaMuSe: Automatically Generating Datasets for Query-Based Multi-Document Summarization [17.098075160558576]
We propose a scalable approach called AQuaMuSe to automatically mine qMDS examples from question answering datasets and large document corpora.
We publicly release a specific instance of an AQuaMuSe dataset with 5,519 query-based summaries, each associated with an average of 6 input documents selected from an index of 355M documents from Common Crawl.
(2020-10-23T22:38:18Z)