SciRepEval: A Multi-Format Benchmark for Scientific Document
Representations
- URL: http://arxiv.org/abs/2211.13308v4
- Date: Mon, 13 Nov 2023 18:25:27 GMT
- Title: SciRepEval: A Multi-Format Benchmark for Scientific Document
Representations
- Authors: Amanpreet Singh, Mike D'Arcy, Arman Cohan, Doug Downey, Sergey Feldman
- Abstract summary: We introduce SciRepEval, the first comprehensive benchmark for training and evaluating scientific document representations.
We show how state-of-the-art models like SPECTER and SciNCL struggle to generalize across the task formats.
A new approach that learns multiple embeddings per document, each tailored to a different format, can improve performance.
- Score: 52.01865318382197
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learned representations of scientific documents can serve as valuable input
features for downstream tasks without further fine-tuning. However, existing
benchmarks for evaluating these representations fail to capture the diversity
of relevant tasks. In response, we introduce SciRepEval, the first
comprehensive benchmark for training and evaluating scientific document
representations. It includes 24 challenging and realistic tasks, 8 of which are
new, across four formats: classification, regression, ranking and search. We
then use this benchmark to study and improve the generalization ability of
scientific document representation models. We show how state-of-the-art models
like SPECTER and SciNCL struggle to generalize across the task formats, and
that simple multi-task training fails to improve them. However, a new approach
that learns multiple embeddings per document, each tailored to a different
format, can improve performance. We experiment with task-format-specific
control codes and adapters and find they outperform the existing
single-embedding state-of-the-art by over 2 points absolute. We release the
resulting family of multi-format models, called SPECTER2, for the community to
use and build on.
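The control-code idea from the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation: the encoder here is a deterministic hash-based stand-in for a Transformer, and the control-token names are hypothetical. It shows only the mechanism, that prepending a task-format-specific token to the input lets one shared encoder produce a separate, format-tailored embedding per document.

```python
# Toy sketch of task-format-specific control codes (hypothetical tokens,
# stand-in encoder): one embedding per document per task format.
import hashlib

# Hypothetical control tokens for the four SciRepEval task formats.
FORMAT_CODES = {
    "classification": "[CLF]",
    "regression": "[RGN]",
    "ranking": "[PRX]",
    "search": "[QRY]",
}

def toy_encode(text: str, dim: int = 8) -> tuple:
    """Deterministic stand-in for a Transformer encoder: hash bytes -> vector."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return tuple(b / 255.0 for b in digest[:dim])

def multi_format_embeddings(title: str, abstract: str) -> dict:
    """Encode the same document once per format; the prepended control
    code changes the input, so each format gets its own embedding."""
    doc = f"{title} {abstract}"
    return {fmt: toy_encode(f"{code} {doc}") for fmt, code in FORMAT_CODES.items()}

embs = multi_format_embeddings("SciRepEval", "A multi-format benchmark ...")
```

In the released SPECTER2 family, the analogous role is played by per-format adapters loaded on top of a shared base model; consult the SPECTER2 model cards for the exact loading API, as the details above are illustrative only.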
Related papers
- DocTTT: Test-Time Training for Handwritten Document Recognition Using Meta-Auxiliary Learning [7.036629164442979]
We introduce the DocTTT framework to address challenges in handwritten document recognition.
Key innovation of our approach is that it uses test-time training to adapt the model to each specific input during testing.
We propose a novel Meta-Auxiliary learning approach that combines meta-learning with a self-supervised Masked Autoencoder (MAE).
arXiv Detail & Related papers (2025-01-22T14:18:47Z)
- MMSci: A Dataset for Graduate-Level Multi-Discipline Multimodal Scientific Understanding [59.41495657570397]
We present a comprehensive dataset compiled from Nature Communications articles covering 72 scientific fields.
We evaluated 19 proprietary and open-source models on two benchmark tasks, figure captioning and multiple-choice question answering, and conducted human expert annotation.
Fine-tuning Qwen2-VL-7B with our task-specific data achieved better performance than GPT-4o and even human experts in multiple-choice evaluations.
arXiv Detail & Related papers (2024-07-06T00:40:53Z) - DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models [63.466265039007816]
We present DocGenome, a structured document benchmark constructed by annotating 500K scientific documents from 153 disciplines in the arXiv open-access community.
We conduct extensive experiments to demonstrate the advantages of DocGenome and objectively evaluate the performance of large models on our benchmark.
arXiv Detail & Related papers (2024-06-17T15:13:52Z) - Beyond Document Page Classification: Design, Datasets, and Challenges [32.94494070330065]
This paper highlights the need to bring document classification benchmarking closer to real-world applications.
We identify the lack of public multi-page document classification datasets, formalize different classification tasks arising in application scenarios, and motivate the value of targeting efficient multi-page document representations.
arXiv Detail & Related papers (2023-08-24T16:16:47Z)
- MIReAD: Simple Method for Learning High-quality Representations from Scientific Documents [77.34726150561087]
We propose MIReAD, a simple method that learns high-quality representations of scientific papers.
We train MIReAD on more than 500,000 PubMed and arXiv abstracts across over 2,000 journal classes.
arXiv Detail & Related papers (2023-05-07T03:29:55Z)
- Large-scale learning of generalised representations for speaker recognition [52.978310296712834]
We develop a speaker recognition model to be used in diverse scenarios.
We investigate several new training data configurations combining a few existing datasets.
We find that MFA-Conformer with the least inductive bias generalises the best.
arXiv Detail & Related papers (2022-10-20T03:08:18Z)
- Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity [11.157086694203201]
We present a new scientific document similarity model based on matching fine-grained aspects.
Our model is trained using co-citation contexts that describe related paper aspects as a novel form of textual supervision.
arXiv Detail & Related papers (2021-11-16T11:12:30Z)
- SPECTER: Document-level Representation Learning using Citation-informed Transformers [51.048515757909215]
SPECTER generates document-level embedding of scientific documents based on pretraining a Transformer language model.
We introduce SciDocs, a new evaluation benchmark consisting of seven document-level tasks ranging from citation prediction to document classification and recommendation.
arXiv Detail & Related papers (2020-04-15T16:05:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.